{
 "metadata": {
  "name": ""
 },
 "nbformat": 3,
 "nbformat_minor": 0,
 "worksheets": [
  {
   "cells": [
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "# Example: Coefficient field inversion in an elliptic partial differential equation\n",
      "\n",
      "We consider the estimation of a coefficient in an elliptic partial\n",
      "differential equation as a first model problem. Depending on the\n",
      "interpretation of the unknowns and the type of measurements, this\n",
      "model problem arises, for instance, in inversion for groundwater flow\n",
      "or heat conductivity.  It can also be interpreted as finding a\n",
      "membrane with a certain spatially varying stiffness. Let\n",
      "$\\Omega\\subset\\mathbb{R}^n$, $n\\in\\{1,2,3\\}$ be an open, bounded\n",
      "domain and consider the following problem:\n",
      "\n",
      "$$\n",
      "\\min_{a} J(a):=\\frac{1}{2}\\int_\\Omega (u-u_d)^2\\, dx + \\frac{\\gamma}{2}\\int_\\Omega|\\nabla a|^2\\,dx,\n",
      "$$\n",
      "\n",
      "where $u$ is the solution of\n",
      "\n",
      "$$\n",
      "\\begin{split}\n",
      "\\quad -\\nabla\\cdot(a\\nabla u) &= f \\text{ in }\\Omega,\\\\\n",
      "u &= 0 \\text{ on }\\partial\\Omega.\n",
      "\\end{split}\n",
      "$$\n",
      "\n",
      "Here $a\\in U_{ad}:=\\{a\\in L^{\\infty}(\\Omega)\\}$ the unknown coefficient field, $u_d$ denotes (possibly noisy) data, $f\\in H^{-1}(\\Omega)$ a given force, and $\\gamma\\ge 0$ the regularization parameter.\n",
      "\n",
      "### The variational (or weak) form of the state equation:\n",
      "\n",
      "Find $u\\in H_0^1(\\Omega)$ such that $(a\\nabla u,\\nabla v) - (f,v) = 0, \\text{ for all } v\\in H_0^1(\\Omega),$\n",
      "where $H_0^1(\\Omega)$ is the space of functions vanishing on $\\partial\\Omega$ with square integrable derivatives. Here, $(\\cdot\\,,\\cdot)$ denotes the $L^2$-inner product, i.e, for scalar functions $u,v$ defined on $\\Omega$ we denote $(u,v) := \\int_\\Omega u(x) v(x) \\,dx$.\n",
      "\n",
      "### Optimality System:\n",
      "\n",
      "The Lagrangian functional $\\mathscr{L}:L^\\infty(\\Omega)\\times H_0^1(\\Omega)\\times H_0^1(\\Omega)\\rightarrow \\mathbb{R}$, which we use as a tool to derive the optimality system, is given by\n",
      "\n",
      "$$\n",
      "\\mathscr{L}(a,u,p):= \\frac{1}{2}(u-u_d,u-u_d) +\n",
      "\\frac{\\gamma}{2}(\\nabla a, \\nabla a) +  (a\\nabla u,\\nabla p) - (f,p).\n",
      "$$\n",
      "\n",
      "The Lagrange multiplier theory shows that, at a solution all variations of the Lagrangian functional with respect to all variables must vanish. These variations of $\\mathscr{L}$ with respect to $(p,u,a)$ in directions $(\\tilde{u}, \\tilde{p}, \\tilde{a})$ are given by\n",
      "\n",
      "$$\n",
      "  \\begin{alignat}{2}\n",
      "    \\mathscr{L}_p(a,u,p)(\\tilde{p})  &= (a\\nabla u, \\nabla \\tilde{p}) -\n",
      "    (f,\\tilde{p}) &&= 0,\\\\\n",
      "     \\mathscr{L}_u(a,u,p)(\\tilde{u}) &= (a\\nabla p, \\nabla \\tilde{u}) +\n",
      "     (u-u_d,\\tilde{u}) && = 0,\\\\\n",
      "     \\mathscr{L}_a(a,u,p)(\\tilde{a})  &= \\gamma(\\nabla a, \\nabla \\tilde{a}) +\n",
      "     (\\tilde{a}\\nabla u, \\nabla p) &&= 0,\n",
      "  \\end{alignat}\n",
      "$$\n",
      "\n",
      "where the variations $(\\tilde{u}, \\tilde{p}, \\tilde{a})$ are taken from the same spaces as $(u,p,a)$. \n",
      "\n",
      "The gradient of the cost functional $\\mathcal{J}(a)$ therefore is\n",
      "\n",
      "$$\n",
      "    \\mathcal{G}(a)(\\tilde a) = \\gamma(\\nabla a, \\nabla \\tilde{a}) +\n",
      "     (\\tilde{a}\\nabla u, \\nabla \\tilde{p}).\n",
      "$$\n",
      "\n",
      "### Goals:\n",
      "\n",
      "By the end of this notebook, you should be able to:\n",
      "\n",
      "- solve the forward and adjoint Poisson equations\n",
      "- understand the inverse method framework\n",
      "- visualise and understand the results\n",
      "- modify the problem and code\n",
      "\n",
      "### Mathematical tools used:\n",
      "\n",
      "- Finite element method\n",
      "- Derivation of gradiant via the adjoint method\n",
      "- Armijo line search\n",
      "\n",
      "### List of software used:\n",
      "\n",
      "- <a href=\"http://fenicsproject.org/\">FEniCS</a>, a parallel finite element element library for the discretization of partial differential equations\n",
      "- <a href=\"http://www.mcs.anl.gov/petsc/\">PETSc</a>, for scalable and efficient linear algebra operations and solvers\n",
      "- <a href=\"http://matplotlib.org/\">Matplotlib</a>, a python package used for plotting the results\n",
      "- <a href=\"http://www.numpy.org/\">Numpy</a>, a python package for linear algebra"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "## Set up\n",
      "\n",
      "### Import dependencies"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "from dolfin import *\n",
      "\n",
      "import numpy as np\n",
      "import time\n",
      "import logging\n",
      "\n",
      "import matplotlib.pyplot as plt\n",
      "%matplotlib inline\n",
      "import nb\n",
      "\n",
      "start = time.clock()\n",
      "\n",
      "logging.getLogger('FFC').setLevel(logging.WARNING)\n",
      "logging.getLogger('UFL').setLevel(logging.WARNING)\n",
      "set_log_active(False)\n",
      "\n",
      "np.random.seed(seed=1)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "### Model set up:\n",
      "\n",
      "As in the introduction, the first thing we need to do is to set up the numerical model.\n",
      "\n",
      "In this cell, we set the mesh ``mesh``, the finite element spaces ``Va`` and ``Vu`` corresponding to the parameter space and state/adjoint space, respectively. In particular, we use linear finite elements for the parameter space, and quadratic elements for the state/adjoint space.\n",
      "\n",
      "The true parameter ``atrue`` is the finite element interpolant of the function\n",
      "$$ a_{\\rm true} = \\left\\{ \\begin{array}{l} 4 \\; \\forall \\,(x,y) \\, {\\rm s.t.}\\, \\sqrt{ (x-.5)^2 + (y-.5)^2} \\leq 0.2 \\\\ 8 \\; {\\rm otherwise}. \\end{array}\\right. $$\n",
      "\n",
      "The forcing term ``f`` and the boundary conditions ``u0`` for the forward problem are\n",
      "$$ f = 1 \\; \\forall {\\bf x} \\in \\Omega, \\quad u = 0 \\; \\forall {\\bf x} \\in \\partial \\Omega. $$"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# create mesh and define function spaces\n",
      "nx = 32\n",
      "ny = 32\n",
      "mesh = UnitSquareMesh(nx, ny)\n",
      "Va = FunctionSpace(mesh, 'Lagrange', 1)\n",
      "Vu = FunctionSpace(mesh, 'Lagrange', 2)\n",
      "\n",
      "# The true and inverted parameter\n",
      "atrue = interpolate(Expression('8. - 4.*(pow(x[0] - 0.5,2) + pow(x[1] - 0.5,2) < pow(0.2,2))'), Va)\n",
      "a = interpolate(Expression(\"4.\"),Va)\n",
      "\n",
      "# define function for state and adjoint\n",
      "u = Function(Vu)\n",
      "p = Function(Vu)\n",
      "\n",
      "# define Trial and Test Functions\n",
      "u_trial, p_trial, a_trial = TrialFunction(Vu), TrialFunction(Vu), TrialFunction(Va)\n",
      "u_test, p_test, a_test = TestFunction(Vu), TestFunction(Vu), TestFunction(Va)\n",
      "\n",
      "# initialize input functions\n",
      "f = Constant(\"1.0\")\n",
      "u0 = Constant(\"0.0\")\n",
      "\n",
      "# plot\n",
      "plt.figure(figsize=(15,5))\n",
      "nb.plot(mesh,subplot_loc=121, mytitle=\"Mesh\", show_axis='on')\n",
      "nb.plot(atrue,subplot_loc=122, mytitle=\"True parameter field\")\n",
      "plt.show()"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# set up dirichlet boundary conditions\n",
      "def boundary(x,on_boundary):\n",
      "    return on_boundary\n",
      "\n",
      "bc_state = DirichletBC(Vu, u0, boundary)\n",
      "bc_adj = DirichletBC(Vu, Constant(0.), boundary)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "### The cost functional evaluation:\n",
      "\n",
      "$$\n",
      "J(a):=\\underbrace{\\frac{1}{2}\\int_\\Omega (u-u_d)^2\\, dx}_{\\text misfit} + \\underbrace{\\frac{\\gamma}{2}\\int_\\Omega|\\nabla a|^2\\,dx}_{\\text reg}\n",
      "$$\n",
      "\n",
      "In the code below, $W$ and $R$ are symmetric positive definite matrices that stem from finite element discretization of the misfit and regularization component of the cost functional, respectively."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# Regularization parameter\n",
      "gamma = 1e-10\n",
      "\n",
      "# weak for for setting up the misfit and regularization compoment of the cost\n",
      "W_equ   = inner(u_trial, u_test) * dx\n",
      "R_equ   = gamma * inner(nabla_grad(a_trial), nabla_grad(a_test)) * dx\n",
      "\n",
      "W = assemble(W_equ)\n",
      "R = assemble(R_equ)\n",
      "\n",
      "# Define cost function\n",
      "def cost(u, ud, a, W, R):\n",
      "    diff = u.vector() - ud.vector()\n",
      "    reg = 0.5 * a.vector().inner(R*a.vector() ) \n",
      "    misfit = 0.5 * diff.inner(W * diff)\n",
      "    return [reg + misfit, misfit, reg]"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "### Set up synthetic observations:\n",
      "\n",
      "To generate the synthetic observation we first solve the PDE for the state variable ``utrue`` corresponding to the true parameter ``atrue``.\n",
      "More specifically, we solve the variational problem\n",
      "\n",
      "Find $u\\in H_0^1(\\Omega)$ such that $\\underbrace{(a_{\\text true} \\nabla u,\\nabla v)}_{\\; := \\; a_{\\rm goal}} - \\underbrace{(f,v)}_{\\; := \\;L_{\\rm goal}} = 0, \\text{ for all } v\\in H_0^1(\\Omega)$.\n",
      "\n",
      "Then we perturb the true state variable and write the observation ``ud`` as\n",
      "$$ u_{d} = u_{\\rm true} + \\eta, \\quad {\\rm where} \\; \\eta \\sim \\mathcal{N}(0, \\sigma^2).$$\n",
      "Here the standard variation $\\sigma$ is proportional to ``noise_level``."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# noise level\n",
      "noise_level = 0.01\n",
      "\n",
      "# weak form for setting up the synthetic observations\n",
      "a_goal = inner( atrue * nabla_grad(u_trial), nabla_grad(u_test)) * dx\n",
      "L_goal = f * u_test * dx\n",
      "\n",
      "# solve the forward/state problem to generate synthetic observations\n",
      "goal_A, goal_b = assemble_system(a_goal, L_goal, bc_state)\n",
      "\n",
      "utrue = Function(Vu)\n",
      "solve(goal_A, utrue.vector(), goal_b)\n",
      "\n",
      "ud = Function(Vu)\n",
      "ud.assign(utrue)\n",
      "\n",
      "# perturb state solution and create synthetic measurements ud\n",
      "# ud = u + ||u||/SNR * random.normal\n",
      "MAX = ud.vector().norm(\"linf\")\n",
      "noise = Vector()\n",
      "goal_A.init_vector(noise,1)\n",
      "noise.set_local( noise_level * MAX * np.random.normal(0, 1, len(ud.vector().array())) )\n",
      "bc_adj.apply(noise)\n",
      "\n",
      "ud.vector().axpy(1., noise)\n",
      "\n",
      "# plot\n",
      "nb.multi1_plot([utrue, ud], [\"State solution with atrue\", \"Synthetic observations\"])\n",
      "plt.show()"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "## Setting up the state equations, right hand side for the adjoint and the necessary matrices:\n",
      "\n",
      "$$\n",
      "  \\begin{alignat}{2}\n",
      "    \\mathscr{L}_p(a,u,p)(\\tilde{p})  &= (a\\nabla u, \\nabla \\tilde{p}) -\n",
      "    (f,\\tilde{p}) &&= 0,\\\\\n",
      "     \\mathscr{L}_u(a,u,p)(\\tilde{u}) &= (a\\nabla p, \\nabla \\tilde{u}) +\n",
      "     (u-u_d,\\tilde{u}) && = 0,\\\\\n",
      "     \\mathscr{L}_a(a,u,p)(\\tilde{a})  &= \\gamma(\\nabla a, \\nabla \\tilde{a}) +\n",
      "     (\\tilde{a}\\nabla u, \\nabla p) &&= 0,\n",
      "  \\end{alignat}\n",
      "$$"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# weak form for setting up the state equation\n",
      "a_state = inner( a * nabla_grad(u_trial), nabla_grad(u_test)) * dx\n",
      "L_state = f * u_test * dx\n",
      "\n",
      "# weak form for setting up the adjoint equations\n",
      "a_adj = inner( a * nabla_grad(p_trial), nabla_grad(p_test) ) * dx\n",
      "L_adjoint = -inner(u - ud, u_test) * dx\n",
      "\n",
      "\n",
      "# weak form for setting up matrices\n",
      "CT_equ   = inner(a_test * nabla_grad(u), nabla_grad(p_trial)) * dx\n",
      "M_equ   = inner(a_trial, a_test) * dx\n",
      "\n",
      "\n",
      "# assemble matrices M\n",
      "M = assemble(M_equ)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "## Initial guess\n",
      "We solve the state equation and compute the cost functional for the initial guess of the parameter ``a_ini``"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# solve state equation\n",
      "A, state_b = assemble_system (a_state, L_state, bc_state)\n",
      "solve (A, u.vector(), state_b)\n",
      "\n",
      "# evaluate cost\n",
      "[cost_old, misfit_old, reg_old] = cost(u, ud, a, W, R)\n",
      "\n",
      "# plot\n",
      "plt.figure(figsize=(15,5))\n",
      "nb.plot(a,subplot_loc=121, mytitle=\"a_ini\", vmin=atrue.vector().min(), vmax=atrue.vector().max())\n",
      "nb.plot(u,subplot_loc=122, mytitle=\"u(a_ini)\")\n",
      "plt.show()"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "## The steepest descent with Armijo line search:\n",
      "\n",
      "We solve the constrained optimization problem using the steepest descent method with Armijo line search.\n",
      "\n",
      "The stopping criterion is based on a relative reduction of the norm of the gradient (i.e. $\\frac{\\|g_{n}\\|}{\\|g_{0}\\|} \\leq \\tau$).\n",
      "\n",
      "The gradient is computed by solving the state and adjoint equation for the current parameter $a$, and then substituing the current state $u$, parameter $a$ and adjoint $p$ variables in the weak form expression of the gradient:\n",
      "$$ (g, \\tilde{a}) = \\gamma(\\nabla a, \\nabla \\tilde{a}) +(\\tilde{a}\\nabla u, \\nabla p).$$\n",
      "\n",
      "The Armijo line search uses backtracking to find $\\alpha$ such that a sufficient reduction in the cost functional is achieved.\n",
      "More specifically, we use backtracking to find $\\alpha$ such that:\n",
      "$$J( a - \\alpha g ) \\leq J(a) - \\alpha c_{\\rm armijo} (g,g). $$\n"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# define parameters for the optimization\n",
      "tol = 1e-4\n",
      "maxiter = 1000\n",
      "plot_any = 30\n",
      "c_armijo = 1e-5\n",
      "\n",
      "# initialize iter counters\n",
      "iter = 1\n",
      "converged = False\n",
      "\n",
      "# initializations\n",
      "g = Vector()\n",
      "R.init_vector(g,0)\n",
      "\n",
      "a_prev = Function(Va)\n",
      "\n",
      "print \"Nit  cost          misfit        reg         ||grad||       alpha  N backtrack\"\n",
      "\n",
      "while iter <  maxiter and not converged:\n",
      "\n",
      "    # assemble matrix C\n",
      "    CT =  assemble(CT_equ)\n",
      "\n",
      "    # solve the adoint problem\n",
      "    adj_A, adjoint_RHS = assemble_system(a_adj, L_adjoint, bc_adj)\n",
      "    solve(adj_A, p.vector(), adjoint_RHS)\n",
      "\n",
      "    # evaluate the  gradient\n",
      "    MG = CT*p.vector() + R * a.vector()\n",
      "    solve(M, g, MG)\n",
      "\n",
      "    # calculate the norm of the gradient\n",
      "    grad_norm2 = g.inner(MG)\n",
      "    gradnorm = sqrt(grad_norm2)\n",
      "    \n",
      "    if iter == 1:\n",
      "        gradnorm0 = gradnorm\n",
      "\n",
      "    # linesearch\n",
      "    it_backtrack = 0\n",
      "    a_prev.assign(a)\n",
      "    alpha = 8.e5\n",
      "    backtrack_converged = False\n",
      "    for it_backtrack in range(20):\n",
      "        \n",
      "        a.vector().axpy(-alpha, g )\n",
      "\n",
      "        # solve the state/forward problem\n",
      "        state_A, state_b = assemble_system(a_state, L_state, bc_state)\n",
      "        solve(state_A, u.vector(), state_b)\n",
      "\n",
      "        # evaluate cost\n",
      "        [cost_new, misfit_new, reg_new] = cost(u, ud, a, W, R)\n",
      "\n",
      "        # check if Armijo conditions are satisfied\n",
      "        if cost_new < cost_old - alpha * c_armijo * grad_norm2:\n",
      "            cost_old = cost_new\n",
      "            backtrack_converged = True\n",
      "            break\n",
      "        else:\n",
      "            alpha *= 0.5\n",
      "            a.assign(a_prev)  # reset a\n",
      "            \n",
      "    if backtrack_converged == False:\n",
      "        print \"Backtracking failed. A sufficient descent direction was not found\"\n",
      "        converged = False\n",
      "        break\n",
      "\n",
      "    sp = \"\"\n",
      "    print \"%3d %1s %8.5e %1s %8.5e %1s %8.5e %1s %8.5e %1s %8.5e %1s %3d\" % \\\n",
      "        (iter, sp, cost_new, sp, misfit_new, sp, reg_new, sp, \\\n",
      "        gradnorm, sp, alpha, sp, it_backtrack)\n",
      "\n",
      "    if (iter % plot_any)==0 :\n",
      "        nb.multi1_plot([a,u,p], [\"a\",\"u\",\"p\"], same_colorbar=False)\n",
      "        plt.show()\n",
      "    \n",
      "    # check for convergence\n",
      "    if gradnorm < tol*gradnorm0 and iter > 1:\n",
      "        converged = True\n",
      "        print \"Steepest descent converged in \",iter,\"  iterations\"\n",
      "        \n",
      "    iter += 1\n",
      "    \n",
      "if not converged:\n",
      "    print \"Steepest descent did not converge in \", maxiter, \" iterations\"\n",
      "\n",
      "print \"Time elapsed: \", time.clock()-start"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "nb.multi1_plot([atrue, a], [\"atrue\", \"a\"])\n",
      "nb.multi1_plot([u,p], [\"u\",\"p\"], same_colorbar=False)\n",
      "plt.show()"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    }
   ],
   "metadata": {}
  }
 ]
}