Tamme Claus1, Gaurav Achuda2, Silvia Richter2, Manuel Torrilhon1
1 ACoM, Applied and Computational Mathematics, RWTH Aachen University
2 GFE, Central Facility for Electron Microscopy, RWTH Aachen University
Computational Methods for Inverse Problems, ECCOMAS 2024
"Material imaging based on characteristic x-ray emission"
Microprobe at GFE (source: gfe.rwth-aachen.de)
where to bracket?
different computational costs: $\color{red}{I\times N^2 + IJ\times N} \text{ or } \color{blue}{J\times N^2 + IJ\times N}$
Efficient evaluation of the objective function and its gradient is crucial!
Definition of the Adjoint ($H, G$ Hilbert spaces, $A: G \to H$ continuous, linear) \[ \langle h, A(g) \rangle_H := \langle A^*(h), g \rangle_G \quad \forall h \in H \, \forall g \in G \]
non-adjoint implementation ($\color{green}{J} > \color{orange}{I}$) |
adjoint implementation ($\color{orange}{I} > \color{green}{J}$) |
|
\begin{align*}
&\text{foreach } \color{orange}{i = 1, ..., I} \text{ do} \\
&\qquad \color{orange}{v^{(i)} \leftarrow A(g^{(i)})}\\
&\qquad \text{foreach } \color{green}{j = 1, ..., J} \text{ do}\\
&\qquad \qquad \Sigma^{(ji)} \leftarrow \langle \color{green}{h^{(j)}}, \color{orange}{v^{(i)}} \rangle_H\\
&\qquad \text{end}\\
&\text{end}
\end{align*}
|
\begin{align*}
&\text{foreach } \color{green}{j = 1, ..., J} \text{ do} \\
&\qquad \color{green}{\lambda^{(j)} \leftarrow A^*(h^{(j)})}\\
&\qquad \text{foreach } \color{orange}{i = 1, ..., I} \text{ do}\\
&\qquad \qquad \Sigma^{(ji)} \leftarrow \langle \color{green}{\lambda^{(j)}}, \color{orange}{g^{(i)}} \rangle_G\\
&\qquad \text{end}\\
&\text{end}
\end{align*}
|
|
|
|
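The two loop orderings above can be sketched for a generic linear map given as an explicit matrix, so that the adjoint is just the transpose (a minimal numpy sketch; the random data and all variable names are illustrative assumptions):

```python
import numpy as np

# Minimal numpy sketch of the two loop orderings (assumed random data;
# here A is an explicit matrix, so A* is just the transpose).
rng = np.random.default_rng(1)
N, I, J = 50, 4, 3
A = rng.normal(size=(N, N))
g = rng.normal(size=(I, N))   # inputs g^(i)
h = rng.normal(size=(J, N))   # functionals h^(j)

# non-adjoint: I applications of A, cost I*N^2 + I*J*N
Sigma1 = np.empty((J, I))
for i in range(I):
    v = A @ g[i]
    for j in range(J):
        Sigma1[j, i] = h[j] @ v

# adjoint: J applications of A^T, cost J*N^2 + I*J*N
Sigma2 = np.empty((J, I))
for j in range(J):
    lam = A.T @ h[j]
    for i in range(I):
        Sigma2[j, i] = lam @ g[i]

ok = np.allclose(Sigma1, Sigma2)
print(ok)
```

Both orderings produce the same $\Sigma^{(ji)}$; only the number of expensive applications of $A$ (respectively $A^*$) differs.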
Derivation: we introduce $\lambda \in U$ (in continuous adjoint method (for $\partial_p$): "Lagrange-multiplier") \begin{align*} \langle h, A(g) \rangle_H = \langle h, u \rangle_H &= c(u) \\ &=c(u) + \color{orange}{\underbrace{a(u, \lambda) + b(\lambda)}_{=0 \, \forall \lambda \in U}} \\ &=\color{green}{\underbrace{c(u) + a(u, \lambda)}_{=0 \, \forall u \in U}} + b(\lambda) \\ & \hphantom{= c(u) + a(u, \lambda)} = b(\lambda) = \langle \lambda, g \rangle_G = \langle A^*(h), g \rangle_G \end{align*}
Let $(H, \langle \cdot{}, \cdot{} \rangle_H)$ be a (real) Hilbert space with inner product $\langle \cdot{}, \cdot{} \rangle_H$. For every continuous linear functional $f_h \in H^*$ (the dual of $H$), there exists a unique vector $h \in H$, called the Riesz Representation of $f_h$, such that \[ f_h(x) = \langle h, x \rangle_H \quad \forall x \in H. \]
Let $(G, \langle \cdot{}, \cdot{} \rangle_G)$ be another (real) Hilbert space, $A:G\to H$ a continuous linear operator between $G$ and $H$ and $f_h \in H^*$ a continuous linear functional. \[ f_h(A(g)) = \langle h, A(g) \rangle_H \quad \forall g \in G \] But $f_h(A(g))$ is also a continuous linear functional $f_\lambda(g)$ in $G$ with a Riesz Representation $\lambda \in G$ \[ f_h(A(g)) = f_\lambda(g) = \langle \lambda, g \rangle_G := \langle A^*(h), g \rangle_G. \]
Derivation: \begin{equation*} \langle \mu, Ag \rangle_{\mathbb{R}} = \langle \mu, a \cdot{} g \rangle_{\mathbb{R}} = \langle a \cdot{} \mu, g \rangle_{\mathbb{R}} = \langle A^*\mu , g \rangle_{\mathbb{R}} \quad \forall g \in \mathbb{R}\, \forall \mu \in \mathbb{R} \end{equation*}
Derivation: \begin{equation*} \langle \mu, Ag \rangle_{\mathbb{R}^m} = \langle \mu, M \cdot{} g \rangle_{\mathbb{R}^m} = \langle M^T \cdot{} \mu, g \rangle_{\mathbb{R}^n} = \langle A^*\mu, g \rangle_{\mathbb{R}^n} \quad \forall g \in \mathbb{R}^n \, \forall \mu \in \mathbb{R}^m \end{equation*}
Derivation: \begin{equation*} \langle \mu, Ag \rangle_\mathbb{R} = \mu \cdot{} \int_\Omega g(x) dx = \int_\Omega \mu \cdot{} 1_\Omega(x) g(x) dx = \langle (A^*\mu)(\cdot{}), g(\cdot{}) \rangle_{L^2(\Omega)} \quad \forall g \in L^2(\Omega) \, \forall \mu \in \mathbb{R} \end{equation*}
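The three adjoints derived above (scalar, matrix, and integration functional) can be checked numerically; this is a hedged sketch with assumed sample data, and $\Omega = [0, 1]$ is approximated by a midpoint quadrature rule:

```python
import numpy as np

# Numerical check of the three adjoints derived above (assumed sample
# data; Omega = [0, 1] is approximated by a midpoint quadrature rule).
rng = np.random.default_rng(2)
a, gs, mu = 2.5, -1.3, 0.7

# scalar A g = a*g:  A* = a
lhs_s, rhs_s = mu * (a * gs), (a * mu) * gs

# matrix A g = M g:  A* = M^T
M = rng.normal(size=(3, 4))
gv, muv = rng.normal(size=4), rng.normal(size=3)
lhs_m, rhs_m = muv @ (M @ gv), (M.T @ muv) @ gv

# integration A g = int_Omega g dx:  (A* mu)(x) = mu * 1_Omega(x)
n = 1000
x = (np.arange(n) + 0.5) / n                  # midpoints, dx = 1/n
gfun = np.sin(2 * np.pi * x) + x**2
lhs_i = mu * np.sum(gfun) / n                 # <mu, A g>_R
rhs_i = np.sum(mu * np.ones(n) * gfun) / n    # <A* mu, g>_{L^2}

ok = (np.isclose(lhs_s, rhs_s) and np.isclose(lhs_m, rhs_m)
      and np.isclose(lhs_i, rhs_i))
print(ok)
```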
Derivation (intentionally complicated): We reinterpret \begin{equation*} M \cdot{} h = g \quad \Leftrightarrow \quad \langle v, M \cdot{} h \rangle = \langle v, g \rangle \quad \forall v \in G \end{equation*} in particular (for a fixed but unspecified $\lambda \in G$) \begin{equation} 0 = \langle \lambda, g \rangle - \langle \lambda, M \cdot{} h \rangle \quad \Leftrightarrow \quad 0 = \langle \lambda, g \rangle - \langle M^T \cdot{} \lambda, h \rangle \end{equation} \begin{align*} \langle \mu, Ag \rangle &= \langle \mu, h \rangle\\ &= \langle \mu, h \rangle - \langle M^T \cdot{} \lambda, h \rangle + \langle \lambda, g \rangle \\ & \quad \text{ with } \langle \mu, h \rangle - \langle M^T \cdot{} \lambda, h \rangle = 0 \quad \forall h \in H \\ & \quad \text{ or } M^T \cdot{} \lambda = \mu \\ &= \langle A^* \mu, g \rangle \end{align*} (alternatively) \begin{equation*} \langle \mu, Ag \rangle_{\mathbb{R}^n} = \langle \mu, M^{-1} g \rangle_{\mathbb{R}^n} = \langle M^{-T} \mu, g \rangle_{\mathbb{R}^n} = \langle A^*\mu, g \rangle_{\mathbb{R}^n} \end{equation*}
Derivation: \begin{align*} \langle \mu, Ag \rangle_H = f_\mu(h) &= f_\mu(h) + \underbrace{a(h, \lambda) + f_g(\lambda)}_{=0\quad \forall \lambda \in G} \\ &= \underbrace{f_\mu(h) + a(h, \lambda)}_{\overset{!}{=} 0 \quad \forall h \in H} + f_g(\lambda) = f_g(\lambda) = \langle \lambda, g \rangle_G \end{align*}
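In practice this derivation says: instead of forming $M^{-T}$, solve the adjoint system $M^T \lambda = \mu$. A small numerical check (assumed numpy sketch with illustrative data):

```python
import numpy as np

# Check of the derivation above for A = M^{-1} (assumed sample data):
# instead of forming M^{-T}, solve the adjoint system M^T lambda = mu.
rng = np.random.default_rng(3)
n = 5
M = rng.normal(size=(n, n)) + n * np.eye(n)  # kept well-conditioned
g, mu = rng.normal(size=n), rng.normal(size=n)

h = np.linalg.solve(M, g)           # h = A g = M^{-1} g
lam = np.linalg.solve(M.T, mu)      # adjoint system: M^T lambda = mu
ok = np.isclose(mu @ h, lam @ g)    # <mu, A g> = <A* mu, g>
print(ok)
```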
we define:
Also we define ($y = f_x$):
For every single assignment $\varphi^{(n)}_{(v^{(j)})_{j \prec n}}$ we know
Model ($u^{(i)} \in U(\mathcal{R})$):\begin{align*} a_p(u^{(i)}, v) + b^{(i)}(v) &= 0 \quad \forall v \in U(\mathcal{R}) \\ \Sigma^{(i)} &= c(u^{(i)}) \\ C &= g_{\boldsymbol \Sigma} \end{align*} |
1st "adjoint method" ($\lambda \in U(\mathcal{R})$):\begin{align*} a_p(v, \lambda) + c(v) &= 0 \quad \forall v \in U(\mathcal{R}) \\ \Sigma^{(i)} &= b^{(i)}(\lambda) \\ C &= g_{\boldsymbol \Sigma} \end{align*} |
Tangent model/sensitivity ($\dot{\lambda}^{(n)} \in U(\mathcal{R})$):
\begin{align*}
a_p(v, \dot{\lambda}^{(n)}) + \dot{a}_p(v, \lambda, e^{(n)}) &= 0 \quad \forall v \in U(\mathcal{R})\\
\dot{C}^{(n)} &= \dot{g}_{\boldsymbol \Sigma}(\boldsymbol b(\dot{\lambda}^{(n)}))
\end{align*}
\begin{align*}
a_p(v, \dot{\lambda}^{(n)}) + \overbrace{\dot{a}_p(v, \lambda, e^{(n)})}^{=\beta^{(n)}(v)} &= 0 \quad \forall v \in U(\mathcal{R})\\
\dot{C}^{(n)} &= \underbrace{\dot{g}_{\boldsymbol \Sigma}(\boldsymbol b(\dot{\lambda}^{(n)}))}_{=\alpha(\dot{\lambda}^{(n)})}
\end{align*}
|
2nd (continuous) "adjoint method" ($\bar{\lambda} \in U(\mathcal{R})$):
\begin{align*}
a_p(\bar{\lambda}, v) + \dot{g}_{\boldsymbol \Sigma}(\boldsymbol b(v)) &= 0 \quad \forall v \in U(\mathcal{R})\\
\dot{C}^{(n)} &= \dot{a}_p(\bar{\lambda}, \lambda, e^{(n)})
\end{align*}
\begin{align*}
\bar{\Sigma} &= \bar{g}_{\boldsymbol \Sigma}(\bar{C})\\
a_p(\bar{\lambda}, v) + \bar{\boldsymbol \Sigma}^T \boldsymbol b(v) &= 0 \quad \forall v \in U(\mathcal{R})\\
\bar{p} &= \bar{a}_p(\bar{\lambda}, \lambda, \bar{C})\\
\nabla_p C &= \bar{p}
\end{align*}
|
adjoint forward + adjoint derivative |
non-adjoint forward + tangent derivative |
\begin{align*} a_p(u, \lambda) + c(u) &= 0 \quad \forall u \in U(\mathcal{R})\\ \boldsymbol \Sigma &= \boldsymbol b(\lambda) \\ C &= g_{\boldsymbol \Sigma} \\ \bar{\boldsymbol \Sigma} &= \bar{g}_{\boldsymbol \Sigma}(\bar{C})\\ a_p(\bar{\lambda}, \dot{\lambda}) + \bar{\boldsymbol{\Sigma}}^T \boldsymbol b(\dot{\lambda}) &= 0 \quad \forall \dot{\lambda} \in U(\mathcal{R}) \\ \bar{p} &= \bar{a}_p(\bar{\lambda}, \lambda, \bar{C})\\ (\nabla_p C)^{(n)} &= \bar{p}^{(n)} \end{align*} | \begin{align*} a_p(u^{(i)}, v) + b^{(i)}(v) &= 0 \quad \forall v \in U(\mathcal{R}) \\ \Sigma^{(i)} &= c(u^{(i)}) \\ C &= g_{\boldsymbol \Sigma} \\ a_p(\dot{u}^{(i, n)}, v) + \dot{a}_p(u^{(i)}, v, e^{(n)}) &= 0 \quad \forall v \in U(\mathcal{R})\\ \dot{\Sigma}^{(i, n)} &= c(\dot{u}^{(i, n)}) \\ (\nabla_p C)^{(n)} &= \dot{g}_{\boldsymbol \Sigma}(\dot{\boldsymbol \Sigma}^{(n)}) \end{align*} |
forward (strong form)
\begin{align*}
&\begin{cases}
-\nabla \cdot{} m \nabla u^{(i)} = 0 \quad &\forall x \in \mathcal{R}\\
u^{(i)} = g^{(i)} \quad &\forall x \in \partial\mathcal{R}
\end{cases}\\
&\Sigma^{(i, j)} = \int_{\mathcal{R}} h^{(j)} u^{(i)} dx\\
&C = \frac{1}{2 IJ}\sum_{i, j=1}^{I,J} (\Sigma^{(i, j)} - \tilde{\Sigma}^{(i, j)})^2
\end{align*}
|
|
forward solutions $u^{(i)}$ |
measurements $\Sigma^{(i, j)}$ |
|
adjoint forward (strong form)
\begin{align*}
&\begin{cases}
-\nabla \cdot{} m \nabla \lambda^{(j)} = -h^{(j)} &\forall x \in \mathcal{R} \\
\lambda^{(j)} = 0 \quad &\forall x \in \partial \mathcal{R}
\end{cases}\\
&\Sigma^{(i, j)} = \int_{\partial \mathcal{R}} m \, \nabla_n \lambda^{(j)} g^{(i)} \, ds\\
&C = \frac{1}{2 I J} \sum_{i, j=1}^{I, J} (\Sigma^{(i, j)} - \tilde{\Sigma}^{(i, j)})^2
\end{align*}
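A discrete analogue of the two formulations above, as a 1D finite-difference sketch with $m = 1$ (grid size, data, and names are assumptions; in the discrete system, $\lambda^T b$ plays the role of the boundary-flux integral over $\partial\mathcal{R}$):

```python
import numpy as np

# 1D finite-difference sketch (assumed, m = 1) of the two formulations:
# I forward solves for u^(i) vs. J adjoint-forward solves for lambda^(j).
N = 80
dx = 1.0 / (N + 1)
A = (2 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)) / dx**2

I, J = 4, 2
rng = np.random.default_rng(5)
g = rng.normal(size=(I, 2))    # Dirichlet data (g(0), g(1)) per source i
h = rng.normal(size=(J, N))    # detector functions h^(j) on the grid

def b(g_i):
    # Dirichlet data folded into the right-hand side
    r = np.zeros(N)
    r[0], r[-1] = g_i[0] / dx**2, g_i[1] / dx**2
    return r

S_fwd = np.empty((J, I))       # forward: I solves for u^(i)
for i in range(I):
    u = np.linalg.solve(A, b(g[i]))
    for j in range(J):
        S_fwd[j, i] = dx * h[j] @ u          # int_R h^(j) u^(i) dx

S_adj = np.empty((J, I))       # adjoint forward: J solves for lambda^(j)
for j in range(J):
    lam = np.linalg.solve(A.T, dx * h[j])
    for i in range(I):
        S_adj[j, i] = lam @ b(g[i])          # discrete boundary-flux pairing

ok = np.allclose(S_fwd, S_adj)
print(ok)
```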
|
adjoint forward $\lambda^{(j)}$ |
|
adjoint derivative (strong form)
\begin{align*}
&\bar{\Sigma}^{(i, j)} = \frac{1}{IJ} (\Sigma^{(i, j)} - \tilde{\Sigma}^{(i, j)}) \bar{C}\\
&\begin{cases}
-\nabla \cdot{} m \nabla \bar{\lambda}^{(j)} = 0 \quad &\forall x \in \mathcal{R} \\
\bar{\lambda}^{(j)} = \sum_{i=1}^{I} \bar{\Sigma}^{(i, j)} g^{(i)} \quad &\forall x \in \partial \mathcal{R}\\
\end{cases} \\
&\bar{m} = \sum_{j=1}^{J} \nabla \bar{\lambda}^{(j)} \cdot{} \nabla \lambda^{(j)} \\
\end{align*}
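The gradient computation can be sketched in a discrete 1D setting and verified against finite differences. This is a hedged sketch, not the authors' implementation: it discretizes $-\partial_x (m \, \partial_x u) = 0$ with interface coefficients $m$, evaluates $\Sigma^{(i,j)}$ with adjoint-forward solves, and accumulates $\bar{m}$ from cell-wise products of $\nabla u^{(i)}$ and $\nabla \lambda^{(j)}$ weighted by $\bar{\Sigma}^{(ij)}$, which is algebraically the same as first forming $\bar{\lambda}^{(j)} = \sum_i \bar{\Sigma}^{(ij)} u^{(i)}$; signs follow the discrete system rather than the strong form above:

```python
import numpy as np

# Discrete 1D gradient check (assumed sketch): -d/dx(m du/dx) = 0 with
# Dirichlet data and interface coefficients m[0..N]. All names and data
# are illustrative.
N, I, J = 60, 3, 2
dx = 1.0 / (N + 1)
rng = np.random.default_rng(4)

def assemble(m):
    # tridiagonal stiffness matrix for the N interior nodes
    A = np.zeros((N, N))
    for k in range(N):
        A[k, k] = (m[k] + m[k + 1]) / dx**2
        if k > 0:
            A[k, k - 1] = -m[k] / dx**2
        if k < N - 1:
            A[k, k + 1] = -m[k + 1] / dx**2
    return A

def rhs(m, g_i):
    # Dirichlet data g_i = (g(0), g(1)) folded into the right-hand side
    b = np.zeros(N)
    b[0], b[-1] = m[0] * g_i[0] / dx**2, m[-1] * g_i[1] / dx**2
    return b

g = rng.normal(size=(I, 2))             # boundary sources g^(i)
hfun = rng.normal(size=(J, N))          # detector functions h^(j)
m_true = 1.0 + rng.uniform(size=N + 1)  # "true" material
m0 = np.ones(N + 1)                     # current iterate

def solve_forward(m):
    A = assemble(m)
    U = np.empty((I, N + 2))            # solutions incl. boundary values
    for i in range(I):
        U[i, 1:-1] = np.linalg.solve(A, rhs(m, g[i]))
        U[i, 0], U[i, -1] = g[i]
    S = dx * hfun @ U[:, 1:-1].T        # S[j, i] = <h^(j), u^(i)>
    return S, U

S_tilde, _ = solve_forward(m_true)      # synthetic measurements

def cost_and_grad(m):
    S, U = solve_forward(m)
    C = 0.5 / (I * J) * np.sum((S - S_tilde) ** 2)
    Sbar = (S - S_tilde) / (I * J)      # bar{Sigma}, with bar{C} = 1
    A = assemble(m)
    mbar = np.zeros(N + 1)
    for j in range(J):
        lam = np.linalg.solve(A.T, dx * hfun[j])   # adjoint forward
        dL = np.diff(np.concatenate(([0.0], lam, [0.0])))
        for i in range(I):
            # cell-wise product of solution and adjoint differences;
            # sign follows the discrete convention A^T lam = dx * h
            mbar -= Sbar[j, i] * np.diff(U[i]) * dL / dx**2
    return C, mbar

C0, grad = cost_and_grad(m0)

# central finite-difference check of one gradient component
eps, k = 1e-5, N // 2
mp, mm = m0.copy(), m0.copy()
mp[k] += eps
mm[k] -= eps
fd = (cost_and_grad(mp)[0] - cost_and_grad(mm)[0]) / (2 * eps)
ok = np.isclose(fd, grad[k], rtol=1e-4)
print(ok)
```

The finite-difference check confirms the discrete adjoint gradient to within truncation error of the central difference.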
|
gradient $\bar{m}$ |
|
measurements $\Sigma^{(i, j)}$ and $\tilde{\Sigma}^{(i, j)}$ |
optimized material $m$ |
objective $\text{MSE}(\Sigma^{(i, j)}, \tilde{\Sigma}^{(i,j)})$ | true material $\tilde{m}$
|
|
forward $\psi^{(i)}$ |
adjoint forward $\lambda^{(j)}$ |
gradient $\bar{\rho}_e$ |
forward $\psi^{(i)}$ |
adjoint forward $\lambda^{(j)}$ |
cost: $I \times \mathcal{C}(a) \sim I \times 2\,\text{min}$ | cost: $J \times \mathcal{C}(a) \sim J \times 2\,\text{min}$ |
|
adjoint derivative $\bar{\lambda}^{(j)}$ |
gradient $\bar{\rho}_e$ |