for a fluctuation in u to propagate from ${x}^{\left(0\right)}$ to $x$, and $A\left(x,\omega \right)$ represents its amplitude. The details of the ray solution depend on $\mathcal{L}\left(m\right)$ ; we consider the simple (and common) case $\mathcal{L}\left(m\right)={s}^{2}{\nabla }^{2}$, where $s\left(x,m\right)$ is a slowness function; that is, a material property that is inversely proportional to the local propagation velocity. Inserting (4) into the differential equation and equating equal powers of $\omega$ lead to the Eikonal equation for $T\left(x\right)$ :

$\nabla T\cdot \nabla T={s}^{2}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{with}\text{\hspace{0.17em}}\text{boundary}\text{\hspace{0.17em}}\text{condition}\text{\hspace{0.17em}}\text{ }T\left({x}^{\left(0\right)}\right)=0$ (7)

and a sequence of equations for ${A}^{\left(k\right)}$, the lowest order of which is the transport equation  :

$2\nabla T\cdot \nabla {A}^{\left(0\right)}+{A}^{\left(0\right)}{\nabla }^{2}T=0$ (8)

The unit normal to a surface of equal travel time is $\stackrel{^}{t}\left(x\right)={s}^{-1}\nabla T$. A sequence of these vectors connecting surfaces of increasing travel times defines a ray; that is, a parametric curve $x\left(\mathcal{l}\right)$ with arclength $\mathcal{l}$ and tangent $\stackrel{^}{t}\left(\mathcal{l}\right)$ (Figure 1(A)). The volume enclosed by a group of rays is called a ray tube. The Eikonal equation, written as two coupled first order equations in $x\left(\mathcal{l}\right)$ and $\stackrel{^}{t}\left(\mathcal{l}\right)$ is:

$\frac{\text{d}x}{\text{d}\mathcal{l}}=\stackrel{^}{t}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{and}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\frac{\text{d}\stackrel{^}{t}}{\text{d}\mathcal{l}}=\stackrel{^}{t}×\left({s}^{-1}\nabla s×\stackrel{^}{t}\right)$

with boundary conditions

$x\left(0\right)={x}^{\left(0\right)}$ and $\stackrel{^}{t}\left(0\right)={\stackrel{^}{t}}^{\left(0\right)}$ (9)

The ray’s starting point is ${x}^{\left(0\right)}$ and its take-off direction is ${\stackrel{^}{t}}^{\left(0\right)}$. Travel time is then the path integral of the slowness along the ray, as can be seen by combining the formula for the directional derivative $\text{d}/\text{d}\mathcal{l}=\stackrel{^}{t}\cdot \nabla$ with the relation $\stackrel{^}{t}\cdot \nabla T=s$, which follows from $\stackrel{^}{t}={s}^{-1}\nabla T$ and the Eikonal Equation (7):

$T\left(x\left(\mathcal{l}\right)\right)={\int }_{0}^{\mathcal{l}}\left(\stackrel{^}{t}\cdot \nabla T\right)\text{d}{\mathcal{l}}^{\prime }={\int }_{0}^{\mathcal{l}}s\left({\mathcal{l}}^{\prime }\right)\text{d}{\mathcal{l}}^{\prime }$ (10)
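The ray equations (9) and the travel time integral (10) can be stepped forward together. The following is a minimal sketch, not a production ray tracer: it assumes a simple forward Euler scheme with fixed step `dl`, and the function name `trace_ray` and the user-supplied callables `slowness` and `grad_slowness` are hypothetical names introduced only for illustration.

```python
import numpy as np

def trace_ray(x0, t0, slowness, grad_slowness, dl=0.01, n_steps=1000):
    """Integrate dx/dl = t and dt/dl = t x (grad(s)/s x t) with forward Euler.

    Returns the ray path (n_steps + 1 points) and the travel time, which is
    accumulated as the sum of s * dl along the ray, as in Equation (10).
    """
    x = np.asarray(x0, dtype=float).copy()
    t = np.asarray(t0, dtype=float)
    t = t / np.linalg.norm(t)           # take-off direction must be a unit vector
    T = 0.0
    path = [x.copy()]
    for _ in range(n_steps):
        s = slowness(x)
        g = grad_slowness(x) / s        # s^{-1} grad(s)
        dt_dl = np.cross(t, np.cross(g, t))
        T += s * dl                     # left Riemann sum of the slowness
        x = x + t * dl
        t = t + dt_dl * dl
        t = t / np.linalg.norm(t)       # re-normalize to limit drift
        path.append(x.copy())
    return np.array(path), T
```

In a homogeneous medium the gradient term vanishes, so the ray is a straight line and the accumulated travel time is the slowness times the arc length. A higher-order integrator would normally be preferred when the slowness varies rapidly.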

The transport equation, written in terms of $\stackrel{^}{t}\left(\mathcal{l}\right)$, is:

$-\frac{\nabla \mathcal{E}}{\mathcal{E}}\cdot \stackrel{^}{t}=\nabla \cdot \stackrel{^}{t}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{with}\text{\hspace{0.17em}}\text{ }\mathcal{E}\equiv s{\left({A}^{\left(0\right)}\right)}^{2}$ (11)

Figure 1. (A) Basic ray theory nomenclature. Waves propagate outward from a source at ${x}^{\left(0\right)}$ (black circle), through the medium, to the surface ${x}_{S}$ (with normal $\stackrel{^}{n}$ ). Surfaces of equal travel time (wave fronts, grey curves) are labeled with their travel times ${T}_{1}$, ${T}_{2}$, etc. Normals to wave fronts define rays (blue curves) with tangents $\stackrel{^}{t}$. Neighboring rays enclosing a solid angle $\text{d}\Omega$ at the source define a ray tube. (B) Relationship between ray tangents $\stackrel{^}{t}$ and ray tube cross-sectional area S. Gauss’s theorem is applied to a small volume V along the ray tube, with the shape of a section of a cone, whose cross-sectional area S changes with arc-length $\mathcal{l}$ and whose volume is $V=S\text{d}\mathcal{l}$. The tangent $\stackrel{^}{t}$ is parallel to the sides of the section and normal to its ends. See text for further discussion.

The quantity $\nabla \cdot \stackrel{^}{t}$ has a simple geometric interpretation, as can be seen by applying Gauss’s theorem (e.g. ) to a volume V along a ray tube, which has the shape of a section of a cone (Figure 1(B)). The cross-sectional area of the ray tube increases from S on the end nearest to the source, to $S+\text{d}S$ at a distance $\text{d}\mathcal{l}$ further away. For small volumes, the volume integral in Gauss’s theorem is $\left(\nabla \cdot \stackrel{^}{t}\right)V$ where $V=S\text{d}\mathcal{l}$. Because $\stackrel{^}{t}$ is parallel to the sides of the cone, the surface integral in Gauss’s theorem has contributions only from the two ends, of $-S$ and $\left(S+\text{d}S\right)$ respectively, which sum to dS. Consequently, Gauss’s theorem implies $\left(\nabla \cdot \stackrel{^}{t}\right)={S}^{-1}\text{d}S/\text{d}\mathcal{l}$ and the transport equation becomes:

$-\frac{1}{\mathcal{E}}\frac{\text{d}\mathcal{E}}{\text{d}\mathcal{l}}=\frac{1}{S}\frac{\text{d}S}{\text{d}\mathcal{l}}$ (12)

According to the transport equation, the fractional decrease in $\mathcal{E}$, measured along a ray, is equal to the fractional increase in area S of the ray tube. In many cases, the quantity $\mathcal{E}$ has the interpretation of the energy density, so the transport equation embodies conservation of energy. Conventionally, the area of the ray tube is written $S\left(\mathcal{l}\right)={R}^{2}\left(\mathcal{l}\right)\text{d}\Omega$, where ${R}^{2}\left(\mathcal{l}\right)$ is the geometrical spreading function and $\text{d}\Omega$ is the solid angle subtended by the ray tube at the source (e.g. ). Consequently, $\mathcal{E}\left(\mathcal{l}\right)=c{R}^{-2}\left(\mathcal{l}\right)$ where c is a constant. Ray-tracing algorithms that solve (9) typically tabulate both T and R (e.g.  ).
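The statement that $\mathcal{E}S$ is conserved along a ray tube can be checked numerically. The sketch below is illustrative only: it assumes the ray tube area is tabulated at discrete arc lengths, and the function name `integrate_transport` and the choice of a midpoint (Crank-Nicolson type) step are assumptions made here, not part of any cited algorithm.

```python
import numpy as np

def integrate_transport(l, S, E0):
    """Step (1/E) dE/dl = -(1/S) dS/dl along a ray with tabulated area S(l).

    l  : increasing arc lengths
    S  : ray tube cross-sectional area at each arc length
    E0 : value of E at l[0]
    """
    l = np.asarray(l, dtype=float)
    S = np.asarray(S, dtype=float)
    E = np.empty_like(l)
    E[0] = E0
    for i in range(len(l) - 1):
        h = l[i + 1] - l[i]
        # midpoint estimate of the fractional rate of change of the area
        rate = (S[i + 1] - S[i]) / (0.5 * (S[i + 1] + S[i]) * h)
        # implicit midpoint step for dE/dl = -rate * E
        E[i + 1] = E[i] * (1.0 - 0.5 * h * rate) / (1.0 + 0.5 * h * rate)
    return E
```

With this particular discretization each step reduces algebraically to $\mathcal{E}_{i+1}=\mathcal{E}_{i}S_{i}/S_{i+1}$, so the product $\mathcal{E}S$ is preserved to rounding error. For a spherically spreading tube with $S={r}^{2}\text{d}\Omega$, the result is proportional to ${r}^{-2}$.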

4. Adjoint Equation for Travel Time Tomography

The main purpose of this section is to derive and solve the adjoint equation needed to compute the quantity ${H}_{j}$, the derivative of the total travel time error with respect to a model parameter controlling the slowness of the medium. Our derivation focuses on expressing the equation in terms of quantities that vary along rays, so that it can be readily compared to the transport Equation (11). Our derivation is equivalent to, but different than, the one by , being a direct application of perturbation theory, as contrasted to one that employs Lagrange multipliers.

In travel time tomography, travel time observations $T\left({x}^{\left(j\right)}\right)$ are considered to be the data, and the slowness $s\left(x\right)$, or rather its approximation $s\left(x,m\right)$, is the image function. In order to apply the adjoint methodology as outlined in the Introduction, the non-linear Eikonal equation must be linearized about a “background” solution. Let the slowness equal a background slowness ${s}_{0}$ plus a small perturbation $\epsilon {s}_{1}$, where $\epsilon$ is a small parameter, and the corresponding travel time equal a background travel time ${T}_{0}$ plus a small perturbation $\epsilon {T}_{1}$. Then to first order in $\epsilon$, the Eikonal equation becomes:

$\nabla T\cdot \nabla T=\nabla \left({T}_{0}+\epsilon {T}_{1}\right)\cdot \nabla \left({T}_{0}+\epsilon {T}_{1}\right)={\left({s}_{0}+\epsilon {s}_{1}\right)}^{2}\approx {s}_{0}^{2}+2\epsilon {s}_{0}{s}_{1}$ (13)

Equating terms of equal order in $\epsilon$ yields equations for the background travel time ${T}_{0}$ and the perturbation in travel time ${T}_{1}$ :

$\nabla {T}_{0}\cdot \nabla {T}_{0}={s}_{0}^{2}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{and}\text{\hspace{0.17em}}\text{\hspace{0.17em}}{\stackrel{^}{t}}_{0}\cdot \nabla {T}_{1}={s}_{1}$ (14a,b)

Equation (14b) indicates that the component of $\nabla {T}_{1}$ in the direction of the background ray direction ${\stackrel{^}{t}}_{0}$ is ${s}_{1}$. Since ${s}_{1}$ plays the role of the source term in the differential equation, the formulation in (3) is applicable. If we define $\text{d}\mathcal{l}$ to be an increment of arc length along the unperturbed ray, then this is just an equation involving the directional derivative $\text{d}/\text{d}\mathcal{l}={\stackrel{^}{t}}_{0}\cdot \nabla$ :

$\frac{\text{d}{T}_{1}}{\text{d}\mathcal{l}}={s}_{1}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{which}\text{\hspace{0.17em}}\text{has}\text{\hspace{0.17em}}\text{solution}\text{\hspace{0.17em}}{T}_{1}\left(\mathcal{l}\right)={\int }_{0}^{\mathcal{l}}{s}_{1}\left({\mathcal{l}}^{\prime }\right)\text{d}{\mathcal{l}}^{\prime }$ (15)
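Equation (15) can be evaluated numerically once the unperturbed ray is available as a sequence of points. The following sketch assumes the ray is supplied as a polyline and applies the trapezoidal rule on each segment; the names `travel_time_perturbation`, `path`, and `s1` are hypothetical and introduced only for illustration.

```python
import numpy as np

def travel_time_perturbation(path, s1):
    """Approximate T1 = integral of s1 along a ray given as a polyline.

    path : (n, 3) array of points on the unperturbed ray, in order
    s1   : callable returning the slowness perturbation at a point
    """
    path = np.asarray(path, dtype=float)
    # arc-length increment of each straight segment
    seg = np.linalg.norm(np.diff(path, axis=0), axis=1)
    vals = np.array([s1(p) for p in path])
    # trapezoidal rule: average of end values times segment length
    return float(np.sum(0.5 * (vals[1:] + vals[:-1]) * seg))
```

Because the integral is taken along the unperturbed ray, the path needs to be traced only once in the background model; only `s1` changes when a different perturbation is tested.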

The perturbation in travel time is the integral of the perturbation in slowness along the unperturbed ray. We rewrite the equation for ${T}_{1}$ as:

$\mathcal{L}{T}_{1}={s}_{1}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{where}\text{\hspace{0.17em}}\text{ }\mathcal{L}={s}_{0}^{-1}\nabla {T}_{0}\cdot \nabla ={s}_{0}^{-1}\left[\begin{array}{ccc}\frac{\partial {T}_{0}}{\partial x}& \frac{\partial {T}_{0}}{\partial y}& \frac{\partial {T}_{0}}{\partial z}\end{array}\right]\left[\begin{array}{c}\partial /\partial x\\ \partial /\partial y\\ \partial /\partial z\end{array}\right]$ (16)

Using the rules ${\left({\mathcal{L}}_{1}{\mathcal{L}}_{2}\right)}^{†}={\mathcal{L}}_{2}^{\text{T}†}{\mathcal{L}}_{1}^{\text{T}†}$ and ${\left(\text{d}/\text{d}x\right)}^{†}=-\text{d}/\text{d}x$ (e.g., ) we obtain an expression for the adjoint equation:

${\mathcal{L}}^{†}\lambda =-\left[\begin{array}{ccc}\frac{\partial }{\partial x}& \frac{\partial }{\partial y}& \frac{\partial }{\partial z}\end{array}\right]\left[\begin{array}{c}\partial {T}_{0}/\partial x\\ \partial {T}_{0}/\partial y\\ \partial {T}_{0}/\partial z\end{array}\right]\left({s}_{0}^{-1}\lambda \right)=-\nabla \cdot \left[\nabla {T}_{0}\left({s}_{0}^{-1}\lambda \right)\right]=-\nabla \cdot \left[{\stackrel{^}{t}}_{0}\lambda \right]={e}_{0}$ (17)

As is typical of first-order equations, the “left hand” boundary condition associated with $\mathcal{L}$ implies a “right hand” boundary condition for ${\mathcal{L}}^{†}$ (e.g. ); that is, while $T=0$ at the source $\mathcal{l}=0$, $\lambda =0$ at the end point of the ray $\mathcal{l}={\mathcal{l}}_{B}$ (where it touches the boundary of the medium).
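The swap of boundary conditions under the adjoint can be illustrated with a discrete analogue in one dimension. The sketch below is an assumption-laden toy, not the operator in (16): it represents $\text{d}/\text{d}\mathcal{l}$ on a uniform grid by finite-difference matrices, ignores ray divergence, and uses the hypothetical function names `backward_difference` and `forward_difference_zero_end`.

```python
import numpy as np

def backward_difference(n, h):
    """Matrix for d/dl on nodes l_i = i*h, i = 1..n, with the field zero at l = 0.

    Row i computes (T_i - T_{i-1}) / h, where T_0 = 0 is the "left hand"
    boundary condition built into the operator.
    """
    return (np.eye(n) - np.eye(n, k=-1)) / h

def forward_difference_zero_end(n, h):
    """Matrix for d/dl using forward differences, with the field zero past the last node.

    Row i computes (lam_{i+1} - lam_i) / h, where lam_{n+1} = 0 is the
    "right hand" boundary condition built into the operator.
    """
    return (np.eye(n, k=1) - np.eye(n)) / h
```

The transpose of the backward-difference matrix equals the negative of the forward-difference matrix, which mirrors the rule ${\left(\text{d}/\text{d}x\right)}^{†}=-\text{d}/\text{d}x$ and shows the zero condition moving from the start of the ray to its end.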

The adjoint Equation (17) can be further manipulated:

$\begin{array}{l}-\frac{\nabla \lambda }{\lambda }\cdot {\stackrel{^}{t}}_{0}-\frac{{e}_{0}}{\lambda }=\nabla \cdot {\stackrel{^}{t}}_{0}\text{\hspace{0.17em}}\text{ }\text{or}\text{\hspace{0.17em}}\text{ }\frac{\text{d}\lambda }{\text{d}\mathcal{l}}+P\left(\mathcal{l}\right)\lambda =Q\left(\mathcal{l}\right)\\ \text{with}\text{ }\text{\hspace{0.17em}}P\left(\mathcal{l}\right)=\frac{1}{S}\frac{\text{d}S}{\text{d}\mathcal{l}}\text{\hspace{0.17em}}\text{ }\text{and}\text{ }\text{\hspace{0.17em}}Q\left(\mathcal{l}\right)=-{e}_{0}\end{array}$ (18)

The formal solution to (18) is well-known (e.g. ):

$\lambda \left(\mathcal{l}\right)=\frac{\left(C+v\left(\mathcal{l}\right)\right)}{\mu \left(\mathcal{l}\right)}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{with}\text{\hspace{0.17em}}\mu \left(\mathcal{l}\right)=\mathrm{exp}\left\{\stackrel{\mathcal{l}}{\int }P\left({\mathcal{l}}^{\prime }\right)\text{d}{\mathcal{l}}^{\prime }\right\}\text{\hspace{0.17em}}\text{ }\text{and}\text{\hspace{0.17em}}\text{ }v\left(\mathcal{l}\right)=\stackrel{\mathcal{l}}{\int }\mu \left({\mathcal{l}}^{\prime }\right)Q\left({\mathcal{l}}^{\prime }\right)\text{d}{\mathcal{l}}^{\prime }$ (19)

Here the constant C is chosen to enforce the boundary condition $\lambda \left({\mathcal{l}}_{B}\right)=0$.
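The integrating-factor solution (19) is straightforward to evaluate when $P$ and $Q$ are tabulated along a ray. The sketch below is a minimal illustration under stated assumptions: trapezoidal quadrature on a given grid, and a hypothetical argument `lam_end` that generalizes the end condition $\lambda \left({\mathcal{l}}_{B}\right)=0$ to an arbitrary prescribed end value. The function names are introduced here only for illustration.

```python
import numpy as np

def cumulative_trapezoid(y, x):
    """Running trapezoidal integral of y(x), starting from zero at x[0]."""
    out = np.zeros_like(y, dtype=float)
    out[1:] = np.cumsum(0.5 * (y[1:] + y[:-1]) * np.diff(x))
    return out

def solve_adjoint_along_ray(l, P, Q, lam_end=0.0):
    """Solve d(lam)/dl + P*lam = Q on the grid l with lam(l[-1]) = lam_end.

    Uses lam = (C + v) / mu with mu = exp(int P dl) and v = int mu*Q dl,
    both integrals starting at l[0]; C is fixed by the end condition.
    """
    l = np.asarray(l, dtype=float)
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    mu = np.exp(cumulative_trapezoid(P, l))
    v = cumulative_trapezoid(mu * Q, l)
    C = lam_end * mu[-1] - v[-1]      # enforces lam(l_B) = lam_end
    return (C + v) / mu
```

For example, with $P=2/r$, $Q=0$ and an end value $b{r}_{B}$, the routine returns values close to $b{r}_{B}^{3}/{r}^{2}$, up to quadrature error.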

5. Analysis of the Role of the Geometrical Spreading

The main purpose of this section is to show that the solution to the adjoint equation can be constructed from the geometrical spreading function, and to interpret this result.

In any region in which ${e}_{0}=0$, the adjoint Equation (18) has the same form as the transport Equation (12). Since the error $e\left(x\right)$ is rarely known within the medium, but rather only on its boundary ${x}_{B}$, this restriction is satisfied by all commonly-encountered cases. As we will show below, the similarity of form provides considerable insight into the behavior of the adjoint field $\lambda$.

Ray divergence enters into the adjoint equation through the $\nabla \cdot {\stackrel{^}{t}}_{0}$ term. In order to highlight its contribution, we first examine a solution in which this term is zero. Consider a plane wave propagating in the z-direction through a homogeneous layer with $0\le z\le {z}_{B}$ (Figure 2). The background travel time is ${T}_{0}={s}_{0}z$, the ray direction is $\stackrel{^}{t}={s}_{0}^{-1}\nabla {T}_{0}=\stackrel{^}{z}$ and $\mathcal{l}=z$. The plane wave satisfies the background Eikonal Equation (14a), since $\nabla {T}_{0}\cdot \nabla {T}_{0}={s}_{0}^{2}\left(\stackrel{^}{z}\cdot \stackrel{^}{z}\right)={s}_{0}^{2}$. Since the rays of a plane wave do not diverge, $\nabla \cdot {\stackrel{^}{t}}_{0}=0$.

Now consider the case where the background slowness is everywhere too small by an amount b, so that the background error ${e}_{0}={T}^{obs}-{T}_{0}$ grows linearly with distance z; that is, ${e}_{0}\left(x,y,z\right)=bz$. We will assume that this error is known only on the boundary $z={z}_{B}$. Following (5), the adjoint equation is $\text{d}\lambda /\text{d}z=-b{z}_{B}\delta \left(z-{z}_{B}\right)$. Because of the Dirac impulse function, the boundary condition for $\lambda$ requires some scrutiny. We will consider that the error is defined just below the boundary, at ${z}_{B}^{-}\equiv {z}_{B}-{ϵ}^{2}$, where ${ϵ}^{2}\ll {z}_{B}$. In order to satisfy both the boundary condition $\lambda \left({z}_{B}\right)=0$ and the adjoint equation, the solution must be discontinuous at ${z}_{B}^{-}$, and in the immediate vicinity of ${z}_{B}^{-}$ it must be $\lambda \left(z\right)=b{z}_{B}^{-}H\left({z}_{B}^{-}-z\right)$. Effectively, the boundary condition is $\lambda \left({z}_{B}\right)={e}_{0}\left(x,y,{z}_{B}\right)$. The solution of the adjoint equation within the layer is $\lambda \left(z\right)=b{z}_{B}$ ; note that it does not depend upon z.

Now consider a slowness perturbation in the form of a very thin rectangular prism, centered at ${z}_{H}$, of thickness D, and having sides at ${x}_{1}$ and ${x}_{2}={x}_{1}+L$, and ${y}_{1}$ and ${y}_{2}={y}_{1}+L$ (so that its volume is $D{L}^{2}$ ). Since the prism is very thin, it can be approximated as a Dirac impulse function in depth z:

$\begin{array}{l}\epsilon {s}_{1}\left(x,y,z\right)={m}_{1}W\left(x,{x}_{1},{x}_{2}\right)W\left(y,{y}_{1},{y}_{2}\right)D\delta \left(z-{z}_{H}\right)\\ \text{with}\text{\hspace{0.17em}}\text{ }W\left(x,{x}_{1},{x}_{2}\right)\equiv H\left(x-{x}_{1}\right)H\left({x}_{2}-x\right)\end{array}$ (20)

Here, $H\left(.\right)$ is the Heaviside function, which is unity when its argument is positive and zero otherwise. The partial derivative of total error is:

$\begin{array}{c}{H}_{1}=-2\left(\frac{\text{d}\left(\epsilon {s}_{1}\right)}{\text{d}{m}_{1}},\lambda \right)\\ =-2D\iiint W\left(x,{x}_{1},{x}_{2}\right)W\left(y,{y}_{1},{y}_{2}\right)\delta \left(z-{z}_{H}\right)b{z}_{B}\text{d}x\text{d}y\text{d}z\\ =-2b{z}_{B}D{L}^{2}\end{array}$ (21)

As expected, ${H}_{1}<0$, since increasing ${m}_{1}$ lowers the error. Also as expected, ${H}_{1}$ is proportional to the area ${L}^{2}$ of the prism, since the larger its area, the larger the region to which the slowness perturbation is applied. Interestingly, ${H}_{1}$ is independent of the position ${z}_{H}$ of the prism; that is, the prism can be moved up or down without affecting the error. As we will show below, this insensitivity to position is due to the absence of ray divergence in this plane wave case.

We now consider a spherical wave propagating in the r-direction through a homogeneous sphere with $0\le r\le {r}_{B}$ (Figure 3), described by spherical polar coordinates $\left(r,\theta ,\phi \right)$. The background travel time is ${T}_{0}={s}_{0}r$, the ray direction is $\stackrel{^}{t}={s}_{0}^{-1}\nabla {T}_{0}=\stackrel{^}{r}$ and $\mathcal{l}=r$. The spherical wave satisfies the background Eikonal Equation (14a), since $\nabla {T}_{0}\cdot \nabla {T}_{0}={s}_{0}^{2}\left(\stackrel{^}{r}\cdot \stackrel{^}{r}\right)={s}_{0}^{2}$. The area of a ray tube is

Figure 2. Rays (blue) of a plane wave cross a layer with bottom and top surfaces at $z=0$ and $z={z}_{B}$, respectively. A prismatic slowness perturbation (red rectangle) is placed within the layer, with left and right edges at $x={x}_{1}$ and $x={x}_{2}$, respectively. The travel time error $e\left(x,{z}_{B}\right)$, measured on the upper surface (top plot), is reduced in the region ${x}_{1} < x < {x}_{2}$ where the rays project the prism. Because the rays do not diverge, the size of this region is independent of the depth of the perturbation.

Figure 3. Rays (blue) of a spherical wave start at a source at the center of the sphere at $r=0$ and propagate outward through the sphere to its surface at $r={r}_{B}$. A slowness perturbation with the shape of a spherical cap (red cap) is placed within the sphere at radius ${r}_{H}$, with left and right edges at polar angle $-{\theta }_{H}$ and $+{\theta }_{H}$, respectively. The travel time error $e\left(\theta ,{r}_{B}\right)$, measured on the upper surface, is reduced in the region where the perturbation is projected by the rays (graph at top).

$S={r}^{2}\text{d}\Omega$, from whence we conclude that the geometrical spreading function is ${R}^{2}\left(r\right)={r}^{2}$ and the ray divergence is $\nabla \cdot {\stackrel{^}{t}}_{0}={S}^{-1}\text{d}S/\text{d}\mathcal{l}=2/r$. As in the plane wave case, the background slowness is everywhere too small by an amount b, leading to a background error ${e}_{0}\left(r,\theta ,\phi \right)=br$. We will assume that this error is known only on the boundary $r={r}_{B}$. The adjoint Equation (18) reduces to:

$\frac{\text{d}\lambda }{\text{d}r}+\frac{2}{r}\lambda =0\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{with}\text{\hspace{0.17em}}\text{boundary}\text{\hspace{0.17em}}\text{condition}\text{\hspace{0.17em}}\text{ }\lambda \left({r}_{B},\theta ,\phi \right)=b{r}_{B}$ (22)

The solution is $\lambda \left(r,\theta ,\phi \right)=\left(b{r}_{B}\right)\left({r}_{B}^{2}/{r}^{2}\right)$. As is asserted in the Introduction, the solution to this transport-like equation is related to the geometrical spreading function by $\lambda \left(r,\theta ,\phi \right)\propto {R}^{-2}$.

Now consider a slowness perturbation in the form of a very thin spherical cap of fixed thickness D, centered at ${r}_{H}$ and ${\phi }_{H}=0$ and subtending a variable polar angle ${\theta }_{H}$ such that its area is fixed as ${L}^{2}=2\pi {r}_{H}^{2}\left(1-\mathrm{cos}{\theta }_{H}\right)$ :

$\epsilon {s}_{1}\left(x,y,z\right)={m}_{1}H\left({\theta }_{H}-\theta \right)D\delta \left(r-{r}_{H}\right)$ (23)

For a position ${r}_{H}$ away from the origin where a spherical cap of thickness D and area ${L}^{2}$ is possible, the partial derivative of total error is:

$\begin{array}{c}{H}_{1}=-2\left(\frac{\text{d}\left(\epsilon {s}_{1}\right)}{\text{d}{m}_{1}},\lambda \right)\\ =-2b{r}_{B}^{3}D\iiint \left[H\left({\theta }_{H}-\theta \right)\delta \left(r-{r}_{H}\right)\right]\left[{r}^{-2}\right]{r}^{2}\mathrm{sin}\theta \text{d}r\text{d}\theta \text{d}\phi \\ =-2b{r}_{B}^{3}D\iint H\left({\theta }_{H}-\theta \right)\mathrm{sin}\theta \text{d}\theta \text{d}\phi \int \delta \left(r-{r}_{H}\right)\text{d}r\\ =\left(-2bD{r}_{B}^{3}\right)2\pi \left(1-\mathrm{cos}{\theta }_{H}\right)\frac{{r}_{H}^{2}}{{r}_{H}^{2}}=\frac{\left(-2b{r}_{B}^{3}D\right){L}^{2}}{{r}_{H}^{2}}\\ =-2b{r}_{B}D{L}^{2}{\left(\frac{{r}_{B}}{{r}_{H}}\right)}^{2}=-2b{r}_{B}D{L}^{2}\frac{{R}^{2}\left({r}_{B}\right)}{{R}^{2}\left({r}_{H}\right)}\end{array}$ (24)

The spherical wave solution (24) differs from the plane wave solution (21) by the factor ${R}^{2}\left({r}_{B}\right)/{R}^{2}\left({r}_{H}\right)$, the squared ratio of the geometrical spreading functions evaluated at the surface and at the heterogeneity. The area on the surface of the sphere onto which the rays project the cap, ${L}^{2}{\left({r}_{B}/{r}_{H}\right)}^{2}$, decreases with the cap’s radius ${r}_{H}$; a cap closer to the source therefore reduces the error $e\left({r}_{B},\theta ,\phi \right)$ over a wider region (Figure 4). This example illustrates the importance of geometric spreading on the amplitude of the adjoint field and on the effectiveness of a given perturbation to reduce the error E. Given several perturbations of equal size, the most effective is the one whose projection on the boundary, by rays interacting with it, is the largest.
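The closed form in Equation (24) can be cross-checked by direct quadrature of the inner product. The sketch below makes several assumptions that are not in the derivation above: the Dirac impulse is replaced by a shell of finite thickness D, the integrals are approximated with the midpoint rule, and the function name `cap_H1` and its arguments are hypothetical.

```python
import numpy as np

def cap_H1(b, r_B, r_H, D, L2, n=400):
    """Numerically evaluate H1 = -2 (d(eps*s1)/dm1, lambda) for a spherical cap.

    The adjoint field is lambda = b * r_B**3 / r**2 and the cap has
    thickness D, radius r_H and fixed area L2 = 2*pi*r_H**2*(1 - cos(theta_H)).
    """
    cos_th = 1.0 - L2 / (2.0 * np.pi * r_H**2)
    if not -1.0 <= cos_th <= 1.0:
        raise ValueError("a cap of area L2 does not fit on a sphere of radius r_H")
    theta_H = np.arccos(cos_th)
    # midpoint nodes in radius (across the shell) and in polar angle
    dr = D / n
    r = (r_H - 0.5 * D) + (np.arange(n) + 0.5) * dr
    dth = theta_H / n
    th = (np.arange(n) + 0.5) * dth
    lam = b * r_B**3 / r**2
    radial = np.sum(lam * r**2) * dr                  # integral over r
    angular = 2.0 * np.pi * np.sum(np.sin(th)) * dth  # integral over theta and phi
    return -2.0 * radial * angular
```

Evaluating the function for two caps of equal area and thickness at different radii gives values whose ratio is the squared ratio of the radii, consistent with the factor ${\left({r}_{B}/{r}_{H}\right)}^{2}$ in (24).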

Although the adjoint field is singular at the source (ray starting point) $r=0$, the partial derivative ${H}_{k}$ is finite there, as can be seen by considering a spherical heterogeneity of radius ${r}_{H}$ centered on the origin of the form $\epsilon {s}_{1}\left(x,y,z\right)={m}_{1}H\left({r}_{H}-r\right)$ :

$\begin{array}{c}{H}_{1}=-2\left(\frac{\text{d}\left(\epsilon {s}_{1}\right)}{\text{d}{m}_{1}},\lambda \right)=-2b{r}_{B}^{3}\iiint \left[H\left({r}_{H}-r\right)\right]\left[{r}^{-2}\right]\text{d}r\text{ }r\text{d}\theta r\mathrm{sin}\theta \text{d}\phi \\ =-2b{r}_{B}^{3}\iint \mathrm{sin}\theta \text{d}\theta \text{d}\phi \underset{0}{\overset{{r}_{H}}{\int }}\text{d}r=\left(-2b{r}_{B}^{3}\right)\left(4\pi \right){r}_{H}=-8\pi b{r}_{B}^{3}{r}_{H}\end{array}$ (25)

When the background slowness ${s}_{0}\left(x\right)$ is spatially varying, the rays have a complicated spatial pattern and the background error ${e}_{0}\left({x}_{B}\right)$, measured on the boundary ${x}_{B}$, is spatially varying. Suppose that the medium has a surface ${x}_{B}$ with outward pointing normal ${\stackrel{^}{n}}_{B}\left({x}_{B}\right)$. A ray connecting an interior point $x$ to ${x}_{B}$ can be labeled by ${x}_{B}$. Then, ${x}_{B}\left(x\right)$ means the point on a boundary at which a ray passing through $x$ ends, and arc-length $\mathcal{l}\left(x,{x}_{B}\right)$ means the distance at $x$ along a ray that ends at ${x}_{B}\left(x\right)$. Similarly, the geometrical spreading

Figure 4. Rays (blue) of a spherical wave, as in Figure 3. One of two alternate slowness perturbations (green and red caps) is placed within the sphere, at radii ${r}_{{H}_{1}}$ and ${r}_{{H}_{2}}$, respectively, with ${r}_{{H}_{1}}<{r}_{{H}_{2}}$. These caps have equal area ${L}^{2}$ and equal thickness D. The travel time error $e\left(\theta ,{r}_{B}\right)$, measured on the surface of the sphere, is reduced in the region where the perturbation is projected by the rays (green and red curves in top plot). The reduction in error in this region is the same in both cases, because the thicknesses of the perturbations are equal. However, because the rays diverge, the size of the affected region is larger for the perturbation at ${r}_{{H}_{1}}$.

function can be written as $R\left(x,{x}_{B}\right)$ ; that is, the geometrical spreading function at $x$ associated with the ray that ends at ${x}_{B}$. The adjoint field is then:

$\lambda \left(x\right)=\frac{{e}_{0}\left({x}_{B}\right)}{\stackrel{^}{t}\left({x}_{B}\right)\cdot {\stackrel{^}{n}}_{B}\left({x}_{B}\right)}\frac{{R}^{2}\left({x}_{B},{x}_{B}\right)}{{R}^{2}\left(x,{x}_{B}\right)}$ (26)

Here, the dot product between the ray tangent and surface normal is introduced to account for the increased surface area intersected by the ray tube, in the case (unlike the examples above) where the ray tube obliquely impinges upon the boundary. Now, suppose that the slowness perturbation is represented with voxels, where voxel k has volume ${V}_{k}$, amplitude ${m}_{k}$, and centroid position ${x}^{\left(k\right)}$. When the adjoint field varies slowly compared to the length scale of a voxel (a requirement that excludes the source point), the error derivative is:

${H}_{k}=-2\left(\frac{\text{d}\left(\epsilon {s}_{1}\right)}{\text{d}{m}_{k}},\lambda \right)\approx -2{V}_{k}\frac{{e}_{0}\left({x}_{B}\right)}{\stackrel{^}{t}\left({x}_{B}\right)\cdot {\stackrel{^}{n}}_{B}\left({x}_{B}\right)}\frac{{R}^{2}\left({x}_{B},{x}_{B}\right)}{{R}^{2}\left({x}^{\left(k\right)},{x}_{B}\right)}$ (27)

Here ${x}_{B}$ is the end point of the ray passing through ${x}^{\left(k\right)}$. This result emphasizes the link between the geometrical spreading function R and the partial derivative of total error E. (When the voxel is close to, or overlaps the origin, ${H}_{k}$ is still well-defined and finite, but the inner product in (27) must be computed appropriately).
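Equation (27) translates directly into a few lines of array code once a ray tracer has tabulated the geometrical spreading. The sketch below is illustrative only; the function name `voxel_error_derivative` and its argument names are assumptions introduced here, and the inputs are presumed to have been computed elsewhere for the ray through each voxel centroid.

```python
import numpy as np

def voxel_error_derivative(V_k, e0_B, cos_incidence, R_B, R_k):
    """Evaluate H_k = -2 V_k [e0 / (t . n)] [R_B / R_k]**2, as in Equation (27).

    V_k           : voxel volume(s)
    e0_B          : background error at the ray end point on the boundary
    cos_incidence : dot product of ray tangent and outward normal at the end point
    R_B           : geometrical spreading R at the ray end point
    R_k           : geometrical spreading R at the voxel centroid
    Scalars or broadcastable arrays are accepted, one entry per voxel.
    """
    V_k = np.asarray(V_k, dtype=float)
    e0_B = np.asarray(e0_B, dtype=float)
    cos_incidence = np.asarray(cos_incidence, dtype=float)
    R_B = np.asarray(R_B, dtype=float)
    R_k = np.asarray(R_k, dtype=float)
    return -2.0 * V_k * (e0_B / cos_incidence) * (R_B / R_k) ** 2
```

As a consistency check, setting ${V}_{k}=D{L}^{2}$, ${e}_{0}=b{r}_{B}$, normal incidence, and $R=r$ reproduces the spherical cap result (24). The approximation is not expected to hold for voxels at or near the source, where the adjoint field varies rapidly.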

6. Conclusion

The key result in this paper is the demonstration that the adjoint equation in ray-based travel time tomography has the same form as the well-known transport equation for ray theoretical amplitudes. Consequently, the spatial variation of the adjoint field $\lambda$ is completely controlled by the geometrical spreading function R. This result provides an intuitive understanding of the primary factor controlling the size of the partial derivative ${H}_{j}=\partial E/\partial {m}_{j}$ of total ${L}_{2}$ error E with respect to the slowness ${m}_{j}$ of a voxel. The partial derivative ${H}_{j}$ is large when ray divergence causes the projection of the voxel on the measurement surface to be large. Since this result provides an explicit formula for $\lambda$ in terms of R, it enables ${H}_{j}$ to be calculated without resorting to the numerical solution of the adjoint equation. Only an inner product needs to be calculated, and in the case of a voxel parameterization of the slowness image, it can be calculated trivially.

Acknowledgements

I thank the graduate students who participated in Columbia University’s 2017 Seminar in Adjoint Methods for helpful discussion.

Conflicts of Interest

The author declares no conflicts of interest regarding the publication of this paper.

References

Aki, K., Christoffersson, A. and Husebye, E.S. (1976) Three-Dimensional Seismic Structure of the Lithosphere under Montana Lasa. Bulletin of the Seismological Society of America, 66, 501-524.

Menke, W. (1977) Lateral Inhomogeneities in P Velocity under the Tarbella Array of the Lesser Himalayas of Pakistan. Bulletin of the Seismological Society of America, 67, 725-734.

Munk, W., Worcester, P. and Wunsch, C. (1995) Ocean Acoustic Tomography. Cambridge University Press, Cambridge, 433 p.

Justice, J.H., Vassiliou, A.A., Singh, S., Logel, J.D., Hansen, P.A., Hall, B.R., Hurt, P.R. and Solanki, J.J. (1989) Acoustic Tomography for Monitoring Enhanced Oil Recovery. Leading Edge, 8, 12-19. https://doi.org/10.1190/1.1439605

Santamarina, J.C. (1994) An Introduction to Geotomography. In: Woods, R.D., Ed., Geophysical Characterization of Sites, Oxford & IBH Publishing Co., New Delhi, 35-44.

Müller, T. (2014) GeoViS—Relativistic Ray Tracing in Four-Dimensional Spacetimes. Computer Physics Communications, 185, 2301-2308. https://doi.org/10.1016/j.cpc.2014.04.013

Nolet, G. (1987) Seismic Wave Propagation and Seismic Tomography. In: Nolet, G., Ed., Seismic Tomography with Applications in Global Seismology and Exploration Geophysics, Springer, New York, 1-23.

Dahlen, F., Hung, S.-H. and Nolet, G. (2002) Fréchet Kernels for Finite-Frequency Travel Times—I. Theory. Geophysical Journal International, 141, 157-174. https://doi.org/10.1046/j.1365-246X.2000.00070.x

Hall, M.C.G., Cacuci, D.G. and Schlesinger, M.E. (1982) Sensitivity Analysis of a Radiative-Convective Model by the Adjoint Method. Journal of the Atmospheric Sciences, 29, 2038-2050. https://doi.org/10.1175/1520-0469(1982)039<2038:SAOARC>2.0.CO;2

Bin Waheed, U., Flagg, G. and Yarman, C.E. (2016) First-Arrival Traveltime Tomography for Anisotropic Media Using the Adjoint-State Method. Geophysics, 81, R147-R155. https://doi.org/10.1190/geo2015-0463.1

Levenberg, K. (1944) A Method for the Solution of Certain Non-Linear Problems in Least-Squares. Quarterly of Applied Mathematics, 2, 164-168. https://doi.org/10.1090/qam/10666

Engl, H. (1993) Regularization Methods for the Stable Solution of Inverse Problems. Surveys in Mathematics for Industry, 3, 71-143.

Menke, W. and Eilon, Z. (2015) Relationship between Data Smoothing and the Regularization of Inverse Problems. Pure and Applied Geophysics, 172, 2711-2726. https://doi.org/10.1007/s00024-015-1059-0

Menke, W. (2018) Geophysical Data Analysis: Discrete Inverse Theory. Fourth Edition, Elsevier, Amsterdam, 350 p.

Lawson, C.L. and Hanson, R.J. (1995) Solving Least Squares Problems. Prentice-Hall, Englewoods Cliffs, New Jersey, 337 p.

Snyman, J.A. and Wilke, D.N. (2018) Practical Mathematical Optimization—Basic Optimization Theory and Gradient-Based Algorithms. Springer Optimization and Its Applications, Second Edition, Springer, New York.

Tromp, J., Tape, C. and Liu, Q. (2005) Seismic Tomography, Adjoint Methods, Time Reversal and Banana-Doughnut Kernels. Geophysical Journal International, 160, 195-216. https://doi.org/10.1111/j.1365-246X.2004.02453.x

Cerveny, V. (2001) Seismic Ray Theory. Cambridge University Press, Cambridge, 722 p.

Aki, K. and Richards, P.G. (2009) Quantitative Seismology. Second Edition, University Science Books, Mill Valley, 742 p.

Colley, S.J. (2012) Vector Calculus. Fourth Edition, Pearson, New York, 603 p.

Phillips, W.S. and Fehler, M.C. (1991) Traveltime Tomography: A Comparison of Popular Methods. Geophysics, 56, 1522-1692. https://doi.org/10.1190/1.1442974

Menke, W. (2005) Case Studies of Seismic Tomography and Earthquake Location in a Regional Context. In: Levander, A. and Nolet, G., Eds., Seismic Earth: Array Analysis of Broadband Seismograms, Geophysical Monograph Series 157, American Geophysical Union, Washington DC, 7-36. https://doi.org/10.1029/157GM02

Lanczos, C. (1961) Linear Differential Operators. Van Nostrand-Reinhold, London, 580 p. ISBN: 978-1-614-27302-8.

Adkins, W.A. and Davidson, M.G. (2012) Ordinary Differential Equations. Springer, New York, 799 p. https://doi.org/10.1007/978-1-4614-3618-8_1