Approximation of an Integral Markov Process Arising in the Approximation of Stochastic Differential Equation

Abstract

We provide the derivation of a new formula for the approximation of an integral Markov process arising in the approximation of stochastic differential equations. This formula extends an existing formula derived in [1]. We show numerically that the differential equation with noise can be approximated to leading order by solving an associated averaged problem and estimating the difference between the two solutions; the result is illustrated through some examples.

Share and Cite:

Rahman, M. (2022) Approximation of an Integral Markov Process Arising in the Approximation of Stochastic Differential Equation. Advances in Pure Mathematics, 12, 29-47. doi: 10.4236/apm.2022.121003.

1. Introduction

Nonlinear ordinary and partial differential equations arise in various fields of science, particularly fluid mechanics, solid state physics, plasma physics, nonlinear optics, and mathematical biology. Powerful numerical methods and their implementations can be found in [2] - [7]. However, nonlinear differential equations (DE) with parametric noise play a significant role in a range of application areas, including engineering, physics, mechanics, epidemiology, and neuroscience. It is important to mention that noisy systems can be modeled in several ways: for example, Langevin’s equation describes a linear physical system to which white noise is added, and the linear theory for it has been extended to nonlinear stochastic differential equations with additive white noise [8]. Another approach is to derive models, such as Markov chains, that treat the system’s state variables as random, and then use the methods of probability for analysis; see [9]. A third approach was derived by averaging nonlinear oscillatory systems. In this approach, parameters in the system are allowed to be random processes, and a method based on averaging and ergodic theory provides useful predictions from the model [10]. Indeed, solutions of differential or dynamical systems are functions, say of time. If they are in addition random, we must describe both randomness and time dependence simultaneously. Thus we refer to the system as either a random process or a stochastic process. A complete understanding of DE theory with perturbing noise requires familiarity with advanced probability and stochastic processes (see [8] [11] ).

In this paper, our approach to modeling randomness in a differential or dynamical system is to allow parameters b in the system to be a random process. For example, in the case of a differential equation we write $\dot{x}(t) = f(t, \omega, x(t), b(y(t)))$, indicating that the parameters can change with some random process $y(t)$. The resulting solution $x(t, b)$ will also be a random process. This approach is based on assuming that the noise process evolves on a faster time scale than the system. This work is dedicated in particular to the case where y is a discrete-space Markov process. Most recent contributions have aimed in general at relaxing the assumptions on y [1] [12]. This result is known as the Functional Central Limit Theorem [13] [14] [15].

In Section 2 we review and discuss the limit of the variance for a discrete-space Markov chain [1]. In Section 3, a new equivalent expression for the limit of the variance is given for a circulant n-state Markov chain. In Section 4, stochastic approximation and numerical simulation are discussed. It is observed that computer simulations of this type of stochastic ordinary differential equation with standard methods have some issues that need to be explored, so that the reader will benefit when solving these types of problems numerically.

2. Discrete-Space Markov Chains

We consider the case where $Y = \{y_1, \dots, y_n\}$, and let

$$\underline{y} = \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix}, \qquad \varphi(\underline{y}) = \begin{bmatrix} \varphi(y_1) \\ \vdots \\ \varphi(y_n) \end{bmatrix}, \qquad \text{and} \qquad \underline{e} = \begin{bmatrix} 1 \\ \vdots \\ 1 \end{bmatrix}.$$

The transition probability $P(\Delta t, y, dy)$ can be represented by an $n \times n$ matrix P such that $P_{i,j} = P(\Delta t, y_i, y_j)$ denotes the probability that $y(\Delta t) = y_j$ if $y(0) = y_i$ (P depends on $\Delta t$, but this dependence is omitted from the notation for simplicity). The matrix P satisfies the following properties.

· P is nonnegative (its entries are probabilities).

· P is stochastic, i.e., $P\underline{e} = \underline{e}$ (one of the $y_j$ must be the outcome of a transition from $y_i$).

Then $\lambda_1 = 1$ is an eigenvalue of P, and $1 \le \rho(P) \le \|P\|_\infty = 1$ shows that all other eigenvalues have modulus at most 1. We shall assume that

· P is irreducible, i.e., any state $y_j$ can eventually be reached in a finite number of steps with a nonzero probability (this is the case if $P > 0$). This implies that $\lambda_1 = 1 = \rho(P)$ has multiplicity one (e.g., see [16] ).

· P is aperiodic (or acyclic, e.g., not a permutation). This implies that $|\lambda_j| < 1$ for $j = 2, \dots, n$ (e.g., see [9]; the chain is then called regular).

The above conditions guarantee the existence of a unique vector $\underline{v} > 0$ such that

$$\underline{v}^T \underline{e} = 1, \tag{2.1}$$

and $\lim_{N \to \infty} P^N = \underline{e}\,\underline{v}^T$. The vector $\underline{v}$ is the unique positive left eigenvector associated to $\lambda_1 = 1$ satisfying (2.1). It is natural to consider the (spectral) decomposition

$$P = \underline{e}\,\underline{v}^T + S, \tag{2.2}$$

where

$$\underline{v}^T S = \underline{0}^T, \qquad S\underline{e} = \underline{0}, \tag{2.3}$$

and $\rho(S) < 1$. From the Chapman-Kolmogorov equation and the homogeneity property (see [1] ) we have

$$P(2\Delta t, y_i, y_j) = \sum_{k=1}^n P(\Delta t, y_i, y_k)\,P(\Delta t, y_k, y_j) = \sum_{k=1}^n P_{i,k} P_{k,j} = \big(P^2\big)_{i,j},$$

and by induction

$$P(N\Delta t, y_i, y_j) = \big(P^N\big)_{i,j} = \big(\underline{e}\,\underline{v}^T + S^N\big)_{i,j} \to \big(\underline{e}\,\underline{v}^T\big)_{i,j} = v_j,$$

as $N \to \infty$, independently of i (i.e., of $y_i$). Thus

$$\underline{v} = \rho_{\Delta t}(\underline{y}) \tag{2.4}$$

defines the limit distribution. An explicit expression for the coefficients of $\underline{v}$ in terms of the coefficients of P can be found in [17, p. 21].

For the expected value this yields

$$E_{y(0)=y_i}\big[\varphi(y(N\Delta t))\big] = \sum_{j=1}^n \varphi(y_j)\,P(N\Delta t, y_i, y_j) = \sum_{j=1}^n \big(P^N\big)_{i,j}\varphi(y_j) = \big(P^N\varphi(\underline{y})\big)_i, \tag{2.5}$$

i.e.,

$$E_{y(0)=\underline{y}}\big[\varphi(y(N\Delta t))\big] = \begin{bmatrix} E_{y(0)=y_1}\big[\varphi(y(N\Delta t))\big] \\ \vdots \\ E_{y(0)=y_n}\big[\varphi(y(N\Delta t))\big] \end{bmatrix} = P^N\varphi(\underline{y}).$$

Then the zero average condition on $\varphi$ becomes

$$0 = \int_Y \varphi(y)\,\rho(dy) = \sum_{j=1}^n \varphi(y_j)\rho(y_j) = \rho(\underline{y})^T\varphi(\underline{y}) = \underline{v}^T\varphi(\underline{y}). \tag{2.6}$$

The relation (2.6), together with (2.2) and (2.3), implies $P^N\varphi(\underline{y}) = S^N\varphi(\underline{y})$. Therefore

$$R_{\Delta t,1}\varphi(\underline{y}) = \Delta t \sum_{N=1}^{\infty} E_{y(0)=\underline{y}}\big[\varphi(y(N\Delta t))\big] = \Delta t \sum_{N=1}^{\infty} P^N\varphi(\underline{y}) \tag{2.7}$$

$$= \Delta t \sum_{N=1}^{\infty} S^N\varphi(\underline{y}) \tag{2.8}$$

$$= \Delta t\,(I - S)^{-1}S\,\varphi(\underline{y}), \tag{2.9}$$

where $\frac{1}{\varepsilon}R_{\Delta t,1}\varphi(\underline{y})$ can be interpreted as the expected value of the random variable obtained after the transition probability is applied to the random variable y (for more detail, see [1] ). Note that (2.7) implies

$$R_{\Delta t,1}\varphi(\underline{y}) - P\,R_{\Delta t,1}\varphi(\underline{y}) = \Delta t\,P\varphi(\underline{y}).$$

Because of (2.3) we also have

$$\underline{v}^T R_{\Delta t,1}\varphi(\underline{y}) = 0. \tag{2.10}$$

With $V = \operatorname{diag}(\underline{v})$ we obtain

$$\begin{aligned} \sigma_{\Delta t}^2 &= 2\sum_{j=1}^n \varphi(y_j)\rho(y_j)\big(R_{\Delta t,1}\varphi(\underline{y})\big)_j + \Delta t\sum_{j=1}^n \varphi^2(y_j)\rho(y_j) \\ &= 2\Delta t\,\varphi(\underline{y})^T V (I-S)^{-1}S\,\varphi(\underline{y}) + \Delta t\,\varphi(\underline{y})^T V\varphi(\underline{y}) \\ &= \Delta t\,\varphi(\underline{y})^T V (I-S)^{-1}(I+S)\,\varphi(\underline{y}), \end{aligned} \tag{2.11}$$

as $t \to \infty$. The expression (2.11) represents the limit of a variance and is thus expected to be nonnegative.
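The matrix expression (2.11) is straightforward to evaluate numerically. The following sketch (assuming NumPy; the function name `limit_variance` is ours, not from [1]) computes $\sigma_{\Delta t}^2$ from a transition matrix P, a zero-average $\varphi$, and $\Delta t$, and checks it against the closed form (2.14) obtained below for the two-state chain (2.12).

```python
import numpy as np

def limit_variance(P, phi, dt):
    """Evaluate (2.11): sigma^2 = dt * phi^T V (I - S)^{-1} (I + S) phi,
    where v is the stationary distribution of P, S = P - e v^T, V = diag(v)."""
    n = P.shape[0]
    # stationary distribution: left eigenvector of P for the eigenvalue 1
    w, U = np.linalg.eig(P.T)
    v = np.real(U[:, np.argmin(np.abs(w - 1.0))])
    v = v / v.sum()
    S = P - np.outer(np.ones(n), v)          # spectral part (2.2), rho(S) < 1
    M = dt * np.diag(v) @ np.linalg.solve(np.eye(n) - S, np.eye(n) + S)
    return float(phi @ M @ phi)

# two-state chain (2.12), with phi satisfying the zero-average condition (2.6)
a, b, dt = 0.3, 0.5, 0.1
P = np.array([[1 - a*dt, a*dt],
              [b*dt,     1 - b*dt]])
phi = np.array([1.0, -b/a])                   # v^T phi = 0
sigma2 = limit_variance(P, phi, dt)
```

Since $\rho(S) < 1$, the matrix $I - S$ in the solve is always nonsingular.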

Two-State Markov Chain

Let n = 2 and

$$P = P(\Delta t) = I + \Delta t\,Q, \qquad Q = \begin{bmatrix} -a & a \\ b & -b \end{bmatrix}, \tag{2.12}$$

with $0 < a, b < 1$ and $0 < \Delta t < \min\!\big(\tfrac{1}{a}, \tfrac{1}{b}\big) \le \tfrac{2}{a+b}$. Then (2.2) holds with

$$\underline{v} = \frac{1}{a+b}\begin{bmatrix} b \\ a \end{bmatrix} \qquad \text{and} \qquad S = \frac{1 - \Delta t(a+b)}{a+b}\begin{bmatrix} a & -a \\ -b & b \end{bmatrix}.$$

As a result $V = \frac{1}{a+b}\begin{bmatrix} b & 0 \\ 0 & a \end{bmatrix}$ and

$$\Delta t\,V(I-S)^{-1}(I+S) = \frac{1}{(a+b)^3}\begin{bmatrix} b\big(2a + \Delta t(b^2 - a^2)\big) & -2ab\big(1 - \Delta t(a+b)\big) \\ -2ab\big(1 - \Delta t(a+b)\big) & a\big(2b + \Delta t(a^2 - b^2)\big) \end{bmatrix} \tag{2.13}$$

is (symmetric) positive definite for any choice $0 < a, b < 1$ and $0 < \Delta t < \frac{2}{a+b}$. A simplified expression for (2.11) is obtained using (2.6), i.e.,

$$\varphi(\underline{y}) = \begin{bmatrix} 1 \\ -\tfrac{b}{a} \end{bmatrix}\varphi(y_1).$$

We obtain

$$\sigma_{\Delta t}^2 = \frac{b\big(2 - \Delta t(a+b)\big)}{a(a+b)}\,\big(\varphi(y_1)\big)^2 \tag{2.14}$$

$$= \frac{ab\big(2 - \Delta t(a+b)\big)}{(a+b)^3}\,\big(\varphi(y_1) - \varphi(y_2)\big)^2.$$
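As a quick sanity check, one can verify numerically that the matrix (2.13) is symmetric positive definite and that the two expressions in (2.14) agree; the sketch below (assuming NumPy; the parameter ranges are ours) does this for randomly drawn parameters.

```python
import numpy as np

# Check positive definiteness of (2.13) and agreement of the two forms
# of (2.14) over random parameters 0 < a, b < 1 and 0 < dt < 2/(a+b).
rng = np.random.default_rng(1)
min_eigs, gaps = [], []
for _ in range(200):
    a, b = rng.uniform(0.05, 0.95, size=2)
    s = a + b
    dt = rng.uniform(0.01, 0.99*2.0/s)
    M = np.array([[b*(2*a + dt*(b*b - a*a)), -2*a*b*(1 - dt*s)],
                  [-2*a*b*(1 - dt*s),        a*(2*b + dt*(a*a - b*b))]]) / s**3
    min_eigs.append(np.linalg.eigvalsh(M).min())
    # (2.14) with phi(y1) = 1, hence phi(y2) = -b/a by (2.6)
    form1 = b*(2 - dt*s)/(a*s)
    form2 = a*b*(2 - dt*s)/s**3 * (1 + b/a)**2
    gaps.append(abs(form1 - form2))
```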

3. Circulant n-State Markov Chain

If P is doubly stochastic (i.e., both P and $P^T$ are stochastic), irreducible, and aperiodic, then

$$\underline{v} = \frac{1}{n}\underline{e},$$

i.e., the limit distribution is uniform, and

$$V = \frac{1}{n}I. \tag{3.16}$$

Stochastic Toeplitz (hence circulant) matrices constitute an example of doubly stochastic matrices. Let

$$P = \begin{bmatrix} a_1 & a_2 & \cdots & a_n \\ a_n & a_1 & \cdots & a_{n-1} \\ \vdots & & \ddots & \vdots \\ a_2 & a_3 & \cdots & a_1 \end{bmatrix},$$

with $a_i \ge 0$ and $\sum_{j=1}^n a_j = 1$. A sufficient condition for P to be irreducible and aperiodic is that two consecutive $a_j$ be nonzero (i.e., positive; see e.g. [18, p. 5]). The symbol of P is the polynomial

$$p(z) = \sum_{j=1}^n a_j z^{j-1}.$$

Since

$$P\begin{bmatrix} 1 \\ z \\ \vdots \\ z^{n-1} \end{bmatrix} = p(z)\begin{bmatrix} 1 \\ z \\ \vdots \\ z^{n-1} \end{bmatrix}$$

for $z^n = 1$, P admits the spectral decomposition

$$P = V\Lambda V^{-1} = V\Lambda V^H,$$

with

$$V = \frac{1}{\sqrt{n}}\begin{bmatrix} 1 & 1 & \cdots & 1 \\ 1 & z_1 & \cdots & z_{n-1} \\ \vdots & \vdots & & \vdots \\ 1 & z_1^{n-1} & \cdots & z_{n-1}^{n-1} \end{bmatrix} = \begin{bmatrix} \frac{1}{\sqrt{n}}\underline{e} & \tilde V \end{bmatrix}, \tag{3.17}$$

$$\Lambda = \begin{bmatrix} 1 & & & \\ & p(z_1) & & \\ & & \ddots & \\ & & & p(z_{n-1}) \end{bmatrix} = \begin{bmatrix} 1 & \\ & \tilde\Lambda \end{bmatrix}, \tag{3.18}$$

where $z_j = e^{2i\pi j/n} = z_1^j$. Multiplication of a vector by the matrix $V^H$ performs a (normalized) discrete Fourier transform, computed by the Fast Fourier Transform (FFT), while multiplication by V performs the inverse (normalized) transform.

We write $P = \frac{1}{n}\underline{e}\,\underline{e}^T + \tilde V\tilde\Lambda\tilde V^H = \frac{1}{n}\underline{e}\,\underline{e}^T + \tilde P$, with $\tilde P = \tilde V\tilde\Lambda\tilde V^H$. Since $|p(z_k)| < 1$ for $k = 1, \dots, n-1$, the matrix $I - \tilde\Lambda$ is nonsingular. The matrix $\tilde V\tilde V^H$ represents the (orthogonal) projection onto $\operatorname{Span}\{\underline{v}\}^{\perp} = \operatorname{Span}\{\underline{e}\}^{\perp}$. Hence $\underline{v}^T\varphi(\underline{y}) = 0$ implies $\varphi(\underline{y}) = \tilde V\tilde V^H\varphi(\underline{y})$. Then

$$\begin{aligned} (I + \tilde P)\varphi(\underline{y}) &= (I + \tilde V\tilde\Lambda\tilde V^H)\tilde V\tilde V^H\varphi(\underline{y}) = \tilde V(I + \tilde\Lambda)\tilde V^H\varphi(\underline{y}) \\ &= \tilde V(I - \tilde\Lambda)(I - \tilde\Lambda)^{-1}(I + \tilde\Lambda)\tilde V^H\varphi(\underline{y}) \\ &= (I - \tilde P)\tilde V(I - \tilde\Lambda)^{-1}(I + \tilde\Lambda)\tilde V^H\varphi(\underline{y}). \end{aligned} \tag{3.19}$$

Now

$$\big(\tilde V^H\varphi(\underline{y})\big)_k = \frac{1}{\sqrt{n}}\begin{bmatrix} 1 & \bar z_k & \cdots & \bar z_k^{\,n-1} \end{bmatrix}\begin{bmatrix} \varphi(y_1) \\ \varphi(y_2) \\ \vdots \\ \varphi(y_n) \end{bmatrix} = \frac{1}{\sqrt{n}}\sum_{j=1}^n \varphi(y_j)\,\bar z_k^{\,j-1} = \overline{\Phi(z_k)},$$

for $k = 1, \dots, n-1$ ($\varphi(y)$ is real), with

$$\Phi(z) = \frac{1}{\sqrt{n}}\sum_{j=1}^n \varphi(y_j)\,z^{j-1}. \tag{3.20}$$

From (3.19) and (3.16) we obtain

$$\sigma_{\Delta t}^2 = \frac{\Delta t}{n}\big(\tilde V^H\varphi(\underline{y})\big)^H(I - \tilde\Lambda)^{-1}(I + \tilde\Lambda)\big(\tilde V^H\varphi(\underline{y})\big)$$

$$= \frac{\Delta t}{n}\sum_{k=1}^{n-1}\frac{1 + p(z_k)}{1 - p(z_k)}\,\big|\Phi(z_k)\big|^2 \tag{3.21}$$

$$= \frac{\Delta t}{n}\sum_{k=1}^{n-1}\Re\!\left(\frac{1 + p(z_k)}{1 - p(z_k)}\right)\big|\Phi(z_k)\big|^2 \tag{3.22}$$

$$= \frac{\Delta t}{n}\sum_{k=1}^{n-1}\frac{1 - |p(z_k)|^2}{|1 - p(z_k)|^2}\,\big|\Phi(z_k)\big|^2, \tag{3.23}$$

since $\sigma_{\Delta t}^2$ is real and $\overline{p(z_k)} = p(\bar z_k)$ for $k = 1, \dots, n-1$. The condition $|p(z_k)| < 1$ for $k = 1, \dots, n-1$ clearly shows that (3.23) is nonnegative.
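Formula (3.23) can be evaluated with a single FFT. The sketch below (assuming NumPy; the data are ours) compares the FFT evaluation of (3.23) with the dense matrix form (2.11), where $V = \frac{1}{n}I$ and $S = P - \frac{1}{n}\underline{e}\,\underline{e}^T$, for a circulant chain.

```python
import numpy as np

n, dt = 8, 0.1
a = np.array([0.4, 0.3, 0.1, 0.05, 0.05, 0.04, 0.03, 0.03])   # a_1..a_n, sum 1
phi = np.array([1.0, -2.0, 3.0, -1.0, 0.5, -0.5, -2.0, 1.0])  # zero average

# FFT route: the sum (3.23) is invariant under conjugation of p and Phi,
# so NumPy's FFT sign convention does not matter here.
p = np.fft.fft(a)                       # symbol values p(z_k) (conjugated)
Phi = np.fft.fft(phi) / np.sqrt(n)      # Phi(z_k) (conjugated), cf. (3.20)
k = np.arange(1, n)
sigma2_fft = dt/n * np.sum((1 - np.abs(p[k])**2) / np.abs(1 - p[k])**2
                           * np.abs(Phi[k])**2)

# dense route via (2.11), with V = I/n and S = P - (1/n) e e^T
P = np.array([np.roll(a, i) for i in range(n)])   # circulant transition matrix
S = P - np.ones((n, n))/n
sigma2_mat = dt/n * phi @ np.linalg.solve(np.eye(n) - S, (np.eye(n) + S) @ phi)
```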

3.1. Example: Uniform Distribution

If $a_j = \frac{1}{n}$, $j = 1, \dots, n$ (i.e., $P = \frac{1}{n}\underline{e}\,\underline{e}^T$), we obtain

$$p(z_k) = \frac{1}{n}\sum_{j=1}^n z_k^{j-1} = \frac{1}{n}\,\frac{1 - z_k^n}{1 - z_k} = 0$$

for $k = 1, \dots, n-1$. Parseval's identity (together with $\Phi(z_0) = \Phi(1) = 0$, by the zero average condition) yields

$$\sum_{k=1}^{n-1}\big|\Phi(z_k)\big|^2 = \sum_{k=0}^{n-1}\big|\Phi(z_k)\big|^2 = \sum_{k=0}^{n-1}\big|\big(V^H\varphi(\underline{y})\big)_k\big|^2 = \varphi(\underline{y})^T V V^H\varphi(\underline{y}) = \varphi(\underline{y})^T\varphi(\underline{y}) = \sum_{j=1}^n\big|\varphi(y_j)\big|^2.$$

Then (3.23) reduces to

$$\sigma_{\Delta t}^2 = \frac{\Delta t}{n}\sum_{k=1}^{n-1}\big|\Phi(z_k)\big|^2 = \frac{\Delta t}{n}\sum_{j=1}^n\big|\varphi(y_j)\big|^2 \to \Delta t\int_Y\varphi^2(y)\,\rho_{\Delta t}(dy), \quad \text{as } n \to \infty.$$
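A short numerical check of the uniform case (assuming NumPy; the data are random draws of ours): when $p(z_k) = 0$ the weights in (3.23) are all 1, so $\sigma_{\Delta t}^2$ reduces to the Parseval sum.

```python
import numpy as np

n, dt = 16, 0.05
rng = np.random.default_rng(2)
phi = rng.standard_normal(n)
phi -= phi.mean()                            # enforce the zero-average condition
Phi = np.fft.fft(phi) / np.sqrt(n)           # Phi(z_k), cf. (3.20); Phi[0] = 0
sigma2 = dt/n * np.sum(np.abs(Phi[1:])**2)   # (3.23) with all weights equal to 1
parseval = dt/n * np.sum(phi**2)             # Parseval reduction above
```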

3.2. Example: Big World Transition Probability

Assume now that the transition probability $y_i \to y_j$, $j \ne i$, is independent of j but distinct from the probability $y_i \to y_i$, i.e.,

$$a_1 = 1 - a\Delta t, \qquad a_2 = \cdots = a_n = \frac{a\Delta t}{n-1},$$

with $0 < a < 1/\Delta t$. Then

$$p(z) = 1 - a\Delta t + \frac{a\Delta t}{n-1}\big(z + \cdots + z^{n-1}\big) = 1 - a\Delta t + \frac{a\Delta t}{n-1}\,z\,\frac{z^{n-1} - 1}{z - 1}$$

for $z \ne 1$. In particular, since $z_k^n = 1$,

$$p(z_k) = 1 - a\Delta t + \frac{a\Delta t}{n-1}\,\frac{z_k^n - z_k}{z_k - 1} = 1 - a\Delta t - \frac{a\Delta t}{n-1} = 1 - \frac{na\Delta t}{n-1},$$

for $k = 1, \dots, n-1$. Therefore, by Parseval again,

$$\sigma_{\Delta t}^2 = \frac{2 - \frac{na\Delta t}{n-1}}{\frac{na}{n-1}}\,\frac{1}{n}\sum_{j=1}^n\big|\varphi(y_j)\big|^2 \;\to\; \frac{2 - a\Delta t}{a}\int_Y\varphi^2(y)\,\rho_{\Delta t}(dy) \;\text{ as } n \to \infty \;\to\; \frac{2}{a}\int_Y\varphi^2(y)\,\rho_{\Delta t}(dy) \;\text{ as } \Delta t \to 0. \tag{3.24}$$
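The collapse of all nontrivial symbol values to the single number $1 - \frac{na\Delta t}{n-1}$ is easy to confirm numerically (a sketch assuming NumPy; `alpha` stands for the rate a):

```python
import numpy as np

n, alpha, dt = 8, 0.9, 0.1                # alpha plays the role of the rate a
row = np.full(n, alpha*dt/(n - 1))        # a_2 = ... = a_n = a*dt/(n-1)
row[0] = 1 - alpha*dt                     # a_1 = 1 - a*dt
p = np.fft.fft(row)                       # symbol values p(z_k) (conjugated)
expected = 1 - n*alpha*dt/(n - 1)         # all nontrivial values collapse here
```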

3.3. Example: Small World Transition Probability

A case of particular interest in the study of the transmission of a signal around a cyclic biological chain corresponds to

$$a_1 = 1 - a\Delta t, \qquad a_2 = a\Delta t, \qquad a_3 = \cdots = a_n = 0,$$

with $0 < a < 1/\Delta t$. The quantity $a_2$ represents the transition probability from a state $y_j$ to the next state $y_{j+1}$. Then $p(z) = 1 - a\Delta t + a\Delta t\,z$, and

$$\frac{1 - \big|p(e^{iy})\big|^2}{\big|1 - p(e^{iy})\big|^2} = \frac{2a\Delta t(1 - a\Delta t)(1 - \cos y)}{2(a\Delta t)^2(1 - \cos y)} = \frac{1 - a\Delta t}{a\Delta t},$$

for $0 < y < 2\pi$. Therefore

$$\sigma_{\Delta t}^2 = \frac{1 - a\Delta t}{a}\,\frac{1}{n}\sum_{j=1}^n\big|\varphi(y_j)\big|^2 \tag{3.25}$$

$$\to \frac{1 - a\Delta t}{a}\int_Y\varphi^2(y)\,\rho_{\Delta t}(dy) \quad \text{as } n \to \infty$$

$$\to \frac{1}{a}\int_Y\varphi^2(y)\,\rho_{\Delta t}(dy) \quad \text{as } \Delta t \to 0. \tag{3.26}$$

For $n = 2$, (3.25) reduces to $\frac{1 - \Delta t\,a}{a}\,\frac{\varphi(y_1)^2 + \varphi(y_2)^2}{2}$. Note that (3.26) is half of (3.24), as could be expected from a less dispersive signal.
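The constancy of the weight $(1 - a\Delta t)/(a\Delta t)$ on the unit circle can be confirmed numerically (a sketch assuming NumPy; `alpha` stands for the rate a):

```python
import numpy as np

alpha, dt = 0.7, 0.1                       # alpha plays the role of the rate a
theta = np.linspace(0.1, 2*np.pi - 0.1, 50)
z = np.exp(1j*theta)                       # points on the unit circle, z != 1
p = 1 - alpha*dt + alpha*dt*z              # symbol of the small-world chain
ratio = (1 - np.abs(p)**2) / np.abs(1 - p)**2
expected = (1 - alpha*dt)/(alpha*dt)       # the claimed constant value
```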

In the following section our approach to modeling randomness in dynamical systems is to allow parameters in the system to be a random process. This approach will determine when solutions of this type of stochastic problem do or do not persist when the system is perturbed.

4. Mathematical Derivation for Numerical Approximation of Stochastic Differential Equations (SDE)

For simplicity, consider

$$\begin{cases} \dot{x}(t) = f\big(t, \omega, x(t), y\big), \quad y = y(t/\varepsilon), \\ x(0) = x_0. \end{cases} \tag{4.27}$$

The averaged system is defined by the differential equation

$$\begin{cases} \dot{\bar x}(t) = \bar f\big(t, \omega, \bar x(t)\big), \\ \bar x(0) = x_0. \end{cases} \tag{4.28}$$

We want to compute the deviation of the perturbed system from the averaged one. We consider $\tilde x = x - \bar x$; then

$$\dot{\tilde x} = \dot x - \dot{\bar x} = f(t, \omega, \bar x + \tilde x, y) - \bar f(t, \omega, \bar x) \approx f(t, \omega, \bar x, y) + f_x(t, \omega, \bar x, y)\,\tilde x - \bar f(t, \omega, \bar x) \approx \bar f_x(t, \omega, \bar x)\,\tilde x + \big(f(t, \omega, \bar x, y) - \bar f(t, \omega, \bar x)\big).$$

Note that $\tilde x(0) = 0$ and

$$\tilde x(t) = \int_0^t \bar f_x\big(s, \omega, \bar x(s)\big)\,\tilde x(s)\,ds + \int_0^t \Big[f\big(s, \omega, \bar x(s), y(s/\varepsilon)\big) - \bar f\big(s, \omega, \bar x(s)\big)\Big]ds.$$

It follows from a limit theorem for stochastic processes (see [13] [14] [15]) that

$$\frac{1}{\sqrt{\varepsilon}}\int_0^t \Big[f\big(s, \omega, \bar x(s), y(s/\varepsilon)\big) - \bar f\big(s, \omega, \bar x(s)\big)\Big]ds \to N\big(0, \sigma^2(t)\big).$$

A detailed derivation of $\sigma^2(t)$ is shown in Section 2. The stochastic processes

$$\frac{x(t) - \bar x(t)}{\sqrt{\varepsilon}} \to \tilde x(t)$$

converge, in the sense of expected values, to $\tilde x(t)$, the solution of the integral equation

$$\tilde x(t) = \int_0^t \bar f_x\big(s, \omega, \bar x(s)\big)\,\tilde x(s)\,ds + \int_0^t \sigma(s)\,dW(s).$$

The distribution of $x(t)$ is close to the distribution of the stochastic process $\bar x(t) + \sqrt{\varepsilon}\,\tilde x(t)$ in the sense that

$$E\big(x(t)\big) \approx E\big(\bar x(t) + \sqrt{\varepsilon}\,\tilde x(t)\big).$$

Thus $x(t) \approx \bar x(t) + \sqrt{\varepsilon}\,\tilde x(t) + o(\sqrt{\varepsilon})$ as $\varepsilon \to 0$. The nature of this convergence, and the sense in which the error in the expansion of x is small, is that of distributions over some time interval.

4.1. Forward Euler Scheme for SDE

The Euler method approximates the analytic solution of the IVP

$$\begin{cases} \dot x(t) = f\big(t, x, y(t/\varepsilon)\big), \\ x(0) = x_0. \end{cases} \tag{4.29}$$

We can derive an entire family of discrete numerical methods (including the Euler method) by truncating the Taylor series and utilizing Taylor's Theorem. First, we rewrite the Taylor series expansion in differential form,

$$x(t+h) = x(t) + \frac{h}{1!}x'(t) + \frac{h^2}{2!}x''(t) + \cdots + \frac{h^n}{n!}x^{(n)}(t) + \cdots = x(t) + h\,f\big(t, x, y(t/\varepsilon)\big) + \frac{h^2}{2}\big(f_t + f_x f\big)\big(t, x, y(t/\varepsilon)\big) + \cdots,$$

where the higher-order terms involve total derivatives of f along the solution.

The Euler method is found by truncating the Taylor series after the first derivative, giving

$$x(t+h) = x(t) + h\,f\big(t, x, y(t/\varepsilon)\big) + O(h^2), \tag{4.30}$$

where $O(h^2)$ is the error term, or Taylor's remainder term, which is of order $h^2$. So if we define $x_n = x(t_n)$ and $x_{n+1} = x(t_n + h)$, we obtain

$$x_{n+1} = x_n + h\,f\big(t_n, x_n, y(t_n/\varepsilon)\big),$$

which has a local error term (error at each step) of order $O(h^2)$, giving a global error of order $O(h)$.
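The recursion above can be sketched in a few lines of code (assuming NumPy; the test problem $\dot x = x$ is ours, chosen because its exact solution is known). Halving the step roughly halves the error, consistent with the global order $O(h)$.

```python
import numpy as np

def euler(f, x0, t0, t1, N):
    """Forward Euler: x_{n+1} = x_n + h f(t_n, x_n) with h = (t1 - t0)/N."""
    h = (t1 - t0)/N
    x = x0
    for n in range(N):
        x = x + h*f(t0 + n*h, x)
    return x

# test problem x' = x, x(0) = 1: exact solution at t = 1 is e
err_N  = abs(euler(lambda t, x: x, 1.0, 0.0, 1.0, 200) - np.e)
err_2N = abs(euler(lambda t, x: x, 1.0, 0.0, 1.0, 400) - np.e)
```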

4.2. Discrete Approximation for SDE

Numerical schemes for solving SDE can be classified as either explicit or implicit methods. Explicit methods compute approximations that depend on previous approximations only, whereas implicit methods compute approximations that depend on previous and current approximations. The Euler method presented earlier is also known as the explicit Euler method. The explicit Euler method is defined as

$$x_{i+1} = x_i + h\,f\big(t_i, x_i, y(t_i/\varepsilon)\big), \tag{4.31}$$

where $x_i \approx x(t_i)$ and $x_{i+1} \approx x(t_i + h)$. Here h is the step size. This method has a global error $O(h)$. The implicit method is defined as

$$x_{i+1} = x_i + h\,f\big(t_{i+1}, x_{i+1}, y(t_{i+1}/\varepsilon)\big), \tag{4.32}$$

where again $x_i \approx x(t_i)$ and $x_{i+1} \approx x(t_i + h)$. This method also has a global error $O(h)$.
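For the linear test problem $\dot x = \lambda x$ the implicit update can be solved for $x_{i+1}$ in closed form, which makes the contrast between (4.31) and (4.32) easy to sketch (assuming NumPy; the parameter values are ours):

```python
import numpy as np

lam, x0, T, N = -2.0, 1.0, 1.0, 1000
h = T/N
x_exp = x_imp = x0
for _ in range(N):
    x_exp = x_exp + h*lam*x_exp       # explicit Euler step (4.31)
    x_imp = x_imp / (1 - h*lam)       # implicit Euler step (4.32), solved for x_{i+1}
exact = x0*np.exp(lam*T)
```

Both iterates converge to the exact value $e^{\lambda T}$ at first order as h decreases; the implicit form is preferred when stability, not accuracy, limits the step size.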

4.3. Example 1

We write the algorithm for the following equation, which will be used as a test equation since it has an analytic solution:

$$\begin{cases} \dot x(t) = y(t/\varepsilon), \\ x(0) = 0, \end{cases} \tag{4.33}$$

over $[0,1]$, and then $x(t+h) = x(t) + h\,y(t/\varepsilon) + O(h^2)$. Now,

1) choose a step $h = (1-0)/N$ and set $t_n = 0 + nh$, $n = 0{:}N$;

2) generate the approximation $x_n$ from the following recursion:

$$x_n = 0 + h\sum_{k=0}^{n-1} y(kh/\varepsilon) = hn\bar y + h\sum_{l=0}^{h(n-1)/\varepsilon}\big(y(l) - \bar y\big) = hn\bar y + H\sqrt{\varepsilon}\,\frac{1}{\sqrt{1/\varepsilon}}\sum_{l=0}^{h(n-1)/\varepsilon}\big(y(l) - \bar y\big),$$

for $n = 0{:}N-1$.

As $\varepsilon \to 0$,

$$x_n \approx hn\bar y + H\sqrt{\varepsilon}\,W(hn) + \text{Error},$$

where $h = H\varepsilon$ in our numerical simulation and W is realized by a normally distributed random number whose expectation and variance are 0 and 1, respectively. A computer simulation of the distribution of the solution $x(1,\varepsilon)$ using Euler without max step size, ode15s without max step size, and ode15s with max step size is given in Figure 1.
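The recursion of Example 1 can be sketched as follows (assuming NumPy; the sample count is ours). We take for y the two-state chain with the transition matrix P and limiting distribution $\rho = \{1/3, 2/3\}$ used in Examples 2-4, so that $\bar y = 2/3 \approx 0.67$; the sample mean of $x(1)$ should then be close to $\bar y$.

```python
import numpy as np

rng = np.random.default_rng(0)
P = np.array([[0.8, 0.2],
              [0.1, 0.9]])
eps, T = 0.01, 1.0
steps = int(T/eps)                    # one chain transition per Euler step, h = eps

def one_path():
    y = 0 if rng.random() < 1/3 else 1               # start in the limiting distribution
    x = 0.0
    for _ in range(steps):
        x += eps * y                                 # Euler step for x' = y(t/eps)
        y = y if rng.random() < P[y, y] else 1 - y   # two-state transition
    return x

xs = np.array([one_path() for _ in range(500)])
mean_x = xs.mean()                    # should be close to T * y_bar = 2/3
```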

Figure 1. Horizontal bar plot: x vs. bins. This figure shows the distribution of $x(1,\varepsilon)$, represented on the vertical axis, of 500 sample paths versus bins on the horizontal axis at the final time t = 1 for the system (4.33). In this case, $x(0) = 0$, $\bar y = 0.67$, $0 \le t \le 1$, $H = 0.1$, and $\varepsilon = 0.01$. The left figure shows the distribution of $x(1,\varepsilon)$ using Euler without max step size. The middle figure shows the distribution of $x(1,\varepsilon)$ using the ode15s solver without max step size. The right figure shows the distribution of $x(1,\varepsilon)$ using ode15s with max step size.

4.4. Example 2

We consider a stochastic process $x_\varepsilon(t)$ defined by the differential equation

$$\begin{cases} \dot x(t) = t\,x\,\big(y(t/\varepsilon) - 0.5\big) \equiv f(t, \varepsilon, x, y), \\ x(0) = 1, \end{cases} \tag{4.34}$$

with

$$Y = \{0, 1\}, \qquad P = \begin{pmatrix} 0.8 & 0.2 \\ 0.1 & 0.9 \end{pmatrix}, \qquad Q = \begin{pmatrix} -0.2 & 0.2 \\ 0.1 & -0.1 \end{pmatrix},$$

and limiting distribution

$$\rho = \{1/3, 2/3\}.$$

We want to solve (4.34) numerically over the time interval $[0,1]$ by

1) forward Euler with the smaller time scale;

2) evaluating $\bar x + \sqrt{\varepsilon}\,\tilde x$, where

$$\dot{\bar x} = t\,\bar x\left(\frac{1}{3}y_1 + \frac{2}{3}y_2 - 0.5\right) \equiv \bar f(t, \varepsilon, \bar x), \tag{4.35}$$

and

$$d\tilde x = \bar f_x(t, \varepsilon, \bar x)\,\tilde x\,dt + \sigma(t)\,dW; \tag{4.36}$$

3) comparing 2) with 1), where $\bar x(t)$ and $\tilde x(t)$ are the solutions of (4.35) (using the larger time scale) and (4.36), respectively, and $\sigma(t)$ is computed using (2.14) along with $\rho$, f, $\bar f$, and $dW = \sqrt{dt}\,N(0,1)$. Comparison of the averaged system with the perturbed system, as well as error convergence, is illustrated in Figure 2 and Figure 3.
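A minimal sketch of this comparison (assuming NumPy; step sizes and sample counts are ours): simulate (4.34) by forward Euler on the fast scale and compare the sample mean of $x(1)$ with the averaged solution, which here is available in closed form, $\bar x(t) = e^{t^2/12}$, since $\bar f = t\,\bar x\,(\bar y - 0.5) = t\,\bar x/6$.

```python
import numpy as np

rng = np.random.default_rng(3)
P = np.array([[0.8, 0.2],
              [0.1, 0.9]])
eps, T = 0.01, 1.0
steps = int(T/eps)                    # Euler step h = eps, one chain move per step

def one_path():
    y = 0 if rng.random() < 1/3 else 1        # start in the limiting distribution
    x, t = 1.0, 0.0
    for _ in range(steps):
        x += eps * t * x * (y - 0.5)          # Euler step for (4.34)
        t += eps
        y = y if rng.random() < P[y, y] else 1 - y
    return x

xs = np.array([one_path() for _ in range(500)])
x_bar_1 = np.exp(1/12)                # averaged solution at t = 1
```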

4.5. Example 3

We consider a stochastic process $x_\varepsilon(t)$ defined by the differential equation

$$\begin{cases} \dot x(t) = 1/2 + \cos\big(y(t/\varepsilon)\big)\cos(x) \equiv f(t, \varepsilon, x, y), \\ x(0) = 1, \end{cases} \tag{4.37}$$

with

Y = { 0,1 } ,

Figure 2. Left figure: x and $\bar x$ vs. t. Comparison of the averaged system to the perturbed system. The red curve shows the solution of the averaged system; the other curves are solutions of the perturbed system with different $\varepsilon$. Right figure: horizontal bar plot, where the vertical axis represents the distribution of solutions for 500 sample paths of the system (4.34) at the final time t = 1. In this case, $\varepsilon = 0.01$, $0 \le t \le 1$, and y is generated by a two-state Markov process with $\bar y = 0.67$.

Figure 3. Error plots: $E|x(t) - (\bar x(t) + \sqrt{\varepsilon}\,\tilde x(t))|$ on the vertical axis vs. $\varepsilon$ on the horizontal axis. The dashed line has the reference slope 1/2. Graphs are drawn on a log-log scale.

$$P = \begin{pmatrix} 0.8 & 0.2 \\ 0.1 & 0.9 \end{pmatrix}, \qquad Q = \begin{pmatrix} -0.2 & 0.2 \\ 0.1 & -0.1 \end{pmatrix},$$

and limiting distribution

$$\rho = \{1/3, 2/3\}.$$

We want to solve (4.37) numerically over the time interval $[0,1]$ by

1) forward Euler with the smaller time scale;

2) evaluating $\bar x + \sqrt{\varepsilon}\,\tilde x$, where

$$\dot{\bar x} = 1/2 + \left(\frac{1}{3}\cos 0 + \frac{2}{3}\cos 1\right)\cos(\bar x) \equiv \bar f(t, \varepsilon, \bar x), \tag{4.38}$$

and

$$d\tilde x = \bar f_x(t, \varepsilon, \bar x)\,\tilde x\,dt + \sigma(t)\,dW; \tag{4.39}$$

3) comparing 2) with 1), where $\bar x(t)$ and $\tilde x(t)$ are the solutions of (4.38) (using the larger time scale) and (4.39), respectively, and $\sigma(t)$ is computed using (2.14) along with $\rho$, f, $\bar f$, and $dW = \sqrt{dt}\,N(0,1)$. Comparison of the averaged system with the perturbed system, as well as error convergence, is illustrated in Figure 4 and Figure 5.

4.6. Example 4

We consider a stochastic process x ε ( t ) is defined by the differential equation

Figure 4. Left figure: x and $\bar x$ vs. t. Comparison of the averaged system to the perturbed system. The red curve shows the solution of the averaged system; the other curves are solutions of the perturbed system with different $\varepsilon$. Right figure: horizontal bar plot, where the vertical axis represents the distribution of solutions for 500 sample paths of the system (4.37) at the final time t = 1. Here $\varepsilon = 0.01$, $0 \le t \le 1$, and y is generated by a two-state Markov process with $\bar y = 0.67$.

Figure 5. Error plots: $E|x(t) - (\bar x(t) + \sqrt{\varepsilon}\,\tilde x(t))|$ on the vertical axis vs. $\varepsilon$ on the horizontal axis. The dashed line has slope 1/2. Graphs are drawn on a log-log scale.

$$\begin{cases} \dot x(t) = 1 + y(t/\varepsilon)\big(3 + y(t/\varepsilon)\big)\cos(x) \equiv f(t, \varepsilon, x, y), \\ x(0) = 1, \end{cases} \tag{4.40}$$

with

$$Y = \{0, 1\}, \qquad P = \begin{pmatrix} 0.8 & 0.2 \\ 0.1 & 0.9 \end{pmatrix}, \qquad Q = \begin{pmatrix} -0.2 & 0.2 \\ 0.1 & -0.1 \end{pmatrix},$$

and limiting distribution

$$\rho = \{1/3, 2/3\}.$$

We want to solve (4.40) numerically over the time interval $[0,1]$ by

1) forward Euler with the smaller time scale;

2) evaluating $\bar x + \sqrt{\varepsilon}\,\tilde x$, where

$$\dot{\bar x} = 1 + \left(\frac{1}{3}\,y_1(3 + y_1) + \frac{2}{3}\,y_2(3 + y_2)\right)\cos(\bar x) \equiv \bar f(t, \varepsilon, \bar x), \tag{4.41}$$

and

$$d\tilde x = \bar f_x(t, \varepsilon, \bar x)\,\tilde x\,dt + \sigma(t)\,dW; \tag{4.42}$$

3) comparing 2) with 1), where $\bar x(t)$ and $\tilde x(t)$ are the solutions of (4.41) (using the larger time scale) and (4.42), respectively, and $\sigma(t)$ is computed using (2.14) along with $\rho$, f, $\bar f$, and $dW = \sqrt{dt}\,N(0,1)$. Comparison of the averaged system with the perturbed system, as well as error convergence, is illustrated in Figure 6 and Figure 7.

4.7. Strong Convergence

In the examples above, the directly simulated solution $x_\varepsilon(t)$, computed with the smaller time scale, matches the solution $\bar x(t) + \sqrt{\varepsilon}\,\tilde x(t)$ more and more closely as $\varepsilon$ is decreased (here $\bar x(t)$ is solved using the larger time scale and $\tilde x(t)$ is the solution of $d\tilde x = \bar f_x(t, y, \bar x)\,\tilde x\,dt + \sigma(t)\,dW$, using Euler with the larger time scale H); the convergence seems to take place. Using $E|x_\varepsilon(t) - (\bar x(t) + \sqrt{\varepsilon}\,\tilde x(t))|$, where E denotes the expected value, leads to the concept of strong convergence. A method is said to have strong order of convergence equal to m if there exists a constant K such that

$$E\big|x_\varepsilon(t) - \big(\bar x(t) + \sqrt{\varepsilon}\,\tilde x(t)\big)\big| \le K\varepsilon^m, \tag{4.43}$$

for $\varepsilon$ sufficiently small. It can be shown that the perturbation method has strong order of convergence $m = 1$. In our numerical tests we focus on the error at the end point $t = t_{final}$, so let

$$\text{Error}_{strong} = E\big|x_\varepsilon(t_{final}) - \big(\bar x(t_{final}) + \sqrt{\varepsilon}\,\tilde x(t_{final})\big)\big|. \tag{4.44}$$

If the bound in (4.43) holds with $m = 1$ at any fixed point in $[0, t_{final}]$, then it certainly holds at the end point, so we have

Figure 6. Left figure: x and $\bar x$ vs. t. Comparison of the averaged system to the perturbed system. The red curve shows the solution of the averaged system; the other curves are solutions of the perturbed system with different $\varepsilon$. Right figure: horizontal bar plot, where the vertical axis represents the distribution of solutions for 500 sample paths of the system (4.40) at the final time t = 1. In this case, $\varepsilon = 0.01$, $0 \le t \le 1$, and y is generated by a two-state Markov process with $\bar y = 0.67$.

Figure 7. Error plots: the vertical axis represents $E|x(t) - (\bar x(t) + \sqrt{\varepsilon}\,\tilde x(t))|$ and the horizontal axis represents $\varepsilon$. The dashed line has the reference slope 1/2. Graphs are drawn on a log-log scale.

$$\text{Error}_{strong} \le K\varepsilon, \tag{4.45}$$

for sufficiently small $\varepsilon$. While measuring the error $\text{Error}_{strong}$, we implicitly assumed that a number of other sources of error are negligible, including the error arising from approximating an expected value by a sample mean, inherent error in the random number generator, and floating point roundoff errors. For a typical computation the sampling error is likely to be the most significant of these three. In preparing the programs for these simulations we found that some experimentation is required to make the number of samples sufficiently large and the time step sufficiently small for the predicted order of convergence to be observable. The sampling error decays like $1/\sqrt{n}$, where n is the number of sample paths used. A study in [19] indicates that, as the step size decreases, the lack of independence in the samples from a random number generator typically degrades the computation before rounding errors become significant.

Although the definition of strong convergence involves an expected value, it has implications for individual simulations. The Markov inequality says that if a random variable $\xi$ has a finite expected value, then for any $a > 0$ the probability that $|\xi| \ge a$ is bounded above by $E|\xi|/a$, that is,

$$P(|\xi| \ge a) \le \frac{E|\xi|}{a}.$$

Hence, taking $a = \varepsilon^{1/2}$, we see that the perturbation method's strong convergence of order $m = 1$ gives

$$P\big(\big|x_\varepsilon(t) - \big(\bar x(t) + \sqrt{\varepsilon}\,\tilde x(t)\big)\big| \ge \varepsilon^{1/2}\big) \le K\varepsilon^{1/2},$$

or, equivalently,

$$P\big(\big|x_\varepsilon(t) - \big(\bar x(t) + \sqrt{\varepsilon}\,\tilde x(t)\big)\big| < \varepsilon^{1/2}\big) \ge 1 - K\varepsilon^{1/2}.$$

This shows that the error at a fixed point in $[0, t_{final}]$ is small with probability close to 1.

4.8. Conclusion

Stochastic approximations for the process $\dot x(t)$ with parametric noise can be used to analyze aspects of the noise. The result of the analysis is an approximation of the form $x(t) \approx \bar x(t) + \sqrt{\varepsilon}\,\tilde x(t) + o(\sqrt{\varepsilon})$ for the solution $x(t)$ of the system. This can be used to evaluate the impact of parametric noise in a neural network. First, the system can be averaged; the system for $\bar x$ may or may not be analytically solvable, but we develop a numerical method for its solution. Second, the next-order term solves a linear system forced by a Gaussian process whose statistics depend on the nature of the noise in the model.

Our systems are characterized by components which combine very fast and very slow behavior. Such systems require an adaptable step size, as only in certain phases do they require a very small step. It is important to use an integration method that allows efficient step size control. A system is called stiff when, integrated with an explicit algorithm and a local error tolerance $10^{-n}$, the step size of the algorithm is forced down below the value indicated by the local error estimate, due to constraints imposed on it by the limited size of the numerically stable region. Ode15s is a variable-order solver based on the numerical differentiation formulas (NDFs); optionally it uses the backward differentiation formulas (BDFs, also known as Gear's method). Like ode113, ode15s is a multi-step solver. If one suspects that a problem is stiff, or if ode45 fails or is very inefficient, one should try ode15s. But for this slow vs. fast time scale problem, if we do not set a max step size for ode15s, the method does not give the correct result, which is reflected in Figure 1.

Future work will address systems involving noisy neural network and the impact of noise on a node’s information processing capability which is determined by its signal-to-noise ratio which can be estimated by spectral methods.

Acknowledgements

The author would like to thank Prof. Samir K. Bhowmik for his opinion and discussion. The author also thanks the anonymous referees for their careful reading of the manuscript and useful suggestions that improved the paper significantly.

Conflicts of Interest

The author declares no conflict of interest regarding the publication of the paper.

References

[1] Rahman, M. and Welfert, B. (2013) Functional Central Limit Theorem for Markov Processes and Chains. Journal of Probability and Statistical Science (JPSS), 11, 111-127.
[2] Battal Gazi Karakoc, S. and Zeybek, H. (2016) Solitary Wave Solutions of the GRLW Equation Using Septic B Spline Collocation Method. Applied Mathematics and Computation, 289, 159-172.
https://doi.org/10.1016/j.amc.2016.05.021
[3] Zeybek, H. and Battal Gazi Karakoc, S. (2017) Application of the Collocation Method with B-Splines to the GEW Equation. Electronic Transactions on Numerical Analysis, 46, 77-88.
[4] Battal Gazi Karakoc, S., Geyikli, T. and Bashana, A. (2013) A Numerical Solution of the Modified Regularized Long Wave MRLW Equation Using Quartic B Splines. TWMS Journal of Applied and Engineering Mathematics, 3, 231-244.
https://doi.org/10.1186/1687-2770-2013-27
[5] Turgut, A.K. and Battal Gazi Karakoc, S. (2018) A Numerical Technique Based on Collocation Method for Solving Modified Kawahara Equation. Journal of Ocean Engineering and Science, 3, 67-75.
https://doi.org/10.1016/j.joes.2017.12.004
[6] Turgut, A.K., Battal Gazi Karakoc, S. and Triki, H. (2016) Numerical Simulation for Treatment of Dispersive Shallow Water Waves with Rosenau KdV Equation. The European Physical Journal Plus, 131, 1-15.
https://doi.org/10.1140/epjp/i2016-16356-3
[7] Bhowmik, S. and Rahman, M. (2019) Stability and Accuracy Analysis of Theta Scheme for a Convolutional Integro-Differential Equation. Differential Equations and Dynamical Systems, 28, 633-646.
https://doi.org/10.1007/s12591-019-00476-w
[8] Kloeden, P.E. and Platen, E. (1999) Numerical Solution of Stochastic Differential Equations, Applications of Mathematics. Vol. 23, Corrected Third Printing, Springer-Verlag, Berlin.
[9] Karlin, S. and Taylor, H.W. (1975) A First Course in Stochastic Processes. Academic Press, New York.
https://doi.org/10.1016/B978-0-08-057041-9.50005-2
[10] Gikhman, I.I. (1973) Differential Equations with Random Functions. AMS Transl. 12.
[11] Borodin, A.N. and Salminen, P. (2002) Handbook Brownian Motion-Facts and Formulae. 2nd Edition, Probability and Its Applications, Birkhäuser.
https://doi.org/10.1007/978-3-0348-8163-0
[12] Rahman, M. (2018) Asymptotic Estimate of Variance with Applications to Stochastic Differential Equations Arises in Mathematical Neuroscience. Communications in Statistics—Theory and Methods, 27, 289-306.
https://doi.org/10.1080/03610926.2017.1303729
[13] Skorokhod, A.V. (2000) On Randomly Perturbed Linear Oscillating Mechanical Systems. Ukrainian Mathematical Journal, 52, 1483-1495.
https://doi.org/10.1023/A:1010392421925
[14] Skorokhod, A.V., Hoppensteadt, F.C. and Salehi, H. (2002) Random Perturbations Methods with Applications in Science and Engineering, Applied Mathematical Sciences. Vol. 150, Springer-Verlag, New York.
https://doi.org/10.1007/b98905
[15] Bhattacharya, R.N. (1982) Functional Central Limit Theorem for Markov Processes. Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete, 60, 185-201.
https://doi.org/10.1007/BF00531822
[16] Berman, A. and Plemmons, R.J. (1994) Nonnegative Matrices in the Mathematical Sciences. Classics in Applied Mathematics, SIAM, Philadelphia.
https://doi.org/10.1137/1.9781611971262
[17] Romanovsky, V. (1970) Discrete Markov Chain. Wolters-Noordhoff Publishing, Groningen.
[18] Gordin, M.I. and Lifsic, B.A. (1978) The Central Limit Theorem for Stationary Processes. Soviet Mathematics—Doklady, 19, 392-394.
[19] Komori, Y., Saito, Y. and Mitsui, T. (1994) Some Issues in Discrete Approximate Solution for Stochastic Differential Equations. Computers & Mathematics with Applications, 28, 269-278.
https://doi.org/10.1016/0898-1221(94)00197-9

Copyright © 2024 by authors and Scientific Research Publishing Inc.


This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.