Parameter Identification in Traveling Wave Solutions of a Modified Fisher’s Equation
1. Introduction
Many important processes such as fluid flow, heat and mass transport, and wave motion involve quantities that vary in space and time and can be modeled by partial differential equations (PDEs). Most PDEs, especially nonlinear ones, do not have explicit analytical solutions, and numerical methods must be used to solve them. The inverse problem for PDEs/ODEs is that of determining unknown coefficients, parameters, or functions in the equation from a set of observations or measurements. Normally, the inverse problem is more challenging than the forward problem due to its ill-posed nature, in which small errors in the measurements can lead to large errors in the estimated parameters or functions. Inverse problems arise in many scientific and engineering applications, such as medical imaging, geophysics, non-destructive testing, and material characterization. For example, PDE inverse problems can be used to determine properties of materials from measurements of waves or fields, or to reconstruct images of internal organs from X-rays or CT scans. In this article, we focus on the traveling wave solution of a nonlinear diffusion-reaction equation, namely Fisher’s equation with some modifications. We develop methods for solving the inverse problem, which aims to discover the PDEs that govern a space-time-dependent process and the parameters within those PDEs, using various approaches including machine learning and optimization. Given a numerical solution, the goal is to determine which PDE and what parameters gave rise to that solution.
Fisher’s equation was first introduced in 1930 [1] to describe the spread of advantageous genes. Canosa extended it to depict population growth in 1973 [2]. In Murray’s book [3], the history and various applications of Fisher’s equation are discussed in detail. There has been growing interest in studying the traveling wave solutions of Fisher’s equation. Some researchers have found analytic solutions in specific cases. For example, [4] provided an exact solution with wave speed V = 5/√6; [5] provided the solution of the quadratic Fisher equation in both one and two dimensions; and [6] provided analytic approximate solutions of some nonlinear parabolic dynamical wave equations (including Fisher’s equation) with a modified variational iteration algorithm.
In addition to analytic solutions, many articles focus on numerical solutions of Fisher’s equation. For example, [7] presented the numerical solution of several types of Fisher’s reaction-diffusion equation using cubic trigonometric B-spline functions and the differential quadrature method; [8] used the forward-in-time, central-in-space (FTCS) method; and [9] applied the Atangana-Baleanu fractional derivative with spectral collocation methods.
In this paper, a relaxation effect is introduced in the equation relating flux to density; the linear part of the operator then becomes hyperbolic and yields a finite speed for the propagation of disturbances. This adds a parameter, making the identification problem a bit more challenging. A similar relaxation effect has been introduced into the convection-diffusion equation; for example, [10] solved the unsteady convection-diffusion equation with such a relaxation effect by using a third-order accurate implicit-explicit (IMEX) Runge-Kutta method in time and a fifth-order finite difference WENO scheme in space. [11] provided an analytic approach with fractional operators, including the Liouville-Caputo and Atangana-Baleanu-Caputo operators; and [12] used a system similar to the one discussed in the current paper but with different parameters, providing large-time solutions for a specific nonlinear function on the right-hand side.
Recently, instead of using traditional PDE numerical solvers, researchers have started to solve direct and inverse PDE problems using machine learning algorithms. For example, [13] solved PDEs with deep learning, providing examples such as the Black-Scholes equation and the Hamilton-Jacobi-Bellman equation. [14] introduced a new feed-forward deep network, PDE-Net, to solve PDEs. Additionally, a new deep learning framework called the Physics-Informed Neural Network (PINN) has been used to solve PDEs. For this method, [15] provided the general structure and algorithms, and [16] obtained numerical solutions by using a Bayesian PINN (B-PINN) in both forward and inverse nonlinear settings with noisy data.
As to parameter estimation, [17] proposed two methods (parameter cascading and Bayesian approach) to estimate the parameters in PDEs, [18] estimated the parameters of a laminar flow model with the help of machine learning, and [19] designed a data-driven model that can be used to discover the key parameters in PDEs through time series measurements in the spatial domain.
Fisher’s equation combines the logistic population growth model, du/dt = k u (1 - u/C), where k is the growth rate at low densities and C is the carrying capacity, with a diffusion equation in which the density u(x, t) is a function of space x and time t and satisfies the PDE u_t = D u_{xx}. Upon scaling u with C, time t with 1/k, and x with √(D/k), the dimensionless form of Fisher’s equation becomes u_t = u_{xx} + u(1 - u). By introducing the flux q = -u_x, we can rewrite this scaled reaction-diffusion equation in the form

u_t + q_x = f(u), q = -u_x, (1)

where f(u) = u(1 - u). Here, we have modified the logistic expression for the growth rate f(u) by allowing for higher order polynomials to see their effect on the parameter identification problem.
The second equation in the system above relates the flux q, instantaneously, to the negative of the gradient of the density u, which is common in diffusion models (such as Fick’s law of diffusion). However, in some cases, it may take a finite time for the flux to adjust to the variations in the gradient. To account for this relaxation process, one can introduce a dimensionless relaxation time τ in the flux relation. The new system then reads:

u_t + q_x = f(u), τ q_t + q = -u_x. (2)

By eliminating q, we can also write this system as a single second-order equation in the form:

τ u_{tt} + (1 - τ f'(u)) u_t = u_{xx} + f(u). (3)

In the special case τ = 0 and f(u) = u(1 - u), the equation becomes the original Fisher’s equation, which is a parabolic PDE; when τ > 0, it becomes hyperbolic.
In this paper, we focus on traveling wave solutions of this system. To that end, we take both u and q to depend on the single variable ξ = x - Vt with velocity V in the system (2) and equation (3). The result is a system of coupled first-order ordinary differential equations:

-V u' + q' = f(u), -τ V q' + q = -u', (4)

and a second order ODE:

(1 - τ V^2) u'' + V (1 - τ f'(u)) u' + f(u) = 0, (5)

where ' = d/dξ. These equations are our starting points to analyze the properties of the traveling wave solutions.
In this article, we mostly focus on the ODE system (4) and the second-order ODE (5). In our experiments, we find that larger ranges of ξ are needed to show a significant change in the solution u as the order of nonlinearity increases. For example, if we set two of the nonlinearity parameters to 0.6 and 0.8, respectively, the wave front occurs when ξ is around 750. Most of the time, we restrict our attention to the cubic nonlinearity f(u) = u(1 - u)(1 + bu) with just a single parameter b. As such, our modified Fisher’s equation includes three parameters: b, V, and τ, which we try to identify from a given numerical solution of the system (2) or Equation (5).
The traveling wave fronts that are of interest to us have a u profile that smoothly goes from 1 to 0 (two of the fixed points of the system) as ξ ranges from -∞ to +∞. To justify the existence of such traveling waves, we first carry out a stability analysis of the nonlinear dynamical system for arbitrary values of V, τ, and b. Then, we consider the inverse problems of finding the parameters given some numerical solution. We consider two different cases of the inverse problem: one in which both profiles u(ξ) and q(ξ) are available, and another where only one of these, namely u(ξ), is known. We refer to these as Type 1 and Type 2 inverse problems, respectively. For Type 1 problems, we can determine the parameters V, τ, and b through the use of appropriate optimization problems. For Type 2 problems, we recover the parameters V, τ, and b by means of machine learning methods. In this case, we first use an ensemble method to get a more accurate initial value for the velocity V, which is crucial for enhancing the performance of the optimization problem that determines the remaining two parameters τ and b.
2. Methods
In this section, we focus on Equations (4) and (5). We first take f(u) = u(1 - u) g(u), where g(u) > 0, and u is a proper density field that lies within the interval [0, 1]. Later, we restrict to the case where f is cubic with only one extra parameter b. After the traveling-wave transformation and with the help of a stability analysis in the phase plane, we identify the proper range of parameters V and τ for which traveling waves are obtained, keeping the density between 0 and 1; these turn out to be V ≥ 2 and 0 ≤ τ < 1/V^2. We then generate a large set of numerical solutions using these values to train machine learning algorithms. Next, assuming that some numerical traveling wave solution is given (including both u and q profiles or the u profile alone), we design parameter estimation models to estimate the velocity V, relaxation time τ, and parameter b that gave rise to that numerical solution.
2.1. Stability Analysis
Stability analysis of fixed points in the phase plane is a powerful approach for qualitative understanding of the behavior of a dynamical system. The equilibrium points can typically be classified as linearly stable, unstable, or semi-stable. Although these classifications are local, it is often possible to infer qualitatively the global trajectories in the phase plane through further analysis. For instance, a trajectory connecting two separate fixed points can represent a solution that goes from one constant value to another. However, because of the complex dynamics that can arise in the presence of multiple fixed points and nonlinearities, linear stability analysis, while a valuable tool, may not be suitable for all systems or provide a full picture of the dynamics.
Equation (4) can be written in matrix-vector notation:

A (u', q')^T = (f(u), -q)^T, A = [ -V, 1 ; 1, -τV ]. (6)

Matrix A is invertible provided that τV^2 is not equal to 1. We linearize the system about a fixed point (u*, q*) by introducing small perturbations û and q̂ through u = u* + û and q = q* + q̂. The linearized system becomes:

A (û', q̂')^T = (f'(u*) û, -q̂)^T.

The eigenvalues of J = A^{-1} diag(f'(u*), -1) determine the stability of the fixed points; to simplify the notation, let F = f'(u*); the two eigenvalues at each fixed point are then given by

λ_{1,2} = [ -V(1 - τF) ± √( V^2 (1 - τF)^2 - 4 F (1 - τV^2) ) ] / [ 2 (1 - τV^2) ]. (7)
The function f(u) has roots at u = 0 and u = 1 in addition to those introduced by the additional factors; thus there are always two equilibrium points at 0 and 1. At these points (u*, q*) = (0, 0) and (u*, q*) = (1, 0). We seek traveling wave solutions in which u varies smoothly from 1 to 0 as ξ goes from -∞ to +∞. As such, (0, 0) must be a stable node (if it were a spiral, u would oscillate as it approached 0 and become negative at times), while (1, 0) can be an unstable node or saddle. The corresponding eigenvalues for these equilibrium points are given by (7) with F = f'(0) and F = f'(1), respectively. When τ = 0, in order for the eigenvalues λ_{1,2} at (0, 0) not to be complex, which would make the fixed point a spiral, V has to be greater than or equal to 2. For any such V, we must keep τ less than 1/V^2 in order to avoid a singularity and change of sign in the eigenvalues. In order for u to have a smooth transition between 1 and 0, the two key parameters in our system, velocity V and relaxation time τ, can be restricted to be within the respective ranges V ≥ 2 and 0 ≤ τ < 1/V^2.
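As a concrete check of these conditions, the eigenvalue formula (7) can be evaluated numerically. The sketch below is illustrative (not the paper's code) and assumes the cubic nonlinearity f(u) = u(1 - u)(1 + bu), so that f'(0) = 1 and f'(1) = -(1 + b):

```python
import numpy as np

def eigenvalues(F, V, tau):
    """Eigenvalues (7) at a fixed point with F = f'(u*); requires tau*V**2 != 1."""
    D = 1.0 - tau * V**2
    disc = (V * (1.0 - tau * F))**2 - 4.0 * F * D
    sq = np.sqrt(complex(disc))
    return ((-V * (1.0 - tau * F) + sq) / (2.0 * D),
            (-V * (1.0 - tau * F) - sq) / (2.0 * D))

def classify(F, V, tau):
    l1, l2 = eigenvalues(F, V, tau)
    if abs(l1.imag) > 1e-12:
        return "spiral"
    if l1.real * l2.real < 0:
        return "saddle"
    return "stable node" if max(l1.real, l2.real) < 0 else "unstable node"

b = 0.5
print(classify(1.0, 3.0, 0.05))          # u* = 0 with V >= 2, tau < 1/V^2
print(classify(-(1.0 + b), 3.0, 0.05))   # u* = 1: f'(1) < 0
print(classify(1.0, 1.0, 0.0))           # u* = 0 with V < 2
```

The first point is a stable node and the second a saddle, as required for a monotone front; dropping V below 2 turns the origin into a spiral.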
2.2. Parameter Estimation for the Cubic Non-Linearity

Using f(u) = u(1 - u)(1 + bu), and applying the constraints for τ and V, the scaled system (4) reduces to:

-V u' + q' = u(1 - u)(1 + bu), -τ V q' + q = -u'. (8)

Assuming that a numerical solution is given over some range, we design several approaches to determine the parameters for two types of inverse problems: Type 1, where the numerical solutions u and q (density and flux) are both given, with u in the range 0.05 ≤ u ≤ 0.95; and Type 2, where only the numerical solution u (density alone) is given in the range 0.05 ≤ u ≤ 0.95. A good approximation of the first and second derivatives is essential for getting high accuracy in the parameter estimates. We use the following high order central difference formulae for this purpose:

u'_i ≈ (-u_{i+2} + 8 u_{i+1} - 8 u_{i-1} + u_{i-2}) / (12 h), u''_i ≈ (-u_{i+2} + 16 u_{i+1} - 30 u_i + 16 u_{i-1} - u_{i-2}) / (12 h^2), (9)

where h is the grid spacing in ξ.
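A short sketch of these standard fourth-order central stencils, applied to a test function (boundary points are omitted for brevity):

```python
import numpy as np

def derivs_4th(u, h):
    """Fourth-order central differences for u' and u'' at the interior
    points (indices 2 .. n-3); boundaries are omitted for simplicity."""
    up = (-u[4:] + 8.0*u[3:-1] - 8.0*u[1:-3] + u[:-4]) / (12.0*h)
    upp = (-u[4:] + 16.0*u[3:-1] - 30.0*u[2:-2] + 16.0*u[1:-3] - u[:-4]) / (12.0*h**2)
    return up, upp

# check on u = sin(x), where u' = cos(x) and u'' = -sin(x)
x = np.linspace(0.0, 2.0*np.pi, 401)
u = np.sin(x)
up, upp = derivs_4th(u, x[1] - x[0])
print(np.max(np.abs(up - np.cos(x[2:-2]))))    # small, O(h^4)
print(np.max(np.abs(upp + np.sin(x[2:-2]))))   # small, O(h^4)
```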
To characterize the performance in our results, we will use the Root Mean Square Error (RMSE), which is defined as:

RMSE = √( (1/N) Σ_{i=1}^{N} (θ_i - θ̂_i)^2 ),

where θ_i is the true value, and θ̂_i is the estimated value of the parameter θ in the i-th test case.
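This is a one-line computation; a trivial helper, included for reproducibility:

```python
import numpy as np

def rmse(true_vals, est_vals):
    """Root mean square error between true and estimated parameter values."""
    true_vals = np.asarray(true_vals, dtype=float)
    est_vals = np.asarray(est_vals, dtype=float)
    return float(np.sqrt(np.mean((true_vals - est_vals)**2)))

print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))  # sqrt(4/3) ≈ 1.1547
```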
2.2.1. Type 1 Inverse Problem
Assume that both numerical solutions u and q, with u in the range 0.05 ≤ u ≤ 0.95, are given. Solving the first equation in (8) for V yields:

V = (q' - u(1 - u)(1 + bu)) / u'.

Since V is supposed to be a constant (independent of ξ), if the estimated b̂ is equal or close to the true value b, there should be almost no variance in V calculated at different ξ locations. In other words, if we calculate the variance in V (thought of as a function of b) as ξ is varied, it should be very close to 0 when b̂ is close to the true value b. Therefore, we generate our first model, which solves the following optimization problem:

b̂ = argmin_b Var[ (q' - u(1 - u)(1 + bu)) / u' ]. (10)

It is hard to see the convexity of (10) directly, but we can transform this problem into the following quadratic form, in which the objective is strongly convex (see the Appendix for details):

b̂ = argmin_b (a + b c)^T D (a + b c),

where a = (q' - u(1 - u)) / u' and c = -u^2 (1 - u) / u' are vectors of pointwise values (so that V = a + b c), D = I - (1/n) 1 1^T, and 1 is a column vector of ones. Here, n is the number of data points in the vector of numerical solution values u.

After obtaining b̂, we can estimate τ and V by the following equations, obtained by averaging the appropriate pointwise values of the parameters along the numerical traveling wave profile:

V̂ = mean[ (q' - u(1 - u)(1 + b̂ u)) / u' ], τ̂ = mean[ (q + u') / (V̂ q') ].
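An end-to-end sketch of this Type 1 procedure on synthetic data: V is written pointwise as a + b·c with a = (q' - u(1-u))/u' and c = -u²(1-u)/u', and the variance of a + b·c is minimized in closed form. The integration setup and parameter values are illustrative, and the τ step below uses a least-squares average rather than a pointwise mean, to avoid dividing by q' near its zero:

```python
import numpy as np
from scipy.integrate import solve_ivp

V_true, tau_true, b_true = 3.0, 0.05, 0.5

def f(u, b):                        # cubic nonlinearity f(u) = u(1-u)(1+b u)
    return u*(1.0 - u)*(1.0 + b*u)

def rhs(xi, y):
    u, q = y
    den = tau_true*V_true**2 - 1.0
    return [(q - tau_true*V_true*f(u, b_true))/den,
            (V_true*q - f(u, b_true))/den]

# start on the unstable manifold of the saddle (u, q) = (1, 0);
# the Jacobian there is [[tau*V*(1+b), 1], [1+b, V]] / (tau*V^2 - 1)
den = tau_true*V_true**2 - 1.0
J = np.array([[tau_true*V_true*(1.0 + b_true), 1.0],
              [1.0 + b_true, V_true]]) / den
w, vecs = np.linalg.eig(J)
v = vecs[:, np.argmax(w.real)].real
if v[0] > 0:
    v = -v                          # step toward decreasing u
sol = solve_ivp(rhs, [0.0, 80.0], [1.0, 0.0] + 1e-6*v,
                rtol=1e-10, atol=1e-12, dense_output=True)

xi = np.linspace(0.0, 80.0, 8001)
u, q = sol.sol(xi)
up = np.gradient(u, xi, edge_order=2)
qp = np.gradient(q, xi, edge_order=2)
m = (u > 0.05) & (u < 0.95)         # keep the front portion only
u, q, up, qp = u[m], q[m], up[m], qp[m]

a = (qp - u*(1.0 - u))/up           # V = a + b*c pointwise
c = -u**2*(1.0 - u)/up
A, C = a - a.mean(), c - c.mean()
b_hat = -(A @ C)/(C @ C)            # closed-form minimizer of Var(a + b*c)
V_hat = float(np.mean(a + b_hat*c))
tau_hat = float(((q + up) @ qp)/(V_hat*(qp @ qp)))  # from tau*V*q' = q + u'
print(b_hat, V_hat, tau_hat)
```

On this synthetic front the recovered (b, V, τ) match the true values to a few decimal places, limited mainly by the finite-difference accuracy.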
2.2.2. Type 2 Inverse Problem
Assuming that only a numerical solution of u is given in the range 0.05 ≤ u ≤ 0.95, without the corresponding flux q, we devise an alternative method for estimating the parameters. Such a solution can be generated by solving the second-order ODE (5). In a physical experiment, however, the flux is usually not directly measurable, so it is essential to come up with methods where the only input is a part of the u solution. To do that, we first solve for b from Equation (5), treating V and τ as given. This yields:

b = [ τV u'(1 - 2u) - (1 - τV^2) u'' - V u' - u(1 - u) ] / [ u^2 (1 - u) - τV u'(2u - 3u^2) ].

To simplify the notation, we define four vectors as follows: r_1 = u'', r_2 = u', r_3 = u(1 - u), and r_4 = u^2(1 - u), evaluated pointwise along the profile. Ignoring the vector sign for ease of notation, we can write

b(V, τ) = [ τV r_2 (1 - 2u) - (1 - τV^2) r_1 - V r_2 - r_3 ] / [ r_4 - τV r_2 u(2 - 3u) ].

Next, as in the previous case, we regard this as the following two-variable optimization problem:

(V̂, τ̂) = argmin_{V, τ} Var[ b(V, τ) ]. (11)

It is clear that this objective is not a convex function of (V, τ). However, we found that if V is equal to or around the true velocity, we can transform this problem into a form whose objective G(τ) is strongly convex (see the Appendix for details).
Therefore, if there is a good initial guess for the velocity, the solution of the optimization problem (11) always exists and is unique. We designed an ensemble model to make such predictions. Also, in addition to solving (11), we use an embedding technique to estimate the parameters.
2.2.3. Variable Selection
To begin the design of the ensemble model, the first step is variable selection. In our model, we generate our data set from the u solutions. Since u(ξ) is a traveling front, it can be shifted right or left without changing form, so the origin in ξ is arbitrary. It is therefore better to work with variables that are independent of such shifts.

Figure 1. The process of generating the first three intercept variables; after estimating all ξ_i, we generate the intercept variable set U.

Here is the main process we used to generate our data set from a numerical solution u(ξ), as shown in Figure 1:

1. Define a set of fixed levels along the u-axis, namely u_1, u_2, ..., u_m; the numerical solution is given as pairs (ξ_j, u_j), and the corresponding variable set is the set of locations ξ_1, ξ_2, ..., ξ_m at which the profile attains those levels.

2. Since, most of the time, the numerical solution does not include the exact ξ_i that corresponds to u_i, we estimate it by linear interpolation: the neighbors of u_i, call them u_L and u_R, can be found in the solution set u, and their corresponding ξ values are ξ_L and ξ_R. The linear interpolation estimate of ξ_i is given by ξ_i ≈ ξ_L + (u_i - u_L)(ξ_R - ξ_L)/(u_R - u_L).

3. After estimating all ξ_i, to avoid the “horizontal shift” issue, define the intercept variables as the increments: U = { ξ_2 - ξ_1, ξ_3 - ξ_2, ..., ξ_m - ξ_{m-1} }.

4. Repeat steps 1 to 3 until the intercept variables of all numerical solutions are generated.
Table 1 shows sample input data generated from four intercept variables in U.
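A minimal sketch of steps 1-3; the level values and the model front used here are illustrative:

```python
import numpy as np

def intercept_variables(xi, u, levels):
    """For a monotone decreasing front u(xi), linearly interpolate the xi
    location of each level u_i and return the increments xi_{i+1} - xi_i,
    which are invariant under horizontal shifts of the front."""
    xi_levels = np.interp(levels, u[::-1], xi[::-1])  # np.interp needs ascending u
    return np.diff(xi_levels)

xi = np.linspace(0.0, 10.0, 1001)
front = lambda shift: 1.0/(1.0 + np.exp(xi - shift))  # model front going 1 -> 0
levels = np.array([0.9, 0.7, 0.5, 0.3, 0.1])
U1 = intercept_variables(xi, front(5.0), levels)
U2 = intercept_variables(xi, front(6.0), levels)      # same front, shifted
print(np.allclose(U1, U2, atol=1e-5))                 # prints True
```

Shifting the front changes every ξ_i but leaves the increments unchanged, which is exactly the invariance the intercept variables are designed to provide.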
2.2.4. Ensemble Model with Linear Machine Learning Algorithms
In our approach, prediction of the traveling wave velocity is a regression task; therefore, we propose using existing ensemble techniques along with the following common linear models:
1) Linear regression: This is one of the most common methods, easy to compute and apply. It considers the problem of finding the column vector w in y = Xw, which can be formulated as:

min_w || X w - y ||_2^2,

where X represents the input data, w is the vector of weights, and y is the target variable. If there is a positivity constraint in linear regression, in which all coefficients must be positive except the intercept, we solve the following optimization problem:

min_{w ≥ 0} || X w - y ||_2^2.
2) Ridge regression/Tikhonov regularization: Due to multicollinearity in the data, ridge regression is often used to avoid this issue by imposing a penalty on the size of the coefficients. One reason it works is that it does not require unbiased estimators. This problem can be written as:

min_w || X w - y ||_2^2 + α || w ||_2^2,

where α is a non-negative parameter that controls the degree of shrinkage.

Table 1. A sample variables table containing 4 intercept variables, with velocity V from 2 to 10 and time relaxation τ from 0 to 1/V^2.
3) Lasso regression: “Lasso” stands for Least Absolute Shrinkage and Selection Operator. Similarly to Ridge regression, Lasso regression adds a penalty term to avoid multicollinearity problems, but unlike Ridge regression, it uses the L1 norm rather than the L2 norm. We can express this problem as:

min_w || X w - y ||_2^2 + α || w ||_1,

where X represents the input data, w is the vector of weights, y represents the target variable, and the non-negative parameter α controls the degree of shrinkage.
4) Bayesian ridge regression: This method assumes that the output y comes from a probability distribution, rather than having a single value. We express this as:

p(y | X, w, α) = N(y | X w, α^{-1} I),

where the precision α is a random variable estimated from the data. In practice, one uses gamma distributions, which are the conjugate priors for the precision of a Gaussian distribution (focusing on two parameters, α and λ), rather than the normal distribution directly.
5) Bayesian Automatic Relevance Determination (ARD) regression: This method is similar to Bayesian Ridge Regression but produces sparse weights, dropping the assumption of a spherical Gaussian distribution for the weights. This means that each coefficient w_i can be drawn from a normal distribution with zero mean and its own precision λ_i:

p(w | λ) = N(w | 0, A^{-1}),

where A is a positive definite diagonal matrix with diag(A) = (λ_1, ..., λ_p).
6) Stochastic Gradient Descent (SGD) Regression: This is another popular gradient descent method. However, compared to regular gradient descent, it induces randomness in the process. Due to this characteristic, SGD can randomly pick one data point from the whole set at each iteration, which reduces the computation time dramatically.
The reason we chose these regression methods is that although linear regression is easy to apply, it is sensitive to outliers and is prone to noise and overfitting. Therefore, to reduce the effects of such overfitting, we introduce other regression methods such as Ridge and Lasso which add a penalty term, Bayesian Ridge which provides a confidence bound on predictions, and ARD regression which provides a sparser solution than Bayesian regression. Additionally, we also applied the SGD regression due to its high speed of execution.
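A sketch of such an ensemble using scikit-learn’s implementations of the six models above, averaged with VotingRegressor; the toy data and hyperparameters here are illustrative, not the paper’s:

```python
import numpy as np
from sklearn.ensemble import VotingRegressor
from sklearn.linear_model import (ARDRegression, BayesianRidge, Lasso,
                                  LinearRegression, Ridge, SGDRegressor)
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))        # stand-in for the 4 intercept variables
y = X @ np.array([3.0, -1.0, 0.5, 2.0]) + 6.0 + 0.01*rng.normal(size=200)

ensemble = VotingRegressor([
    ("lr", LinearRegression()),
    ("ridge", Ridge(alpha=1e-2)),
    ("lasso", Lasso(alpha=1e-3)),
    ("bayes", BayesianRidge()),
    ("ard", ARDRegression()),
    ("sgd", make_pipeline(StandardScaler(), SGDRegressor(max_iter=5000, tol=1e-8))),
])
ensemble.fit(X, y)
print(ensemble.score(X, y))          # coefficient of determination R^2
```

VotingRegressor simply averages the member predictions; weighted averaging or cross-validated stacking are natural variants.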
2.2.5. Embedding Method Process
By using the previous ensemble model, the predicted velocity is very close to the true velocity, but its performance is still not good enough for predicting the relaxation time τ and the parameter b. Therefore, instead of using one ensemble model to predict all parameters, we applied an embedding method. Roughly, the process is shown in Figure 2.
There are several advantages to our approach:
1) Low cost: In our model, during the training process, the range of velocities typically falls between 2 and 10, and a good initial guess of the velocity is one with an error less than about 10^{-3} compared to the true velocity. However, designing a data set with velocities in increments of around 10^{-3}, combined with the increments of τ and b, might give rise to more than one hundred thousand different cases, which makes the learning process too long to be practical. Our approach reduces the computational time and cost.
2) Ease of application: Compared to optimization problem solvers, the ensemble model approach is relatively easy to apply. We have already defined the structure of the ensemble model, so we can simply train another one using the embedded data.
2.2.6. Main Process for Type 2 Inverse Problems
In summary, similar to the previous case, we use an ensemble model to predict a good initial value of the traveling wave velocity. After that, we not only solve the optimization problem (11), but also apply the embedding method to predict the parameters. The whole process can be summarized in Figure 3.
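The two-stage embedding idea can be sketched schematically: a coarse model predicts V over the full range, and a second model is then trained only on a narrow velocity band around that guess. Everything below (the stand-in data generator, the band half-width delta, the model choice) is illustrative, not the paper’s pipeline:

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(1)

def make_data(velocities):
    """Stand-in for 'solve the ODE at each velocity and extract intercept
    variables'; this feature map is purely illustrative."""
    V = np.asarray(velocities, dtype=float)
    s = V + 0.05*np.sin(3.0*V)                 # weakly nonlinear in V
    return s[:, None] + 1e-4*rng.normal(size=(V.size, 1)), V

# stage 1: coarse model trained over the full velocity range [2, 10]
X1, y1 = make_data(np.linspace(2.0, 10.0, 400))
coarse = Ridge(alpha=1e-8).fit(X1, y1)

V_true = 4.37
x_obs, _ = make_data([V_true])
V0 = coarse.predict(x_obs)[0]                  # rough initial guess

# stage 2: retrain on a fine grid in the narrow band [V0 - delta, V0 + delta]
delta = 0.2
X2, y2 = make_data(np.linspace(V0 - delta, V0 + delta, 400))
fine = Ridge(alpha=1e-8).fit(X2, y2)
V_hat = fine.predict(x_obs)[0]
print(V0, V_hat)
```

The second model sees only the narrow band, where the feature-velocity relationship is nearly linear, so its prediction refines the coarse guess without ever building the huge fine-grained global training set.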
3. Results
In this section, we present some key numerical solutions of the traveling wave problem, consider it in the phase plane, and characterize the performance of the parameter estimation methods from a given solution. We include both types of inverse problems, where both u and q are available or where only the u profile is given.

Figure 2. A schematic diagram of the embedding process: the original training set covers the velocity range from 2 to 10; after a first prediction V_0, a new training set is generated with velocities from V_0 - δ to V_0 + δ, where δ is a small value.

Figure 3. A schematic representation of the type 2 inverse problem.
3.1. Traveling Wave Solutions
To solve the nonlinear Equation (8), we can use standard ODE solvers such as Python SciPy’s “odeint”. The pair of coupled equations can be written in terms of u and u' (which constitutes the phase plane) or in terms of u and q:

· In terms of density and its derivative v = u':

u' = v, v' = -[ V (1 - τ f'(u)) v + f(u) ] / (1 - τV^2).

· In terms of density and flux when τ = 0:

u' = -q, q' = f(u) - Vq.

· In terms of density and flux when τ ≠ 0 (with τV^2 ≠ 1):

u' = (q - τV f(u)) / (τV^2 - 1), q' = (Vq - f(u)) / (τV^2 - 1).
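A minimal sketch of the first formulation with SciPy’s odeint, for τ = 0 and f(u) = u(1 - u) (so u'' + V u' + f(u) = 0 along ξ = x - Vt), contrasting a node (V ≥ 2) with a spiral (V < 2); the initial condition is an illustrative small perturbation off the saddle at (1, 0):

```python
import numpy as np
from scipy.integrate import odeint

def rhs(y, xi, V):
    u, v = y                      # v = u'
    return [v, -V*v - u*(1.0 - u)]

xi = np.linspace(0.0, 80.0, 4001)
y0 = [1.0 - 1e-6, -1e-6]          # just off the saddle (1, 0)
u_fast = odeint(rhs, y0, xi, args=(3.0,))[:, 0]  # V = 3 >= 2: stable node at 0
u_slow = odeint(rhs, y0, xi, args=(1.0,))[:, 0]  # V = 1 <  2: stable spiral at 0

print(u_fast.min())               # stays (numerically) non-negative
print(u_slow.min())               # dips below zero: unphysical oscillation
```

The V = 3 profile decreases monotonically to zero, while the V = 1 profile oscillates about zero and goes negative, which is exactly the distinction drawn in the Figure 4 discussion below.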
Figure 4 displays some examples of the phase plane and numerical solutions of u. The fixed points are along the horizontal axis in the left panels at (0, 0) and (1, 0). They are marked with red dots in the figure. The heteroclinic trajectory connecting them in the phase plane is the travelling wave solution along which u decreases from 1 to 0. It is clear that when V < 2, the point (0, 0) is a stable spiral and (1, 0) is a saddle point, and when V ≥ 2, (0, 0) becomes a stable node while (1, 0) remains a saddle point. In both cases, starting very close to the saddle point along the heteroclinic trajectory and moving toward the attracting fixed point, u starts at 1, but when the attracting fixed point is a spiral, the profile oscillates about 0 before ending up at that fixed point (top right panel), while when the fixed point at zero is a node, the profile decreases monotonically without oscillation (bottom right panel).

Figure 4. Two examples of the phase plane and numerical solution of system (4) with different velocities V. The first example shows the phase plane for the case V = 0.5, and the numerical solution with the velocities V = 0.3, 0.6, 0.9, 1.2, 1.5; the second example shows the phase plane for V = 2 and its numerical solution with the velocities V = 2, 4, 6, 8, 10.

If u represents a density variable that cannot be negative, only the latter case is physical. That is the reason we restrict our attention to the traveling wave velocity range V ≥ 2.
In Figure 5 we show another solution, using the density-flux formulation, for a different set of parameter values. When starting very close to (1, 0), it takes a much longer range of ξ to see the variation of u toward zero in this case.
3.2. Parameter Estimation
3.2.1. Type 1 Inverse Problem
The Type 1 problem is the one where both u and q profiles are given. As explained earlier, we take that portion of the density profile u(ξ) for which u goes from 0.95 to 0.05. In our tests, we focus on the parameter ranges given in Table 2.
Figure 5. Numerical solution of u and q.
Table 2. Type 1 inverse problem parameters test table.
Table 3. Type 1 inverse problem performance table.
Overall, there are 5000 cases in this experiment, and for each case, there may be a few hundred to a few thousand points for the specific range of u and the corresponding q. Table 3 gives the root mean square error values in the estimates of the three parameters over all the cases. The performance is excellent. However, if the data comes not from a numerical but from a physical experiment in which only the density u and not the flux q can be measured in a traveling wave front, the parameter estimation process is more challenging. We discuss that case as the Type 2 inverse problem later in this section.
To further explore the characteristics of the method, in the following three figures we show the performance in more detail in cases where two of the three parameters V, τ, and b are fixed while the third varies.

In Figure 6, τ and b are fixed while V is allowed to range from 2 to 10. The three panels compare the predicted values of the three parameters to their true counterparts. The agreement is very good. Note that in the rightmost panel, the vertical axis shows the difference between the true and predicted values, with the range being from 0 to 10^{-6}, indicating excellent agreement. This is indicated by the notation above the top left of the plot. Figure 7 and Figure 8 show similar comparisons where τ and b are respectively varied while the other two parameters are fixed. In the panels with the notation above the top left of the plot, the difference between true and predicted values is being plotted.
3.2.2. Type 2 Inverse Problem
In this part, in addition to characterizing the performance of the method, we also apply an epistemic uncertainty analysis on velocity.

Figure 6. Type 1 inverse problem performance where τ and b are fixed while V varies.

Figure 7. Type 1 inverse problem performance where V and b are fixed while τ varies.

Figure 8. Type 1 inverse problem performance where V and τ are fixed while b varies.

To do so, we designed two test data sets: the velocities of one set are within the same range as the training set but with different values, while for the other test set the velocities are outside the training range, as listed in Table 4. The goal is to see how well the method performs in predicting velocity values that fall outside the range on which the method has been trained.
In this experiment, there are 2500 cases in the training set, and 300 cases in each of the inside and outside test sets. For each numerical solution, we have a few hundred to a few thousand data points in the chosen range of u, depending on the extent of ξ needed to obtain the full traveling front profile.
The RMSE errors for both the optimization solver (see Table 5) and the embedding method (see Table 6) are quite small, though the ones for the embedding method are one or two orders of magnitude larger. However, as we mentioned above, the time savings associated with the embedding method may make it a viable alternative to the original optimization approach. Both methods are successful in predicting velocities outside the range on which they were trained. In Figures 9-11 we provide comparisons of true and predicted values of the parameters when two of them are fixed while the third varies, as we did with the Type 1 inverse problem. Keep in mind that in the plots with a notation above them, it is the difference between true and predicted values that is plotted. For the embedding method, the inside and outside test performances are illustrated in Figure 12 and Figure 13.

Table 4. Parameter ranges for training and test sets.

Table 5. Optimization solver inside/outside test performance table.

Table 6. Embedding method inside/outside test performance table.

Figure 9. Type 2 inverse problem, optimization method, where τ and b are fixed while V varies.

Figure 10. Type 2 inverse problem, optimization method, where V and b are fixed while τ varies.

Figure 11. Type 2 inverse problem, optimization method, where V and τ are fixed while b varies.
Figure 12. Type 2 inverse problem, embedding method: inside test performance.

Figure 13. Type 2 inverse problem, embedding method: outside test performance.

Overall, compared to the optimization problem solver, the performance of the embedding method is not as good. However, both methods are acceptable, with errors less than 10^{-3} for both inside and outside test cases. The embedding method is considerably faster.
4. Discussion
With a modified Fisher’s partial differential equation (PDE) as our test case, we have shown how the three parameters in this PDE can be inferred once a numerical solution of the traveling wave profile is given. Ultimately, the goal is to be able to take experimental data on a traveling front and successfully determine the PDE that gave rise to that profile.
In this paper, as a proof of principle, we used a numerical solution of the PDE as a surrogate for actual experimental data. The traveling front can be characterized in terms of a density profile that varies smoothly from one to zero across some distance (at a fixed time), or in time (at a fixed spatial location), and also a flux function that appears in the conservation equation for the density field. In most experiments, it is more likely that the density field is the only observable quantity, while the flux is not directly measured. As such, we considered two types of inverse problems: one in which both the density and flux profiles are known, and one where only the density profile is given. In the former case, a fairly robust method based on minimization of the variance of certain functions related to the density and flux could be used to infer the parameters accurately. When only the density profile is available, the inverse problem is more challenging, but using methods from machine learning, we were able to obtain satisfactory estimates of the parameters from the solution. In our experiments, the number of points in the range where significant changes in the solution u occur plays a crucial role in achieving low error, particularly when solving optimization problems (10) and (11). Similarly, for the embedding method, building the second intercept-variable training dataset with a higher density for each parameter leads to higher accuracy, but it also results in a longer training process.
Overall, the performance of the parameter estimation methods for both cases is acceptable. However, there is still room for improvement in our model. Firstly, we did not introduce any extra noise in our data (there is some noise from the linear interpolation approximation); therefore, to study the effect of noise, future studies can add extra white noise to the solution u before trying to infer the parameters. Secondly, we did not have access to actual traveling front data from a physical experiment to apply our method. Future work will focus on experimental systems with traveling fronts or waves, with the goal of finding the parameters in the PDE that gives rise to that wave, or the form of the PDE itself. Another research direction is inverse problems based on the solutions of PDEs directly. In this kind of problem, parameters can be estimated by solving the linear equation u_t = Θ(u) w, where Θ(u) represents the library of terms related to u and its spatial derivatives that may appear on the right-hand side of the time-evolution equation for u, and w represents the coefficient vector for the terms in that library. After estimating the coefficients, forward-problem solvers can be used to verify the accuracy of the solution.
Acknowledgements
We are deeply grateful to Professor Marina Chugunova for her helpful comments on this paper.
Appendix

A. Convexity analysis of the objective in (10)

First, transform the variance of V(b) into a quadratic form in b. To simplify notation, define the vectors a = (q' - u(1 - u))/u' and c = -u^2(1 - u)/u' (evaluated componentwise at the n data points), and y = a + b c, so that the pointwise values of V correspond to y. For the sample variance S^2 and sample mean ȳ, we have:

Sample variance: S^2 = (1/n) || y - ȳ 1 ||_2^2

Sample mean: ȳ = (1/n) 1^T y

where || · ||_2 is the 2-norm, and 1 is the n × 1 column vector all of whose elements are 1. With that:

S^2 = (1/n) || (I - (1/n) 1 1^T) y ||_2^2.

Next define D = I - (1/n) 1 1^T. It is clear that D is a symmetric matrix (in fact an orthogonal projection, so D^T D = D). Then, we have:

n S^2(b) = (a + b c)^T D (a + b c) = a^T D a + 2 b a^T D c + b^2 c^T D c.

To simplify the notation, define constants α = c^T D c, β = a^T D c, γ = a^T D a, so the function being minimized can be written as G(b) = γ + 2βb + αb^2, and since G''(b) = 2α = 2 || D c ||_2^2 ≥ 0, by the second-order derivative condition for convexity, G is a convex function of b. Next, we would like to check whether G is strongly convex. To do that, note that c is not a constant vector along a traveling front, so D c ≠ 0 and G''(b) = 2 || D c ||_2^2 > 0. Therefore, there exists a positive constant m (namely m = 2 || D c ||_2^2) such that:

G(b_2) ≥ G(b_1) + G'(b_1)(b_2 - b_1) + (m/2)(b_2 - b_1)^2 for all b_1, b_2.

Then, this function is strongly convex, and there always exists a unique solution for the following optimization problem:

b̂ = argmin_b G(b).
B. Convexity analysis of the objective in (11) when V is fixed

First, define the vector y(τ) of pointwise values of b(V, τ), with V held fixed at (or near) the true velocity. From the previous proof, we have n S^2 = y^T D y with D = I - (1/n) 1 1^T symmetric; therefore, the same variance expression applies in this case. Since V is fixed, V^2 is a fixed constant, and τ V^2 is non-negative. Therefore, to minimize the variance, instead of working with the original function directly, we consider a transformed objective G(τ). Next, from our numerical solutions, τ V^2 < 1 is always true (this is the parameter restriction τ < 1/V^2). Therefore we can apply a geometric series in our equation, so that 1/(1 - τV^2) can be written as:

1/(1 - τV^2) = Σ_{k=0}^{∞} (τV^2)^k.

Truncating this series leads to a new minimization function G(τ) that is quadratic in τ. By the second-order derivative condition for convexity, G is a convex function. Next, let us prove that G is strongly convex. As in the previous proof, we define the analogous constants, and there exists a positive constant m such that:

G(τ_2) ≥ G(τ_1) + G'(τ_1)(τ_2 - τ_1) + (m/2)(τ_2 - τ_1)^2 for all τ_1, τ_2.

Therefore, the function G is strongly convex, and instead of the original optimization problem, we can solve the following problem, which gives the same results:

τ̂ = argmin_τ G(τ).