Global Convergence Property with Inexact Line Search for a New Hybrid Conjugate Gradient Method

DOI: 10.4236/oalib.1106048   PDF   HTML     191 Downloads   341 Views   Citations

In this study, we derive a new scale parameter φ for the CG method, for solving large scale unconstrained optimization algorithms. The new scale parameter φ satisfies the sufficient descent condition, global convergence analysis proved under Strong Wolfe line search conditions. Our numerical results show that the proposed method is effective and robust against some known algorithms.

Keywords

Share and Cite:

Al-Namat, F. and Al-Naemi, G. (2020) Global Convergence Property with Inexact Line Search for a New Hybrid Conjugate Gradient Method. Open Access Library Journal, 7, 1-14. doi: 10.4236/oalib.1106048. 1. Introduction

In unconstrained optimization, we minimize an objective function that depends on real variables with no restrictions at all on the value of these variables. The unconstrained optimization problem is stated by:

${\mathrm{min}}_{x\in {R}^{n}}f\left(x\right)$ (1)

where $x\in {R}^{n}$ is a real vector with $n\ge 1$ component and $f:{R}^{n}\to R$ is a smooth function and its gradient g is available . A nonlinear conjugate gradient method generates a sequence ${x}_{k}$ Starting from an initial guess ${x}_{0}\in {R}^{n}$ Using the recurrence

${x}_{k+1}={x}_{k}+{\alpha }_{k}{d}_{k},\text{\hspace{0.17em}}\text{\hspace{0.17em}}k=0,1,2,\cdots$ (2)

where ${\alpha }_{k}$ is the positive step size obtained by carrying out a one dimensional search, known as the line searches . Among them, the so-called strong wolf line search conditions require that  .

$f\left({x}_{k}+{\alpha }_{k}{d}_{k}\right)\le f\left({x}_{k}\right)+\sigma {\alpha }_{k}{d}_{k}$, (3)

$|g\left({x}_{k}+{\alpha }_{k}{d}_{k}\right)|\le \delta |{g}_{k}^{\text{T}}{d}_{k}|$ (4)

where $0<\sigma <\delta <\text{1}$, is to find an approximation of ${\alpha }_{k}$ where the descent property must be satisfied and no longer searching in the direction when ${x}_{k}$ is far from the solution. Thus by strong Wolfe line search conditions we in herit the advantages of exact line search with inexpensive and low computational cost .

The search direction ${d}_{k}$ is generated by:

${d}_{k}=\left\{\begin{array}{l}-{g}_{k},\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}k=1\\ -{g}_{k}+{\beta }_{k}{d}_{k},\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}k>1\end{array}$ (5)

where ${g}_{k}$ and ${\beta }_{k}$ is the gradient and conjugate gradient coefficient of f(x) respectively at the point ${x}_{k}$. The different choices for the parameter ${\beta }_{k}$ correspond to different conjugate gradient methods. The most popular formulas for ${\beta }_{k}$ is Hestenes Stiefel method (HS), Fletcher-Reeves method (FR), Polak-Ribiere- Polyak method (PR), conjugate―Descent method (CD), Liu―Storey method (LS), and Dai-Yuan method (DY), etc

These methods are identical when f is a strongly convex quadratic function and the line search is exact, since the gradient are mutually orthogonal, and the parameters ${\beta }_{k}$ in these methods are equal. When applied to general nonlinear function with inexact line searches, however, the behavior of these methods is marked different . We are going to summarize some well known conjugate gradient method in Table 1.

An important class of conjugate gradient methods is the hybrid conjugate gradient algorithms. The hybrid computational schemes perform better than the classical conjugate gradient methods. They are defined by (2) and (5) where the parameter ${\beta }_{k}$ is computed as projections or as convex combinations of different conjugate gradient methods .

We are going to summarize some well known hybrid conjugate gradient method in Table 2.

We propose a new hybrid CG method based on combination of MMWU  and RMAR  conjugate gradient methods for solving unconstrained optimization method with suitable conditions. The corresponding conjugate gradient parameters are

${B}_{k}^{MMWU}=\frac{{‖{g}_{k+1}‖}^{2}}{{‖{d}_{k}‖}^{2}}$ (6)

and

${\beta }_{k}^{RMAR}=\frac{{‖{g}_{k+1}‖}^{2}-\frac{‖{g}_{k+1}‖}{‖{d}_{k}‖}{g}_{k+1}^{\text{T}}{d}_{k}}{{‖{d}_{k}‖}^{2}}$ (7)

Table 1. Some well known conjugate gradient coefficients.

Table 2. Hybrid conjugate gradient methods.

We defined the parameter ${\beta }_{k}$ in the proposed method by:

${\beta }_{k}^{FG}=\left(1-{\phi }_{k}\right){\beta }_{k}^{MMWU}+{\phi }_{k}{\beta }_{k}^{RMAR}$ (8)

Observe that if ${\phi }_{k}=0$, then ${\beta }_{k}^{FG}={\beta }_{k}^{MMWU}$, and if ${\phi }_{k}=1$, then ${\beta }_{k}^{FG}={\beta }_{k}^{RMAR}$.

By choosing the appropriate value of the parameter ${\phi }_{k}$ In the convex combination, the search direction ${d}_{k}$ of our algorithm not only is the Newton direction, but also satisfies the famous DL conjugate condition proposed by Dai and Liao . Under the strong Wolfe line search conditions, we prove the global convergence of our algorithm. The numerical results also show the feasibility and effectiveness of our algorithm.

This paper is organized as follows. Section 2 we introduce our new hybrid conjugate gradient method (HFG), and we obtain the parameter ${\phi }_{k}$ using some approaches and give us a specific algorithm. Section 3, we prove that it generates direction satisfying the sufficient descent condition under strong Wolfe line search conditions. The global convergence property of the proposed method is established in Section 4. Some numerical results are reported in Section 5.

2. A New Hybrid Conjugate Gradient Method

In this section, we will describe a new proposed hybrid conjugate gradient method. In order to obtain the sufficient descent direction, we will compute ${\phi }_{k}$ as follows. We combine ${\beta }_{k}^{MMWU}$ and ${\beta }_{k}^{RMAR}$ in a convex combination in order to have a good algorithm for unconstrained optimization.

The direction ${d}_{k+1}$ is generated by the rule

${d}_{k+1}=-{g}_{k+1}+{\beta }_{k}^{HFG}{d}_{k}$ (9)

where ${\beta }_{k}^{HFG}$ defined in (8), the iterates ${x}_{1},{x}_{2},{x}_{3},\cdots$ of our method are computed by means of the recurrence (2), where the step size ${\alpha }_{k}$ Is determined according to the strong Wolf conditions (3) and (4).

The scale parameter ${\phi }_{k}$ satisfying $0\le {\phi }_{k}\le 1$, which will be determined in a specific way to be described later. Observe that if ${\phi }_{k}=0$, then ${\beta }_{k}^{HFG}={\beta }_{k}^{MMWU}$, and

If ${\phi }_{k}=1$, then ${\beta }_{k}^{HFG}={\beta }_{k}^{RMAR}$. On the other hand, if $0<{\phi }_{k}<1$, then ${\beta }_{k}^{HFG}$ is a convex combination of ${\beta }_{k}^{MMWU}$ and ${\beta }_{k}^{RMAR}$.

From (8) and (9) it is obvious that:

${d}_{k+1}=\left\{\begin{array}{l}-{g}_{k+1},\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}k=1\\ -{g}_{k+1}+\left(1-{\phi }_{k}\right)\frac{{‖{g}_{k+1}‖}^{2}}{{‖{d}_{k}‖}^{2}}{d}_{k}+{\phi }_{k}\frac{{‖{g}_{k+1}‖}^{2}-\frac{‖{g}_{k+1}‖}{‖{d}_{k}‖}{g}_{k+1}^{\text{T}}{d}_{k}}{{‖{d}_{k}‖}^{2}}{d}_{k},\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{ }\text{ }k>1\end{array}$, (10)

Our motivation to select the parameter ${\phi }_{k}$ in such a manner that the defection ${d}_{k+1}$ given in (10) is equal to the Newton direction ${d}_{k+1}^{N}=-{\nabla }^{2}f{\left({x}_{k+1}\right)}^{-1}{g}_{k+1}$. There for

$\begin{array}{l}-{\nabla }^{2}f{\left({x}_{k+1}\right)}^{-1}{g}_{k+1}\\ =-{g}_{k+1}+\left(1-{\phi }_{k}\right)\frac{{‖{g}_{k+1}‖}^{2}}{{‖{d}_{k}‖}^{2}}{d}_{k}+{\phi }_{k}\frac{{‖{g}_{k+1}‖}^{2}-\frac{‖{g}_{k+1}‖}{‖{d}_{k}‖}{g}_{k+1}^{\text{T}}{d}_{k}}{{‖{d}_{k}‖}^{2}}{d}_{k}\end{array}$ (11)

Now multiplying (11) by ${s}_{k}^{\text{T}}{\nabla }^{2}f\left({x}_{k+1}\right)$ from the left, we get

$\begin{array}{c}-{s}_{k}^{\text{T}}{g}_{k+1}=-{s}_{K}^{\text{T}}{\nabla }^{2}f\left({x}_{k+1}\right){g}_{k+1}+\left(1-{\phi }_{k}\right)\frac{{‖{g}_{k+1}‖}^{2}}{{‖{d}_{k}‖}^{2}}{s}_{K}^{\text{T}}{\nabla }^{2}f\left({x}_{k+1}\right){d}_{k}\\ \text{\hspace{0.17em}}\text{\hspace{0.17em}}+{\phi }_{k}\frac{{‖{g}_{k+1}‖}^{2}-\frac{‖{g}_{k+1}‖}{‖{d}_{k}‖}{g}_{k+1}^{\text{T}}{d}_{k}}{{‖{d}_{k}‖}^{2}}{s}_{k}^{\text{T}}{\nabla }^{2}f\left({x}_{k+1}\right){d}_{k}\end{array}$

Therefore, in order to have an algorithm for solving large scale problems we assume that pair $\left({s}_{k},{y}_{k}\right)$ satisfies the secant equation

${\nabla }^{2}f\left({x}_{k+1}\right){s}_{k}={y}_{k}.$ (12)

From (12), we get

${s}_{k}^{\text{T}}{\nabla }^{2}f\left({x}_{k+1}\right)={y}_{k}^{\text{T}}.$

Denoting ${\phi }_{k}^{FG}={\phi }_{k}$ we get

$-{s}_{k}^{\text{T}}{g}_{k+1}=-{y}_{K}^{\text{T}}{g}_{k+1}+\frac{{‖{g}_{k+1}‖}^{2}}{{‖{d}_{k}‖}^{2}}{y}_{k}^{\text{T}}{d}_{k}+{\phi }_{k}^{FG}\left(\frac{‖{g}_{k+1}‖\left({g}_{k+1}^{\text{T}}{d}_{k}\right)}{{‖{d}_{k}‖}^{2}}\right)\left({y}_{k}^{\text{T}}{d}_{k}\right)$

after some algebra, we get

${\phi }_{k}^{FG}=\frac{\left({s}_{k}^{\text{T}}{g}_{k+1}-{y}_{k}^{\text{T}}{g}_{k+1}\right)\cdot {‖{d}_{k}‖}^{3}+{‖{g}_{k+1}‖}^{2}\cdot ‖{d}_{k}‖\left({y}_{k}^{\text{T}}{d}_{k}\right)}{‖{g}_{k+1}‖\cdot \left({g}_{k+1}^{\text{T}}{d}_{k}\right)\cdot \left({y}_{k}^{\text{T}}{d}_{k}\right)}$ (13)

Now, we specify a complete hybrid conjugate gradient method (HFG) which posses some nice properties of conjugate gradient and Newton method.

Algorithm HFG

Step 1: Select ${x}_{0}\in {R}^{n}$, $\in \text{\hspace{0.17em}}>0$, set $k=0$. Compute $f\left({x}_{0}\right)$ and ${g}_{0}=-\nabla f\left({x}_{0}\right)$, set ${d}_{0}=-{g}_{0}$.

Step 2: Test the stopping criteria, i.e. if $‖{g}_{k}‖\le \text{\hspace{0.17em}}\in$, then stop.

Step 3: Compute ${\alpha }_{k}$ by strong Wolfe line search conditions in (3) & (4).

Step 4: Compute ${x}_{k+1}={x}_{k}+{\alpha }_{k}{d}_{k}$, ${g}_{k+1}=g\left({x}_{k+1}\right)$. Compute ${s}_{k}={x}_{k+1}-{x}_{k}$ And ${y}_{k}={g}_{k+1}-{g}_{k}$

Step 5: If ${\phi }_{k}\ge 1$ then set ${\phi }_{k}=1$. If ${\phi }_{k}\le 0$, then set ${\phi }_{k}=0$, otherwise compute ${\phi }_{k}$ as (13).

Step 6: Compute ${\beta }_{k}^{FG}$ by (8).

Step 7: Generate $d=-{g}_{k+1}+{\beta }_{k}^{FG}{d}_{k}$

Step 8: If the restart criteria of Powell $|{g}_{k}^{\text{T}}{g}_{k}|\ge 0.2{‖{g}_{k+1}‖}^{2}$, is satisfied, then set ${d}_{k}=-{g}_{k+1}$, otherwise define ${d}_{k+1}=d$

Step 9: Set $k=k+1$, and continue with step 2.

3. The Sufficient Descent Condition

In this section, we are going to apply the following theorem to illustrate that the search direction ${d}_{k}$ Obtained by hybrid FG satisfies the sufficient descent condition which plays Avit of role in analyzing the global convergence.

For further considerations we need the following assumptions

3.1. Assumption

The level sets $S=\left\{x\in {R}^{n},f\left({x}_{n}\right)\right\}$ are bounded.

3.2. Assumption

In a neighborhood N of S, the function f is continuously differentiable and its gradient is Lipschitz continuous, i.e., there exists a constant $L>0$, such that

$‖\nabla f\left(x\right)-\nabla f\left(y\right)‖\le L‖x-y‖,\text{\hspace{0.17em}}\forall x,y\in N$

Under these assumptions of if there exists a positive constant ( $\gamma ,\stackrel{¯}{\gamma },\omega$ & $\stackrel{¯}{\omega }$ ) & such that

$\stackrel{¯}{\gamma }\le ‖{g}_{k+1}‖\le \gamma \text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{and}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\stackrel{¯}{\omega }\le ‖{g}_{k}‖\le \omega ,\text{\hspace{0.17em}}\text{\hspace{0.17em}}\forall x\in S$ .

Theorem.

Let the sequences $\left\{{g}_{k}\right\}$ and $\left\{{d}_{k}\right\}$ be generated by a hybrid FG method. Then the search direction ${d}_{k}$ satisfies the sufficient descent condition:

${g}_{k+1}^{\text{T}}{d}_{k+1}\le -\mu {‖{g}_{k+1}‖}^{2},\text{\hspace{0.17em}}\text{\hspace{0.17em}}\forall \mu \ge 0$ (14)

where $\mu =1-\left({E}_{4}-{E}_{3}\right)$, with $0<\left({E}_{4}-{E}_{3}\right)<1$.

Proof. We shall show that ${d}_{k}$ satisfies the sufficient descent condition holds for $k=0$, the proof is a trivial one, i.e. ${d}_{0}=-{g}_{0}$ and so ${g}_{0}^{\text{T}}{d}_{0}=-{‖{g}_{0}‖}^{2}$. Now we have

${d}_{k+1}=-{g}_{k+1}+{\beta }_{k}^{FG}{d}_{k},$

i.e.

${d}_{k+1}=-{g}_{k+1}+\left[\left(1-{\phi }_{k}\right){\beta }_{k}^{MMWU}+{\phi }_{k}{\beta }_{k}^{RMAR}\right]{d}_{k}$

We can rewrite the above direction by the following manner:

${d}_{k+1}=-\left({\phi }_{k}{g}_{k+1}+\left(1-{\phi }_{k}\right){g}_{k+1}\right)+\left(\left(1-{\phi }_{k}\right){\beta }_{k}^{MMWU}+{\phi }_{k}{\beta }_{k}^{RMAR}\right){d}_{k}$.

So,

${d}_{k+1}={\phi }_{k}\left(-{g}_{k+1}+{\beta }_{k}^{RMAR}{d}_{k}\right)+\left(1-{\phi }_{k}\right)\left(-{g}_{k}+{\beta }_{k}^{MMWU}{d}_{k}\right)$,

After some arrangement, we get

${d}_{k+1}={\phi }_{k}{d}_{k+1}^{RMAR}+\left(1-{\phi }_{k}\right){d}_{k+1}^{MMWU}$ (15)

Multiplying (15) by ${g}_{k+1}^{\text{T}}$ from the left, we get

${g}_{k+1}^{\text{T}}{d}_{k+1}={\phi }_{k}{g}_{k+1}^{\text{T}}{d}_{k+1}^{RMAR}+\left(1-{\phi }_{k}\right){g}_{k+1}^{\text{T}}{d}_{k}^{MMWU}$

Firstly, if ${\phi }_{k}=0$, then ${d}_{k+1}={d}_{k+1}^{MMWU}$, we are going to prove that the sufficient descent condition holds for MMWU method in the presence of the strong Wolfe line search condition, because in  they proved this method satisfied the sufficient descent condition with exact line search.

i.e.

${g}_{k+1}^{\text{T}}{d}_{k+1}^{MMWU}=-{‖{g}_{k+1}‖}^{2}+\frac{{‖{g}_{k+1}‖}^{2}}{{‖{d}_{k}‖}^{2}}{g}_{k+1}^{\text{T}}{d}_{k}$ (16)

Since,

${g}_{k+1}^{\text{T}}{d}_{k}\le {y}_{k}^{\text{T}}{d}_{k}$ and ${y}_{k}^{\text{T}}{d}_{k}\le {\alpha }_{k}L{‖{d}_{k}‖}^{2}$ (17)

Applications (17) in (16), we get

$\begin{array}{c}{g}_{k+1}^{\text{T}}{d}_{k+1}^{MMWU}\le -{‖{g}_{k+1}‖}^{2}+\frac{{‖{g}_{k+1}‖}^{2}}{{‖{d}_{k}‖}^{2}}{\alpha }_{k}L{‖{d}_{k}‖}^{2}\\ =-\left(1-{\alpha }_{k}L\right){‖{g}_{k+1}‖}^{2}\\ =-{E}_{1}{‖{g}_{k+1}‖}^{2}\end{array}$ (18)

where ${E}_{1}=\left(1-{\alpha }_{k}L\right)>0$, with $0<{\alpha }_{k}L<1$.

So, it is proved that ${d}_{k+1}^{MMWU}$ satisfies the sufficient descent condition.

Now let ${\phi }_{k}=1$ then ${d}_{k}={d}_{k}^{RMAR}$, we are going to prove that the sufficient descent condition holds for RMAR method in the presence of the strong Wolfe line search condition because in  they proved this method satisfied the sufficient descent condition with exact line search.

${d}_{k+1}^{RMAR}=-{g}_{k+1}+{\beta }_{k}^{RMAR}{d}_{k}$

Multiplying the above equation from left by ${g}_{k+1}^{\text{T}}$ we get

${g}_{k+1}^{\text{T}}{d}_{k+1}^{RMAR}=-{‖{g}_{k+1}‖}^{2}+\frac{{‖{g}_{k+1}‖}^{2}-\frac{‖{g}_{k+1}‖}{‖{d}_{k}‖}{g}_{k+1}^{\text{T}}{d}_{k}}{{‖{d}_{k}‖}^{2}}{g}_{k+1}^{\text{T}}{d}_{k}$.

In , they proved that

$0\le \frac{{‖{g}_{k+1}‖}^{2}-\frac{‖{g}_{k+1}‖}{‖{d}_{k}‖}{g}_{k+1}^{\text{T}}{d}_{k}}{{‖{d}_{k}‖}^{2}}\le 2\frac{{‖{g}_{k+1}‖}^{2}}{{‖{d}_{k}‖}^{2}}$ (19)

Used (17), and (19) the direction become

$\begin{array}{c}{g}_{k+1}^{\text{T}}{d}_{k+1}\le -{‖{g}_{k+1}‖}^{2}+2{\alpha }_{k}L{‖{g}_{k+1}‖}^{2}\\ =-\left(1-2{\alpha }_{k}L\right)\cdot {‖{g}_{k+1}‖}^{2}\\ =-{E}_{2}\cdot {‖{g}_{k+1}‖}^{2}\end{array}$ (20)

where ${E}_{2}=\left(1-2{\alpha }_{k}L\right)>0$ with $0<2{\alpha }_{k}L<1$ and $0.

So, it is proved that ${d}_{k+1}^{RMAR}$ satisfied the sufficient descent condition.

Now, we are going to prove the direction satisfy the sufficient descent condition when $0<{\phi }_{k}<1$, firstly for

$\begin{array}{l}\left(1-{\phi }_{k}\right){\beta }_{K}^{MMWU}{g}_{k+1}^{\text{T}}{d}_{k}\\ =\frac{{‖{g}_{k+1}‖}^{2}}{{‖{d}_{k}‖}^{2}}{g}_{k}^{\text{T}}{d}_{k}-\left[\frac{{s}_{k}^{\text{T}}{g}_{k+1}{‖{d}_{k}‖}^{3}-{y}_{k}^{\text{T}}{g}_{k+1}{‖{d}_{k}‖}^{3}+{‖{g}_{k+1}‖}^{2}‖{d}_{k}‖{y}_{k}^{\text{T}}{d}_{k}}{‖{g}_{k+1}‖\left({g}_{k+1}^{\text{T}}{d}_{k}\right){y}_{k}^{\text{T}}{d}_{k}}\right]\ast \frac{{‖{g}_{k+1}‖}^{2}}{{‖{d}_{k}‖}^{2}}{g}_{k+1}^{\text{T}}{d}_{k}\end{array}$

We have from Lipschitz condition ${g}_{k+1}^{\text{T}}{d}_{k}<{y}_{k}^{\text{T}}{d}_{k}$ and

$-\left(1-\sigma \right)‖{g}_{k}‖\le {y}_{k}^{\text{T}}{d}_{k}\le {\alpha }_{k}L{‖{d}_{k}‖}^{2}$

with a mathematical calculation, we get

$\begin{array}{l}\left(1-{\phi }_{k}\right){\beta }_{k}^{MMWU}{g}_{k+1}^{\text{T}}{d}_{k}\\ \le \left[\frac{{\alpha }_{k}L{‖{d}_{k}‖}^{2}}{{‖{d}_{k}‖}^{2}}-\frac{‖{s}_{k}‖‖{g}_{k+1}‖‖{d}_{k}‖-{\alpha }_{k}L{‖{d}_{k}‖}^{2}+{‖{g}_{k+1}‖}^{2}{\alpha }_{k}L‖{d}_{k}‖}{‖{g}_{k+1}‖\cdot \left(-\left(1-\sigma \right)‖{g}_{k}‖\right)}\right]{‖{g}_{k+1}‖}^{2}\\ \le \left[{\alpha }_{k}L+\frac{L}{\left(1-\sigma \right)}\frac{{‖{s}_{k}‖}^{2}‖{d}_{k}‖-{\alpha }_{k}L{‖{d}_{k}‖}^{3}+{‖{g}_{k+1}‖}^{2}‖{d}_{k}‖}{‖{g}_{k+1}‖{‖{g}_{k}‖}^{2}}\right]{‖{g}_{k+1}‖}^{2}\\ \le \left[{\alpha }_{k}L+\frac{A\gamma B-{\alpha }_{k}L{B}^{2}+{Y}^{2}{\alpha }_{k}LB}{\left(1-\sigma \right)\stackrel{¯}{Y}{\stackrel{¯}{W}}^{2}}\right]{‖{g}_{k+1}‖}^{2}\end{array}$

Let ${E}_{1}={\alpha }_{k}L+\frac{LAB-{\alpha }_{k}L{B}^{3}+{\gamma }^{2}{\alpha }_{k}LB}{\left(1-\sigma \right)\stackrel{¯}{\gamma }\text{ }{\stackrel{¯}{\omega }}^{2}}$

$\therefore \text{\hspace{0.17em}}\left(1-{\phi }_{k}\right){\beta }^{MMWU}{g}_{k+1}^{\text{T}}{d}_{k}\le {E}_{3}{‖{g}_{k+1}‖}^{2}$ (21)

Now, secondly for

$\begin{array}{l}{\phi }_{k}{\beta }_{k}^{RMAR}{g}_{k+1}^{\text{T}}{d}_{k}\\ =\left[\frac{{s}_{k}^{\text{T}}{g}_{k+1}{‖{d}_{k}‖}^{3}-{y}_{k}^{\text{T}}{g}_{k+1}{‖{d}_{k}‖}^{3}+{‖{g}_{k+1}‖}^{2}‖{d}_{k}‖{y}_{k}^{\text{T}}‖{d}_{k}‖}{‖{g}_{k+1}‖\left({g}_{k+1}^{\text{T}}{d}_{k}\right)\left({y}_{k}^{\text{T}}{d}_{k}\right)}\right]\\ \text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{ }\text{ }\cdot \left[\frac{{‖{g}_{k+1}‖}^{2}-\frac{‖{g}_{k+1}‖}{‖{d}_{k}‖}{g}_{k+1}^{\text{T}}{d}_{k}}{{‖{d}_{k}‖}^{2}}\right]{g}_{k+1}^{\text{T}}{d}_{k}\end{array}$

From (19), Lipschitz condition ${s}_{k}^{\text{T}}{g}_{k+1}\le {y}_{k}^{\text{T}}{s}_{k}\le L{‖{s}_{k}‖}^{2}$ and ${s}_{k}={\alpha }_{k}{d}_{k}$, we get

${\phi }_{k}{\beta }_{k}^{RMAR}{g}_{k+1}^{\text{T}}{d}_{k}=2\left[\frac{L{‖{s}_{k}‖}^{2}‖{d}_{k}‖-‖{y}_{k}‖‖{g}_{k+1}‖{‖{d}_{k}‖}^{3}+{\alpha }_{k}L{‖{g}_{k+1}‖}^{2}{‖{d}_{k}‖}^{2}}{{‖{g}_{k+1}‖}^{2}\left(-\left(1-\sigma \right)\right){‖{g}_{k}‖}^{2}{‖{d}_{k}‖}^{2}}\right]\cdot {‖{g}_{k+1}‖}^{2}$

Since $‖{y}_{k}‖\le ‖{g}_{k+1}‖+‖{g}_{k}‖$, so

$\begin{array}{l}{\phi }_{k}{\beta }_{k}^{RMAR}{g}_{k+1}^{\text{T}}{d}_{k}\\ \le \frac{-2}{1-\sigma }\left[\frac{L{‖{s}_{k}‖}^{2}‖{d}_{k}‖-0.8{‖{g}_{k+1}‖}^{2}‖{d}_{k}‖+{\alpha }_{k}L{‖{g}_{k+1}‖}^{2}‖{d}_{k}‖}{{‖{g}_{k+1}‖}^{2}{‖{g}_{k}‖}^{2}}\right]\cdot {‖{g}_{k+1}‖}^{2}\\ \le \frac{-2B}{1-\sigma }\left[\frac{LA-0.8{\gamma }^{2}+{\alpha }_{k}L{\omega }^{2}}{\stackrel{¯}{\gamma }\text{ }{\stackrel{¯}{\omega }}^{2}}\right]\cdot {‖{g}_{k+1}‖}^{2}\end{array}$

where ${E}_{4}=\frac{2B}{1-\sigma }\left[\frac{LA-0.8{\gamma }^{2}+{\alpha }_{k}L{\omega }^{2}}{\stackrel{¯}{\gamma }\text{ }{\stackrel{¯}{\omega }}^{2}}\right]$

$\therefore \text{\hspace{0.17em}}{\phi }_{k}{\beta }_{k}^{RMAR}{g}_{k+1}^{\text{T}}{d}_{k}\le -{E}_{4}{‖{g}_{k+1}‖}^{2}$ (22)

From (18), (20), (21) and (22) we get

$\begin{array}{c}{g}_{k+1}^{\text{T}}{d}_{k+1}\le -{‖{g}_{k+1}‖}^{2}+{E}_{3}{‖{g}_{k+1}‖}^{2}-{E}_{4}{‖{g}_{k+1}‖}^{2}\\ =-\left[1-\left({E}_{4}-{E}_{3}\right)\right]{‖{g}_{k+1}‖}^{2}\\ =-E{‖{g}_{k+1}‖}^{2}\end{array}$

with $E=1-\left({E}_{4}-{E}_{3}\right)$ and $0<{E}_{4}-{E}_{3}<1$.

So, it is proved that ${d}_{k+1}$ Satisfied the sufficient descent condition.

4. Converge Analysis

Let Assumption 2.1 and 2.2 hold. In  it is proved that for any conjugate gradient method with strong Wolfe line search conditions, it holds:

4.1. Lemma

Let Assumption 2.1 and 2.2 holds. Consider the method (2) and (5) where the ${d}_{k}$ Is a descent direction and αk is received from the strong wolf line search. If

${\sum }_{k\ge 1}\frac{1}{{‖{d}_{k}‖}^{2}}=\infty$.

Then

${\mathrm{lim}}_{k\to \infty }\mathrm{inf}‖{g}_{k}‖=0$.

4.2. Theorem

Suppose that assumption 2.1 and 2.2 holds. Consider the algorithm HFG were $0\le {\phi }_{k}\le 1$ and ${\alpha }_{k}$ is obtained by the strong Wolfe line search and ${d}_{k+1}$ is the descent direction. Then

${\mathrm{lim}}_{k\to \infty }\mathrm{inf}‖{g}_{k}‖=0$.

Proof. Because the descent condition holds, we have . So using lemma 3.1, it is sufficient to prove that is bounded above. From (10). They proved that in  and . ,

And .

Now for ,

By (4), we have .

Since and ,

with some mathematical calculation, we get    5. Numerical Experiments

In this section we selected some of test functions in Table 3 from CUTE library, along with other large scale optimization problems presented in Andrei   and Bongartz et al. .

All codes are written in double precision FORTRAN Language and compiled Visual F90 (default compiler settings) on a Workstation Intel Pentium 4. The value of is always computed by cubic fitting procedure.

We selected 26 large scale unconstrained optimization problems in the extended

Table 3. It gives the comparison depending in the NOI and NOF between, and the proposed method.

Table 4. The percentage performance of the proposed methods.

Figure 1. The comparison between the three methods.

or generalized form. Each problem was tested three times for a gradually increasing number of variables: N = 1000, 5000 and 10,000, all algorithms implemented the strong Wolfe line search (3) and (4) conditions with and and the same stopping criterion is used.

In some cases, the computation stopped due to the failure of the line search to find the positive step size, and thus it was considered as a failure denoted by (F).

We record the number of iteration calls (NOI), the number of function evaluations calls (NOF), and the dimensions of test problems calls (N), for the purpose of our comparisons.

Table 3 gives the comparison depending in the NOI and NOF between, and the proposed method.

Table 4 gives the percentage performance of the proposed methods against and. We have seen that. Method saves (NOI 0.8%), (NOF 7.6%), and method saves (NOI 28.7%), (NOF 40.0%) compared with method.

While Figure 1 gives the comparison between, and, using a well-known Wood test function.

Conflicts of Interest

The authors declare no conflicts of interest regarding the publication of this paper.

  Liu, A.J. and Li, S. (2014) New Hybrid Conjugate Gradient Method for Unconstrained Optimization. Applied Mathematics and Computation, 245, 36-43.https://www.sciencedirect.com/science/article/abs/pii/S0096300314010637 https://doi.org/10.1016/j.amc.2014.07.096  Salleh, Z. and Alhawarat, A. (2016) An Efficient Modification of the Hes-tenes-Stiefel Nonlinear Conjugate Gradient Method with Restart Property. Journal of Inequalities and Applications, 2016, Article No. 110. https://link.springer.com/article/10.1186/s13660-016-1049-5 https://doi.org/10.1186/s13660-016-1049-5  Wolfe, P. (1969) Convergence Condition for Ascent Methods. SIAM Review, 11, 226-235. https://doi.org/10.1137/1011036  Wolfe, P. (1971) Convergence Conditions for Ascent Methods II: Some Corrections. SIAM Review, 13, 185-188. https://doi.org/10.1137/1013035  Alhawarat, A., Mamat, M., Rivaie, M. and Salleh, Z. (2015) An Efficient Hybrid Conjugate Gradient Method with the Strong Wolfe-Powell Line Search. Mathematical Problems in Engineering, 2015, Article ID: 103517. https://www.hindawi.com/journals/mpe/2015/103517 https://doi.org/10.1155/2015/103517  Hestenes, M.R. and Stiefel, E. (1952) Methods of Conjugate Gradients for Solving Linear Systems. Journal of Research of the National Bureau of Standards, 49, 409-436. https://nvlpubs.nist.gov/nistpubs/jres/049/jresv49n6p409_A1b.pdf https://doi.org/10.6028/jres.049.044  Fletcher, R. and Reeves, C. (1964) Function Minimization by Conjugate Gradients. The Computer Journal, 7, 149-154. https://doi.org/10.1093/comjnl/7.2.149https://academic.oup.com/comjnl/article/7/2/149/335311  Polak, E. and Ribie’re, G. (1969) Note sur la convergence de me’thodes de directions conjugue’s. Revue Francsis d’Infermatique et de recherché Operationnelle, 3, 35-43. http://www.numdam.org/item?id=M2AN_1969__3_1_35_0 https://doi.org/10.1051/m2an/196903R100351  Polyak, B.T. (1969) The Conjugate Gradient Method in Extreme Problems. USSR Computational Mathematics and Mathematical Physics, 9, 94-112.https://www.researchgate.net/publication/222365587 https://doi.org/10.1016/0041-5553(69)90035-4  Fletcher, R. (1987) Practical Methods of Optimization. 2nd Edition, John Wiley & Sons, Inc., Hoboken. https://www.wiley.com/en-ba  Liu, D. and Story, C. (1991) Efficient Generalized Conjugate Gradient Algorithms. Part 1: Theory. Journal of Optimization Theory and Applications, 69, 129-137.https://link.springer.com/article/10.1007/BF00940464 https://doi.org/10.1007/BF00940464  Dai, Y.H. and Yuan, Y. (1999) A Nonlinear Conjugate Gradient Method with a Strong Global Convergence Property. SIAM Journal on Optimization, 10, 177-182.https://doi.org/10.1137/S1052623497318992  Al-Naemi, Gh.M. and Hamed, E.T. (2013) New Conjugate Method with Wolfe Type Line Searches for Nonlinear Programming. Australian Journal of Basic and Applied Sciences, 7, 622-632. https://www.researchgate.net/publication/330686295  Andrei, N. (2010) Acceleration Hybrid Conjugate Gradient Algorithm with Modified Secant Condition for Unconstrained Optimization. Numerical Algorithms, 54, 23-46. https://link.springer.com/article/10.1007/s11075-009-9321-0 https://doi.org/10.1007/s11075-009-9321-0  Andrei, N. (2008) Another Nonlinear Conjugate Gradient Algorithm for Unconstrained Optimization. Optimization Methods and Software, 24, 89-104.https://www.tandfonline.com/doi/abs/10.1080/10556780802393326 https://doi.org/10.1007/s11075-007-9152-9  Yan, H., Chen, L. and Jiao, B. (2009) HS-LS-CD Hybrid Conjugate Gradient Algorithm for Unconstrained Optimization. 2nd International Workshop on Computer Science and Engineering, Vol. 1, Qingdao, 28-30 October 2009, 264-268. https://ieeexplore.ieee.org/document/5403315 https://doi.org/10.1109/WCSE.2009.667  Li, S. and Sun, Z.B. (2010) A New Hybrid Conjugate Gradient Method and Its Global Convergence for Unconstrained Optimization. International Journal of Pure and Applied Mathematics, 63, 84-93. https://ijpam.eu/contents/2010-63-3/4/index.html  Djordjevic, S.S. (2019) New Hybrid Conjugate Gradient Method as a Convex Combination of LS and CD Methods. Filomat, 31, 1813-1825.https://doi.org/10.2298/FIL1706813D  Djordjevic, S.S. (2018) New Hybrid Conjugate Gradient Method as a Convex Combination of HS and FR Conjugate Gradient Methods. Journal of Applied Mathematics and Computation, 2, 366-378. https://doi.org/10.26855/jamc.2018.09.002https://www.researchgate.net/publication/328125868  Djordjevi?, S.S. (2019) New Hybrid Conjugate Gradient Method as a Convex Combination of LS and FR Conjugate Gradient Methods. Acta Mathematica Scientia, 39, 214-228.  Zheng, X.Y., Dong, X.L., Shi, J.R. and Yang, W. (2019) Further Comment on Another Hybrid Conjugate Gradient Algorithm for Unconstrained Optimization by Andrei. Numerical Algorithm, 1-6. https://doi.org/10.1007/s11075-019-00771-1  Abdullahi, I. and Ahmad, R. (2016) Global Convergence Analysis of a Nonlinear Conjugate Gradient Method for Unconstrained Optimization Problems. Indian Journal of Science and Technology, 9, 1-9.http://www.indjst.org/index.php/indjst/article/view/90175/74779 https://doi.org/10.17485/ijst/2016/v9i41/90175  Livieris, I.E., Tampakas, V. and Pintelas, P. (2018) A Descent Hybrid Conjugate Gradient Method Based on the Memoryless BFGS Update. Numerical Algorithms, 79, 1169-1185. https://link.springer.com/article/10.1007/s11075-018-0479-1 https://doi.org/10.1007/s11075-018-0479-1  Mandara, A.V., Mamat, M., Waziri, M.Y., Mohamed, M.A. and Yakubu, U.A. (2018) A New Conjugate Gradient Coefficient with Exact Line Search for Unconstrained Optimization. Far East Journal of Mathematical Sciences, 105, 193-206.https://www.researchgate.net/publication/329522075 https://doi.org/10.17654/MS105020193  Liu, J.K. and Li, S.J. (2014) New Hybrid Conjugate Gradient Method for Unconstrained Optimization. Applied Mathematics and Computation, 245, 36-43.https://doi.org/10.1016/j.amc.2014.07.096https://www.sciencedirect.com/science/article/abs/pii/S0096300314010637  Yunus, R.B., Mamat, M. and Abashar, A. (2018) Comparative Study of Some New Conjugate Gradient Methods. UniSZA Research Conference (URC 2015), Kuala Terengganu, 14-16 April 2015, 616-621. https://www.researchgate.net/publication/326301618  Al-Naemi, Gh.M. and Ahmed, H.I. (2013) Modified Nonlinear Conjugate Gradient Algorithms with Application in Neural Networks. LAP Lambert Academic Publishing, Saarbrucken.  Andrei, N. (2008) An Unconstrained Optimization Test Functions Collection. Advanced Modeling and Optimization, 10, 147-161.http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.665.3152&rep=rep1&type=pdf  Andrei, N. (2014) Test Functions for Unconstrained Global Optimization. 3-5.  Bongartz, I., Conn, A.R., Gould, N. and Toint, P.L. (1995) CUTE: Constrained and Unconstrained Testing Environment. ACM Transactions on Mathematical Software, 21, 123-160. https://doi.org/10.1145/200979.201043

comments powered by Disqus

Copyright © 2020 by authors and Scientific Research Publishing Inc. This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.