An Algorithm for the Derivative-Free Unconstrained Optimization Based on a Moving Random Cone Data Set

Abstract

In this paper, we propose and analyze a new derivative-free iterative method for solving nonlinear problems using a trust-region framework. We also give several examples to illustrate the efficiency of the method, together with a comparison against a similar method. The technique can be used to suggest a wide class of new iterative methods for solving optimization problems. For solving unconstrained optimization problems without derivatives, a derivative-free funnel method for unconstrained nonlinear optimization is proposed. The study presents new interpolation-based techniques. The main work of this paper depends on some matrix-computation techniques: a linear system is solved to obtain the required quadratic model at each iteration, and the interpolating polynomial built on the interpolation points is then minimized in a trust region.


1. Introduction

The main aim of this paper is to study methods for solving optimization problems whose objective function does not have partial derivatives available at hand [1], owing to the difficulty of computing derivatives. Derivative-free optimization (DFO) models attempt to express, in mathematical terms, the goal of solving such problems in the "best" way, by optimizing in lower-dimensional subspaces. These ideas are developed into the new method below. The new method starts with an initial data set of $N = \frac{(n+1)(n+2)}{2}$ points, where $n$ is the dimension of the problem. An initial guess $x_0$ is provided first; then, using random Householder and Givens matrices, the other $(N-1)$ points are generated [2] [3], with $x_0$ thought of as a centre. The other points are equidistant from $x_0$, with distance $\Delta$, a pre-assigned radius. During the iterations there may be a need to regenerate the interpolation data points; this is done when no progress in function decrease has taken place. The interpolation points are then selected to be $D$ units distant from the $k$th iterate $x_k$. The Householder matrix $H_n = I_n - 2uu^T$, $\|u\|_2 = 1$, is used in the generation of the remaining $(N-1)$ points [2] [4]. Given a random vector $z$ whose components are uniformly distributed in $(0,1)$, i.e., $z_i \sim U(0,1)$, $H_n$ is chosen so that $H_n z = \|z\| e_1$. Now if $z$ is a column of $H_n$, a point $y$ is defined by $y = x_k + Dz$ [1]. The resulting algorithm is shown to be numerically highly competitive.
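To make this construction concrete, the following is a minimal NumPy sketch of the point generation. It is our illustration, not the author's code: the function name `householder_points` is an assumption, and the Givens-rotation variant is omitted.

```python
import numpy as np

def householder_points(x0, delta, N, seed=None):
    """Generate the (N - 1) extra interpolation points at distance
    delta from x0 using random Householder reflectors (a sketch)."""
    rng = np.random.default_rng(seed)
    n = x0.size
    pts = []
    while len(pts) < N - 1:
        z = rng.uniform(0.0, 1.0, n)               # z_i ~ U(0, 1)
        v = z - np.linalg.norm(z) * np.eye(n)[0]   # reflector axis, so that H z = ||z|| e_1
        u = v / np.linalg.norm(v)
        H = np.eye(n) - 2.0 * np.outer(u, u)       # Householder matrix; columns are unit vectors
        for col in H.T:                            # each column yields one point on the sphere
            if len(pts) < N - 1:
                pts.append(x0 + delta * col)       # delta-distant from the centre x0
    return np.array(pts)
```

Because the columns of an orthogonal matrix are unit vectors, every generated point lies exactly at distance $\Delta$ from the centre, as required.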

2. Basic Material

Many interpolation-based trust-region methods construct local polynomial interpolation-based models of the objective function and compute steps by minimizing these models inside a region using the standard trust-region methodology [5] [6] [7]. The models are built so as to interpolate previously computed function values at a subset of past iterates or at specially constructed points. For the model to be well defined, the interpolation points must be poised [8] [9] [10]. To provide the reader with the necessary background on the notation used, we recall some basic material about the general trust-region framework, multivariate interpolation, Lagrange polynomials, and the definitions of poisedness and well-poisedness [11] [12].

3. Lagrange Polynomials

If the interpolation set $Y$ is poised, the basis of Lagrange polynomials $\{\ell_i(x)\}_{i=1}^{p}$ exists and is uniquely defined, as follows [11].

3.1. Definition

Given a set of interpolation points $Y = \{y^1, y^2, \ldots, y^p\}$, a basis of $p$ polynomials $\ell_j(x)$, $j = 1, \ldots, p$, in the space $\mathcal{P}_n^d$ of polynomials of degree at most $d$ in $n$ variables is called a basis of Lagrange polynomials if

$$\ell_j(y^i) = \delta_{ij} = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{if } i \neq j. \end{cases} \qquad (1)$$

The unique polynomial m ( x ) which interpolates f ( x ) on Y using this basis of Lagrange polynomials can be expressed as

$$m(x) = \sum_{i=1}^{p} f(y^i)\, \ell_i(x). \qquad (2)$$

Moreover, for every poised set $Y = \{y^1, y^2, \ldots, y^p\}$, we have that

$$\sum_{i=1}^{p} \ell_i(x) = 1 \quad \text{for all } x \in \mathbb{R}^n. \qquad (3)$$

The accuracy of $m(x)$ as an approximation of the objective function $f$ in some region $B \subset \mathbb{R}^n$ can be quantified using the following notion [13] [14]. A poised set $Y = \{y^1, y^2, \ldots, y^p\}$ is said to be $\Lambda$-poised in $B$ for some $\Lambda > 0$ if and only if, for the basis of Lagrange polynomials associated with $Y$ [15],

$$\Lambda \geq \max_{1 \leq i \leq p} \max_{x \in B} |\ell_i(x)|. \qquad (4)$$

The right-hand side of (4) is related to the Lebesgue constant $\Lambda_n$ of the set, which is defined as

$$\Lambda_n = \max_{x \in B} \sum_{i=1}^{n} |\ell_i(x)|, \qquad (5)$$

see for instance [16] [17] [18]. Given the following relations

$$\max_{1 \leq i \leq n} |\ell_i(x)| \leq \sum_{i=1}^{n} |\ell_i(x)| \leq n \max_{1 \leq i \leq n} |\ell_i(x)|, \qquad (6)$$

we conclude that

$$\Lambda \leq \Lambda_n \leq n\Lambda. \qquad (7)$$

The Lebesgue constant is a measure of the accuracy of polynomial interpolation at the given set of points, which suggests looking for a set of interpolation points with a small Lebesgue constant. Conversely, the smaller $\Lambda$, the better the geometry of the set $Y$. Importantly for our purposes, Lagrange polynomial values and $\Lambda$-poisedness can be used to bound the model-function and model-gradient errors; in particular, it is shown in Ciarlet and Raviart [15], [1] that such bounds hold for any $x$ in the convex hull of $Y$.
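Since the coefficients of the Lagrange polynomials are the columns of $M(\phi, Y)^{-1}$, both (4) and (5) can be estimated numerically by sampling. The sketch below is our illustration, assuming a square, nonsingular $M(\phi, Y)$; the function name `lambda_poisedness` and the sample-based maximization are not from the paper.

```python
import numpy as np

def lambda_poisedness(Y, phi, B_samples):
    """Monte-Carlo estimates of the Lambda-poisedness bound (4) and the
    Lebesgue constant (5); assumes len(phi) == len(Y), M nonsingular."""
    M = np.array([[p(y) for p in phi] for y in Y])   # M(phi, Y), rows = points
    A = np.linalg.inv(M)                             # columns = Lagrange coefficients
    L = np.array([[p(x) for p in phi] for x in B_samples]) @ A  # L[s, i] = l_i(x_s)
    return np.abs(L).max(), np.abs(L).sum(axis=1).max()

# Example: linear basis in the plane, sampling the unit box B
phi = [lambda x: 1.0, lambda x: x[0], lambda x: x[1]]
Y = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
samples = np.random.default_rng(0).uniform(0.0, 1.0, (2000, 2))
Lam_hat, Leb_hat = lambda_poisedness(Y, phi, samples)
```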

3.2. Example

Using the natural basis $\bar{\phi} = \{1, x_1, x_2, \tfrac{1}{2}x_1^2, \tfrac{1}{2}x_2^2, x_1 x_2\}$ and a sample set $Y = \{y^1, y^2, y^3, y^4\}$, with $y^1 = (0,0)$, $y^2 = (0,1)$, $y^3 = (1,0)$, $y^4 = (1,1)$, the matrix $M(\bar{\phi}, Y)$ is

$$M(\bar{\phi}, Y) = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 1 & 0 & 1 & 0 & 0.5 & 0 \\ 1 & 1 & 0 & 0.5 & 0 & 0 \\ 1 & 1 & 1 & 0.5 & 0.5 & 1 \end{bmatrix}.$$

Choosing now the first four columns of $M(\bar{\phi}, Y)$, the system is determined but not well defined, since the matrix is singular. We see that the set $Y$ is not poised with respect to the sub-basis $\{1, x_1, x_2, \tfrac{1}{2}x_1^2\}$, but if we select the sub-basis $\{1, x_1, x_2, x_1 x_2\}$, the set $Y$ is well poised: the corresponding matrix, consisting of the first, second, third, and sixth columns of $M(\bar{\phi}, Y)$, is well conditioned, and a unique solution to this determined system exists [11] [13].
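The poisedness claims in this example are easy to check numerically. The short NumPy verification below (our addition) confirms that the first four columns of $M(\bar{\phi}, Y)$ form a singular matrix, while columns 1, 2, 3, and 6 form a well-conditioned one.

```python
import numpy as np

# Sample set and natural basis {1, x1, x2, 0.5*x1^2, 0.5*x2^2, x1*x2}
Y = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
M = np.column_stack([np.ones(4), Y[:, 0], Y[:, 1],
                     0.5 * Y[:, 0]**2, 0.5 * Y[:, 1]**2, Y[:, 0] * Y[:, 1]])

M_bad  = M[:, [0, 1, 2, 3]]   # sub-basis {1, x1, x2, 0.5*x1^2}
M_good = M[:, [0, 1, 2, 5]]   # sub-basis {1, x1, x2, x1*x2}

print(np.linalg.matrix_rank(M_bad))   # 3 -> singular: Y is not poised here
print(np.linalg.cond(M_good))         # finite and modest: Y is well poised
```

The singularity is no surprise: since $x_1 \in \{0, 1\}$ on this set, the column for $\tfrac{1}{2}x_1^2$ is exactly half the column for $x_1$.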

4. Unconstrained DFO Algorithm

We consider the bound-constrained optimization problem

$$\min_{x \in \mathbb{R}^n} f(x), \qquad (8)$$

where $f$ is a nonlinear function from $\mathbb{R}^n$ into $\mathbb{R}$ which is bounded below, and where $l$ and $u$ are vectors of (possibly infinite) lower and upper bounds on $x$. We denote the feasible domain of this problem by $\mathcal{F}$ [6].

At each iteration, a model of the form

$$m_k(x_k, s) = f(x_k) + g_k^T s + \tfrac{1}{2} s^T H_k s \qquad (9)$$

(where $g_k$ and $H_k$ are the function's gradient and Hessian, respectively) is minimized inside a trust region $B(x_k, \Delta_k)$. As derivatives are not given, $g_k$ and $H_k$ are approximated by determining the model's coefficients (here represented by the vector $\alpha$) from the interpolation conditions [14]:

$$m(y^i) = \sum_{j=1}^{p} \alpha_j \phi_j(y^i) = f(y^i), \quad i = 1, \ldots, p. \qquad (10)$$

The points $y^1, \ldots, y^p$ considered in the interpolation conditions (10) form the interpolation set $Y_k$. The set $Y_k$ contains in our case at least $n+1$ points and is chosen as a subset of $X_k$, the set of all points where the value of the objective function $f$ is known. How to choose this interpolation set is of course one of the main issues we have to address below, as not every set $Y_k$ is suitable because of poisedness issues [11] [19]. We propose to handle the bound constraints by an "active-set" approach, where a bound $l_i$ or $u_i$ is called active at $x$ if $l_i = x_i$ or $x_i = u_i$, respectively. The bound constraints $l_i$ and $u_i$ are inactive at $x$ if $l_i < x_i < u_i$. A small sketch of this bookkeeping follows.
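The following minimal helper illustrates the active-set classification just described; the function name `active_bounds` and the tolerance are our assumptions, not part of the paper.

```python
import numpy as np

def active_bounds(x, l, u, tol=1e-12):
    """Classify each bound at x: active lower (x_i = l_i), active upper
    (x_i = u_i); anything with l_i < x_i < u_i is inactive."""
    lower = np.flatnonzero(np.abs(x - l) <= tol)   # indices of active lower bounds
    upper = np.flatnonzero(np.abs(u - x) <= tol)   # indices of active upper bounds
    return lower, upper
```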

4.1. The Ingredients Strategy

A key ingredient in a trust-region algorithm is the strategy for choosing the trust-region radius $\Delta_k$ at each iteration. We base this choice on the agreement between the model function $m_k$ and the objective function $f$ at previous iterations. Given a step $p_k$, we define the ratio

$$\rho_k = \frac{f(x_k) - f(x_k + p_k)}{m_k(0) - m_k(p_k)}. \qquad (11)$$

The numerator is the actual reduction and the denominator is the predicted reduction (that is, the reduction in $f$ predicted by the model function). Model-based trust-region methods built on this ratio have been shown to perform well: the strategy improves the efficiency of the method while maintaining its good theoretical convergence properties; furthermore, the unconstrained method applies a model-based derivative-free approach [1]. The interpolation points are chosen randomly at each iteration, with the generation of the random points as described in Section 1. This randomness proved quite helpful in restoring the method to the right track, and the method proved to be efficient and reliable.

The accuracy of the method is acceptable and depends on the contours of the objective function. Since the step $p_k$ is obtained by minimizing the model $m_k$ over a region that includes $p = 0$, the predicted reduction will always be nonnegative. Hence, if $\rho_k$ is negative, the new objective value $f(x_k + p_k)$ is greater than the current value $f(x_k)$, so the step must be rejected. On the other hand, if $\rho_k$ is close to 1, there is good agreement between the model $m_k$ and the function $f$ over this step, so it is safe to expand the trust region for the next iteration. If $\rho_k$ is positive but significantly smaller than 1, we do not alter the trust region, but if it is close to zero or negative, we shrink the trust region by reducing $\Delta_k$ at the next iteration [1] [20]. A sketch of this update rule is given below.
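The following is a minimal sketch of the radius update just described. The thresholds 1/4 and 3/4 and the shrink/expand factors are conventional choices (see e.g. [1]), not values taken from this paper, and the function name is ours.

```python
import numpy as np

def update_radius(rho, step_norm, delta, delta_max,
                  eta_bad=0.25, eta_good=0.75):
    """Ratio-driven trust-region radius update (a conventional sketch)."""
    if rho < eta_bad:                               # poor agreement: shrink
        return 0.25 * delta
    if rho > eta_good and np.isclose(step_norm, delta):
        return min(2.0 * delta, delta_max)          # good agreement at boundary: expand
    return delta                                    # otherwise leave unchanged
```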

Here $\hat{\Delta}$ (the radius bound) is an overall bound on the step lengths. Note that the radius is increased only if $\|p_k\|$ actually reaches the boundary of the trust region. If the step stays strictly inside the region, we infer that the current value of $\Delta_k$ is not interfering with the progress of the algorithm, so we leave its value unchanged for the next iteration. The shape of the generated points looks like a moving cone. To turn the algorithm into a practical one, we need to focus on solving the trust-region sub-problem

$$\min_{p \in \mathbb{R}^n} m_k(p) = f_k + g_k^T p + \tfrac{1}{2} p^T B_k p, \quad \|p\| \leq \Delta_k. \qquad (12)$$

In discussing this matter, we sometimes drop the iteration subscript $k$ and restate the problem as follows:

$$\min_{p \in \mathbb{R}^n} m(p) = f + g^T p + \tfrac{1}{2} p^T B p, \quad \|p\| \leq \Delta.$$
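One classical way to solve this sub-problem approximately is the dogleg method. The paper does not specify its sub-problem solver, so the sketch below is purely illustrative; it assumes $B$ is positive definite.

```python
import numpy as np

def dogleg_step(g, B, delta):
    """Approximately minimize m(p) = f + g^T p + 0.5 p^T B p subject to
    ||p|| <= delta, following the classical dogleg path (a sketch)."""
    pB = -np.linalg.solve(B, g)                # full (Newton) step
    if np.linalg.norm(pB) <= delta:
        return pB                              # Newton step is inside the region
    pU = -(g @ g) / (g @ B @ g) * g            # unconstrained Cauchy (steepest-descent) step
    if np.linalg.norm(pU) >= delta:
        return -delta * g / np.linalg.norm(g)  # truncate along -g
    d = pB - pU                                # find tau with ||pU + tau d|| = delta
    a, b, c = d @ d, 2.0 * (pU @ d), pU @ pU - delta**2
    tau = (-b + np.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)
    return pU + tau * d
```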

A first step to exact solutions is given by the following:

Assume that at the current iterate $x_k$ we have a set of sample points

$$Y = \{y^1, y^2, \ldots, y^q\},$$

with $y^i \in \mathbb{R}^n$, $i = 1, 2, \ldots, q$. Again assume that $x_k$ is an element of this set and that no point in $Y$ has a lower function value than $x_k$. These points depend on Householder and Givens matrices [2] [21], and their choice was guided by the requirements of a model-based method. We wish to construct a quadratic model of the form [3] [9]

$$m_k(x_k + p) = c + g^T p + \tfrac{1}{2} p^T G p = f(y), \quad y = x_k + p \in Y. \qquad (13)$$

In general $q = \frac{(n+1)(n+2)}{2}$, and the interpolation conditions can be written in matrix form as follows:

$$m(x+p) = c + \begin{bmatrix} g_1 & \cdots & g_n \end{bmatrix} \begin{bmatrix} p_1 \\ \vdots \\ p_n \end{bmatrix} + \frac{1}{2} \begin{bmatrix} p_1 & \cdots & p_n \end{bmatrix} \begin{bmatrix} g_{11} & \cdots & g_{1n} \\ \vdots & \ddots & \vdots \\ g_{n1} & \cdots & g_{nn} \end{bmatrix} \begin{bmatrix} p_1 \\ \vdots \\ p_n \end{bmatrix} = \begin{bmatrix} f(y^1) \\ \vdots \\ f(y^q) \end{bmatrix} \qquad (14)$$

Equivalently, stacking one row per interpolation point,

$$\begin{bmatrix} 1 & p_1 & \cdots & p_n & \tfrac{1}{2}p_1^2 & \cdots & \tfrac{1}{2}p_n^2 & p_1 p_2 & p_1 p_3 & \cdots & p_{n-1}p_n \end{bmatrix} \begin{bmatrix} c \\ g_1 \\ g_2 \\ \vdots \\ g_n \\ G_{11} \\ \vdots \\ G_{1n} \\ \vdots \end{bmatrix} = \begin{bmatrix} f(y^1) \\ \vdots \\ f(y^q) \end{bmatrix}.$$

Algorithm (4.1)

1) Input the interpolation set $Y = [y^1, y^2, \ldots, y^N]$, where $N = \tfrac{1}{2}(n+1)(n+2)$ is the size of $Y$.

2) Solve the linear system (14) for the coefficient vector $h$, then recover $c$, $g$, and $G$:

c = h(1);
g = h(2 : n+1);
for i = 1, 2, ..., n
    G(i, i) = h(n+1+i);
end
m = 2n + 1;
for i = 1, 2, ..., n-1
    ii = m + (i-1)(2n-i)/2;
    for j = i+1, ..., n
        G(i, j) = h(ii + j - i);
        G(j, i) = G(i, j);
    end
end

3) Output $c$, $g$, $G$.
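For reference, the following is a runnable NumPy sketch of Algorithm 4.1. It assumes the rows of system (14) are built from the displacements $p^i = y^i - x_k$ and that the interpolation matrix is nonsingular; the function name `build_quadratic_model` is ours.

```python
import numpy as np

def build_quadratic_model(P, fvals):
    """Build c, g, G from displacements P (rows p^i = y^i - x_k) and the
    function values fvals[i] = f(y^i), by solving system (14)."""
    q, n = P.shape                        # q = (n+1)(n+2)/2 points expected

    def basis_row(p):
        # term order as in the paper: 1, p_1..p_n, 0.5 p_1^2..0.5 p_n^2,
        # then p_i p_j for i < j in row-major order
        quad = [p[i] * p[j] for i in range(n - 1) for j in range(i + 1, n)]
        return np.concatenate(([1.0], p, 0.5 * p**2, quad))

    A = np.array([basis_row(p) for p in P])
    h = np.linalg.solve(A, fvals)         # coefficient vector h
    c, g = h[0], h[1:n + 1]
    G = np.diag(h[n + 1:2 * n + 1])       # diagonal entries G_ii
    k = 2 * n + 1                         # first off-diagonal coefficient
    for i in range(n - 1):
        for j in range(i + 1, n):
            G[i, j] = G[j, i] = h[k]      # symmetric fill-in
            k += 1
    return c, g, G
```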

4.2. Theorem (First Order Necessary Conditions)

If $x^*$ is a local minimizer and $f$ is continuously differentiable in an open neighborhood of $x^*$, then $\nabla f(x^*) = 0$ [1] [22].

4.3. Theorem (Second Order Necessary Conditions)

If $x^*$ is a local minimizer of $f$ and $\nabla^2 f$ exists and is continuous in an open neighborhood of $x^*$, then $\nabla f(x^*) = 0$ and $\nabla^2 f(x^*)$ is positive semidefinite [10] [11] [18].

4.4. Theorem (Second Order Sufficient Conditions)

Suppose that $\nabla^2 f$ is continuous in an open neighborhood of $x^*$, that $\nabla f(x^*) = 0$, and that $\nabla^2 f(x^*)$ is positive definite. Then $x^*$ is a strict local minimizer of $f$ [1].
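In the derivative-free setting these conditions can only be checked approximately. As an illustration (our addition, not part of the paper), central-difference estimates of the gradient and Hessian can be used to test them at a candidate minimizer:

```python
import numpy as np

def numeric_grad_hess(f, x, h=1e-5):
    """Central-difference gradient and Hessian of f at x, for checking the
    first- and second-order conditions when derivatives are unavailable."""
    n = x.size
    g = np.zeros(n)
    H = np.zeros((n, n))
    for i in range(n):
        ei = np.zeros(n); ei[i] = h
        g[i] = (f(x + ei) - f(x - ei)) / (2 * h)
        for j in range(n):
            ej = np.zeros(n); ej[j] = h
            H[i, j] = (f(x + ei + ej) - f(x + ei - ej)
                       - f(x - ei + ej) + f(x - ei - ej)) / (4 * h * h)
    return g, H
```

A candidate $x^*$ passes the check if $\|g\|$ is near zero and the eigenvalues of $H$ are nonnegative (strictly positive for the sufficient conditions of Theorem 4.4).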

4.5. Theorem

Let $f$ be twice Lipschitz continuously differentiable in a neighborhood of a point $x^*$ at which the second-order sufficient conditions of Theorem 4.4 are satisfied. Suppose the sequence $\{x_k\}$ converges to $x^*$ and that, for all $k$ sufficiently large, the trust-region algorithm based on $B_k = \nabla^2 f(x_k)$ chooses steps $p_k$ that satisfy the Cauchy-point-based model reduction criterion and are asymptotically similar to Newton steps $p_k^N$ whenever $\|p_k^N\| \leq \tfrac{1}{2}\Delta_k$, that is [23],

$$\|p_k - p_k^N\| = o(\|p_k^N\|).$$

Then the trust-region bound $\Delta_k$ becomes inactive for all $k$ sufficiently large and the sequence $\{x_k\}$ converges superlinearly to $x^*$ [20] [23].

5. The Selected Test Problems

The following functions, used in testing the method, have been selected from [22] and [24]:

1) $f(x) = 100(x_2 - x_1^2)^2 + (1 - x_1^2)$

2) $f(x) = (x_1 - x_3)^4 + 7(x_2 - 1)^2 + 10(x_3 - 10)^4$
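For completeness, here are the two test functions in Python, with the minus signs reconstructed from the flattened text above (an assumption, since the original typesetting was lost in extraction):

```python
# The two selected test problems, transcribed from Section 5
def f1(x):
    # 100 (x2 - x1^2)^2 + (1 - x1^2)
    return 100.0 * (x[1] - x[0] ** 2) ** 2 + (1.0 - x[0] ** 2)

def f2(x):
    # (x1 - x3)^4 + 7 (x2 - 1)^2 + 10 (x3 - 10)^4
    return (x[0] - x[2]) ** 4 + 7.0 * (x[1] - 1.0) ** 2 + 10.0 * (x[2] - 10.0) ** 4
```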

The Output (Tables 1-12)

Table 1. Summary of the iterative DFO solution of the function $f(x) = 100(x_2 - x_1^2)^2 + (1 - x_1^2)$; the initial point is $x_0 = [1, 1.6, 1]^T$.

Table 2. The value of the function $f(x) = 100(x_2 - x_1^2)^2 + (1 - x_1^2)$ at each iteration in Table 1.

Table 3. Summary of the iterative DFO solution of the function $f(x) = 100(x_2 - x_1^2)^2 + (1 - x_1^2)$ using the last point of the iteration in Table 2, $x_1 = [1.5187; 2.3702; 4.9791]^T$.

Table 4. The value of the function $f(x) = 100(x_2 - x_1^2)^2 + (1 - x_1^2)$ at each iteration in Table 3.

Table 5. Summary of the iterative DFO solution of the function $f(x) = 100(x_2 - x_1^2)^2 + (1 - x_1^2)$ using the last point of the iteration in Table 4, $x_2 = [1.5194; 2.3689; 4.9789]^T$.

Table 6. The value of the function $f(x) = 100(x_2 - x_1^2)^2 + (1 - x_1^2)$ at each iteration in Table 5.

Table 7. Summary of the iterative DFO solution of the function $f(x) = (x_1 - x_3)^4 + 7(x_2 - 1)^2 + 10(x_3 - 10)^4$ using the initial point $x_0 = [0; 0; 0; 0]^T$.

Table 8. The value of the function $f(x) = (x_1 - x_3)^4 + 7(x_2 - 1)^2 + 10(x_3 - 10)^4$ at each iteration in Table 7.

Table 9. Summary of the iterative DFO solution of the function $f(x) = (x_1 - x_3)^4 + 7(x_2 - 1)^2 + 10(x_3 - 10)^4$ using the last point of the iteration in Table 8, $x_1 = [2.1784; 0.5427; 9.4005; 0.3411]^T$.

Table 10. The value of the function $f(x) = (x_1 - x_3)^4 + 7(x_2 - 1)^2 + 10(x_3 - 10)^4$ at each iteration in Table 9.

Table 11. Summary of the iterative DFO solution of the function $f(x) = (x_1 - x_3)^4 + 7(x_2 - 1)^2 + 10(x_3 - 10)^4$ using the last point of the iteration in Table 10, $x_2 = [2.1743; 0.6608; 9.5020; 0.4375]^T$.

Table 12. The value of the function $f(x) = (x_1 - x_3)^4 + 7(x_2 - 1)^2 + 10(x_3 - 10)^4$ at each iteration in Table 11.

6. Conclusion

In this paper it has been shown that using a randomly selected set of data points within an interpolation-based method for derivative-free optimization is adequate and practical. In the generation of these randomly selected points, Householder and Givens matrices were used: the randomness comes from random seed vectors, which are then transformed by the Householder and Givens matrices into the required interpolation points.

Acknowledgements

I would like to thank my supervisor, Dr. Muhsin Hassan Abdallah who was a great help to me.

Conflicts of Interest

The author declares no conflicts of interest regarding the publication of this paper.

References

[1] Nocedal, J. and Wright, S.J. (2006) Numerical Optimization. 2nd Edition, Springer, New York.
https://doi.org/10.1007/b98874
[2] Wilkinson, J.H. (1960) Householder’s Method for the Solution of the Algebraic Eigenproblem. The Computer Journal, 3, 23-27. https://doi.org/10.1093/comjnl/3.1.23
[3] Powell, M.J.D. (2014) LINCOA Software.
http://mat.uc.pt/~zhang/software.html#lincoa
[4] Powell, M.J.D. (1970) A Hybrid Method for Nonlinear Equations. In: Robinowitz, P., Ed., Numerical Methods for Nonlinear Algebraic Equations, Gordon and Breach Science, London, 87-144.
[5] Conn, A.R., Gould, N.I.M. and Toint, P.L. (2000) Trust-Region Methods. MPS-SIAM Series on Optimization. Society for Industrial and Applied Mathematics, Philadelphia, PA. https://doi.org/10.1137/1.9780898719857
[6] Dennis, J.E. and Schnabel, R.B. (1989) A View of Unconstrained Optimization. In: Handbooks in Operations Research and Management Science, Elsevier Science Publishers, Amsterdam, The Netherlands, 1-72. https://doi.org/10.1016/S0927-0507(89)01002-9
[7] Golub, G.H. and von Matt, U. (1991) Quadratically Constrained Least Squares and Quadratic Problems. Numerische Mathematik, 59, 561-580.
https://doi.org/10.1007/BF01385796
[8] Conn, A.R., Scheinberg, K. and Vicente, L.N. (2008) Geometry of Interpolation Sets in Derivative Free Optimization. Mathematical Programming, 111, 141-172.
https://doi.org/10.1007/s10107-006-0073-5
[9] Powell, M.J.D. (1998) Direct Search Algorithms for Optimization Calculations. Acta Numerica, 7, 287-336. https://doi.org/10.1017/S0962492900002841
[10] Fletcher, R. (1987) Practical Methods of Optimization. 2nd Edition, John Wiley and Sons, Chichester.
[11] Gratton, S., Toint, P.L. and Troltzsch, A. (2011) An Active-Set Trust-Region Method for Derivative-Free Nonlinear Bound-Constrained Optimization. Optimization Methods and Software, 26, 873-894.
[12] Powell, M.J.D. (1970) A New Algorithm for Unconstrained Optimization. In: Rosen, J.B., Mangasarian, O.L. and Ritter, K., Eds., Nonlinear Programming, Academic Press, New York, 31-65. https://doi.org/10.1016/B978-0-12-597050-1.50006-3
[13] Gay, D.M. (1981) Computing Optimal Local Constrained Step. SIAM Journal on Scientific and Statistical Computing, 2, 186-197. https://doi.org/10.1137/0902016
[14] Moré, J.J. and Sorenson, D.C. (1983) Computing a Trust Region Step. SIAM Journal on Scientific and Statistical Computing, 4, 553-572.
https://doi.org/10.1137/0904038
[15] Ciarlet, P.G. and Raviart, P.A. (1972) General Lagrange and Hermite Interpolation in $\mathbb{R}^n$ with Applications to Finite Element Methods. Archive for Rational Mechanics and Analysis, 46, 177-199. https://doi.org/10.1007/BF00252458
[16] Erdős, P. (1961) Problems and Results on the Theory of Interpolation. II. Acta Mathematica Academiae Scientiarum Hungarica, 12, 235-244.
https://doi.org/10.1007/BF02066686
[17] Smith, S.J. (2006) Lebesgue Constants in Polynomial Interpolation. Annales Mathematicae et Informaticae, 33, 109-123.
[18] Byrd, R.H., Schnabel, R.B. and Shultz, G.A. (1988) Approximate Solution of the Trust Region Problem by Minimization over Two-Dimensional Subspaces. Mathematical Programming, 40, 247-263. https://doi.org/10.1007/BF01580735
[19] Paviani, D.A. and Himmelblau, D.M. (1969) Constrained Nonlinear Optimization by Heuristic Programming. Operations Research, 17, 872-882.
https://doi.org/10.1287/opre.17.5.872
[20] Conn, A.R., Scheinberg, K. and Vicente, L.N. (2008) Introduction to Derivative-Free Optimization. MPS-SIAM Series on Optimization. SIAM, Philadelphia, PA. https://doi.org/10.1137/1.9780898718768
[21] Yuan, Y.-X. (2015) A Review of Trust Region Algorithms for Optimization. Chinese Academy of Sciences, Beijing.
[22] Powell, M.J.D. (1998) Direct Search Algorithms for Optimization Calculations. Acta Numerica, 7, 287-336. https://doi.org/10.1017/S0962492900002841
[23] Dennis, J.E. and Mei, H.H.W. (1979) Two New Unconstrained Optimization Algorithms Which Use Function and Gradient Values. Journal of Optimization Theory and Applications, 28, 453-482. https://doi.org/10.1007/BF00932218
[24] Omojokun, E.O. (1989) Trust Region Algorithms for Optimization with Nonlinear Equality and Inequality Constraints. Ph.D. Thesis, University of Colorado, Boulder, CO.
