On Global Minimization for the Value Function in Affine Optimal Control Problems

Abstract

In this paper we provide a computational approach to a minimization problem for the value function associated with an affine optimal control problem under a terminal constraint, whose cost is quadratic in the control plus a potential term, for a fixed final time and initial point. We study the global minimization problem of the value function over the attainable set and the regularity properties of the value function at a global minimizer. We also illustrate this computational approach to the global minimization of the value function by an example based on a numerical method using Riccati matrix differential equations.


1. Introduction and Preliminary

The regularity properties of the value function associated with an optimal control problem have been studied in depth over the last decades, extensively using tools from geometric control theory and nonsmooth analysis. It is well known that the value function associated with an optimal control problem may fail to be everywhere differentiable; in general, it is not even continuous.

In [1] the authors define a value function on the target state set associated with an affine optimal control problem and study its regularity properties. Under certain assumptions, they identify subsets of the attainable set on which the value function is continuous and differentiable. In this paper, from the viewpoint of optimization methods, we first solve a global minimization problem for the value function defined by the authors of [1], and then we show that the value function is continuous and differentiable at a global minimizer.

Indeed, the value function defined in [1] takes its value via an optimal control subject to a terminal constraint. In this paper we solve the global minimization problem via an optimal control problem without terminal constraints.

The optimal control problem we deal with in this paper is the following:

$$(P_y):\qquad \min\ C_y(u) := \int_0^T \left[ u^{\mathrm T}(t)\,u(t) - Q(x(t)) \right] dt, \qquad (1.1)$$

such that

$$\dot x(t) = f(x(t)) + g(x(t))\,u(t), \quad x(0) = a\in\mathbb{R}^n, \quad x(T) = y\in\mathbb{R}^n, \quad u(t)\in\mathbb{R}^m,\ x(t)\in\mathbb{R}^n,\ t\in[0,T].$$

Here the function $Q(x):\mathbb{R}^n\to\mathbb{R}$ is smooth and bounded from above, and the vector field $f(x):\mathbb{R}^n\to\mathbb{R}^n$ and the matrix function $g(x):\mathbb{R}^n\to\mathbb{R}^{n\times m}$ are smooth on $\mathbb{R}^n$; $a\in\mathbb{R}^n$ is a given initial condition and $y\in\mathbb{R}^n$ is a given terminal condition. An admissible control $u(\cdot)$ is a vector function in $L^2([0,T],\mathbb{R}^m)$ such that the solution $x_u(\cdot)$ of the equation $\dot x(t) = f(x(t)) + g(x(t))u(t)$ with $x_u(0) = a\in\mathbb{R}^n$ is well defined on the interval $[0,T]$. The set of admissible controls is denoted by $\Omega_a$. For the initial state $a\in\mathbb{R}^n$, we define the value function $S_a(\cdot):\mathbb{R}^n\to\mathbb{R}\cup\{+\infty\}$ associated with $(P_y)$ as a function of the terminal state: for $y\in\mathbb{R}^n$,

$$S_a(y) = \inf\{\, C_y(u) \mid u\in L^2([0,T],\mathbb{R}^m),\ x_u(0) = a,\ x_u(T) = y \,\}, \qquad (1.2)$$

with the understanding that $S_a(y) = +\infty$ if $y$ cannot be attained by admissible trajectories in time $T$. It is also clear that $S_a(y)$ is bounded below, since $C_y(u)$ is bounded below by the assumption that $Q(x)$ is bounded from above. Define the attainable set $A_a$ as the set of points of $\mathbb{R}^n$ that can be reached from $a$ by admissible trajectories within the time interval $[0,T]$. We always assume $A_a \neq \emptyset$. In [1] the authors have studied the regularity properties of the value function $S_a(y)$ on a dense subset of the attainable set.

In this paper we provide a computational approach to the following minimization problem:

$$\min_{y\in A_a} S_a(y). \qquad (1.3)$$

Meanwhile, we study the regularity properties of the value function $S_a(y)$ at a global minimizer.

Remark 1.1. In practice, when $Q(x) \equiv 0$ the cost functional is the square of the $L^2$ norm of the control, so by minimizing the value function one finds an optimal target at which the system can operate at minimal cost.

To minimize $S_a(y)$ over $A_a$, we need to solve the following optimal control problem without terminal state constraint:

$$(B):\qquad \min\ J_a(u) := \int_0^T \left[ u^{\mathrm T}(t)\,u(t) - Q(x(t)) \right] dt, \qquad (1.4)$$

such that

$$\dot x(t) = f(x(t)) + g(x(t))\,u(t), \quad x(0) = a\in\mathbb{R}^n, \quad u(t)\in\mathbb{R}^m,\ x(t)\in\mathbb{R}^n,\ t\in[0,T].$$

Remark 1.2. We see that, for $y\in A_a$ and any admissible control $u$ steering the affine system from $a$ to $y$,

$$C_y(u) = J_a(u). \qquad (1.5)$$

Remark 1.3. Suppose that $(B)$ is solvable (i.e. there exists an optimal control of the problem $(B)$), and let $\hat u$ be an optimal control of $(B)$; then $J_a(\hat u) < +\infty$. Let the optimal control $\hat u$ steer the primal affine system from $a$ to a point $\hat y$. By (1.2) and (1.5), we have

$$S_a(\hat y) = \inf\{\, C_{\hat y}(u) : u\in L^2([0,T],\mathbb{R}^m),\ x_u(0)=a,\ x_u(T)=\hat y \,\} = \inf\{\, J_a(u) : u\in L^2([0,T],\mathbb{R}^m),\ x_u(0)=a,\ x_u(T)=\hat y \,\} = J_a(\hat u) < +\infty. \qquad (1.6)$$

Since $S_a(y)$ is bounded below and, by (1.6), $S_a(\hat y)$ is finite, the function $S_a(y)$, $y\in A_a$, is not identically $+\infty$. Thus, if $(B)$ is solvable, then by (1.6) the value $\inf_{y\in A_a} S_a(y)$ is finite, and the minimization (1.3) is meaningful. Moreover, we will show that if $(B)$ is solvable, then the minimization problem $\min_{y\in A_a} S_a(y)$ is solvable.

In this paper we focus on the Hamilton-Jacobi-Bellman (HJB) equation [2] [3] associated with the problem $(B)$. We present parameterized convection-diffusion equations as a viscosity approximation [4] [5] [6] of the Hamilton-Jacobi-Bellman equation. Each parameterized convection-diffusion equation then yields a piecewise differentiable flow approximating the optimal value of the problem $(B)$.

The rest of the paper is organized as follows. In Section 2, we study the global minimization problem of the value function $S_a(y)$ over the attainable set. In Section 3, two results are given on the continuity and differentiability of the value function $S_a(y)$ at a global minimizer. Section 4 is devoted to presenting a computational approach to the minimization problem of the value function $S_a(y)$. In Section 5, two examples illustrate this computational approach for a linear-quadratic optimal control problem under terminal constraint. In Section 6, we derive an iteration of difference equations for implementing the computational approach to $\min_{y\in A_a} S_a(y)$ given in Section 4. The last section concludes the paper.

2. Minimizing the Value Function $S_a(y)$ over the Attainable Set

For the problem $(P)$, to minimize the value function $S_a(y)$ over the attainable set $A_a$, we consider the optimal control problem $(B)$ in (1.4). For the problem $(B)$, we define its value function as follows:

$$V(x) = \inf\{\, J_x(u) \mid u\in L^2([0,T],\mathbb{R}^m),\ x_u(0) = x \,\}, \qquad (2.1)$$

where $J_x(u)$ denotes the cost in (1.4) for the trajectory with initial condition $x(0) = x$.

Theorem 2.1. If $(B)$ is solvable (i.e. there exists an optimal control of the problem $(B)$), then

$$V(a) = \min_{y\in A_a} S_a(y). \qquad (2.2)$$

Proof. Let $\hat u$ be an optimal control of the problem $(B)$, which steers the control affine system in (1.4) from $a$ to $\hat y$. By the definition (1.2) of $S_a(\hat y)$, and noting that by (1.5) every control $u$ steering the affine control system from $a$ to $\hat y$ satisfies $C_{\hat y}(u) = J_a(u) \ge J_a(\hat u) = C_{\hat y}(\hat u)$, we have

$$V(a) = J_a(\hat u) = C_{\hat y}(\hat u) = \inf\{\, C_{\hat y}(u) \mid u\in L^2([0,T],\mathbb{R}^m),\ x_u(0)=a,\ x_u(T)=\hat y \,\} = S_a(\hat y). \qquad (2.3)$$

Thus we have

$$V(a) = S_a(\hat y). \qquad (2.4)$$

On the other hand, if there were a $y\in A_a$ with $S_a(y) < S_a(\hat y)$, then, noting (1.5) and (1.2), there would be a control $u$ steering the affine system from $a$ to $y$ such that $J_a(u) = C_y(u) < S_a(y) + (S_a(\hat y) - S_a(y)) = S_a(\hat y) = V(a)$, contradicting the fact that $\hat u$ is an optimal control of the problem $(B)$. Thus $S_a(\hat y) = \min_{y\in A_a} S_a(y)$, and it follows from (2.4) that $V(a) = \min_{y\in A_a} S_a(y)$. The proof of Theorem 2.1 is completed.

Theorem 2.2. If a vector $\hat y\in\mathbb{R}^n$ satisfies $S_a(\hat y) = \min_{y\in A_a} S_a(y)$ and an admissible control $\hat u$ satisfies $C_{\hat y}(\hat u) = S_a(\hat y)$, then $(B)$ is solvable.

Proof. Since the minimization problem $\min_{y\in A_a} S_a(y)$ is solvable, we have a point $\hat y\in A_a$ such that

$$S_a(\hat y) = \min_{y\in A_a} S_a(y). \qquad (2.5)$$

We need to show that an admissible control $\hat u$ steering the system from $a$ to $\hat y$ with $C_{\hat y}(\hat u) = S_a(\hat y)$ is an optimal control of $(B)$. Let $\bar u$ be an arbitrary admissible control, steering the system from $a$ to some $\bar y$. We will show that $J_a(\hat u) \le J_a(\bar u)$. By (1.5), (2.5) and the definition of $S_a(\bar y)$, also noting the assumption $C_{\hat y}(\hat u) = S_a(\hat y)$, we have

$$J_a(\hat u) = C_{\hat y}(\hat u) = S_a(\hat y) = \min_{y\in A_a} S_a(y) \le S_a(\bar y) \le C_{\bar y}(\bar u) = J_a(\bar u). \qquad (2.6)$$

Since $\bar u$ is an arbitrary admissible control, by (2.6) we see that $\hat u$ is an optimal control of $(B)$. The proof of Theorem 2.2 is completed.

Remark 2.1. A basic fact behind the proofs of Theorem 2.1 and Theorem 2.2 is that both problems share the same control system and assign the same cost to the same admissible control. By these two theorems, the optimal control of the problem $(B)$ steers the control system from the initial state to a minimizer of the value function $S_a(y)$ of the problem $(P)$ over the attainable set. Conversely, if $\hat y$ is a minimizer of $S_a(y)$ over $A_a$ and a control $\hat u$ steering the system from the initial point $a$ to $\hat y$ is an optimal control of $(P_{\hat y})$, then $\hat u$ is an optimal control of $(B)$.

3. The Regularity Properties of the Value Function $S_a(y)$ at a Minimizer Point

In this section we assume that $(B)$ is solvable. Let $\hat u$ be an optimal control of the problem $(B)$, which steers the primal affine system from the initial point $a$ to $\hat y$. By (2.4), $\hat y$ is a minimizer of $S_a(y)$ over $A_a$.

We know that the end-point map $E_a^T : \Omega_a \to \mathbb{R}^n$, $u \mapsto x_u(T)$, is smooth [1] [7] [8] [9] [10]. In the following theorems we further assume that the end-point map $E_a^T$ is an open map at the optimal control $\hat u$.

Theorem 3.1. If the minimizer point $\hat y \in \operatorname{int} A_a$ and the end-point map is an open map at $\hat u$, then the value function $S_a(y)$ of $(P)$ is continuous at the minimizer $\hat y$ of $S_a(y)$.

Proof. Since $\hat u$ is an optimal control of the problem $(B)$, which steers the primal affine system from $a$ to $\hat y$, we have $V(a) = S_a(\hat y)$ by (2.4). Since the end-point map is open at $\hat u$, there are positive numbers $\eta$ and $\delta$ such that the image of $\{u : \|u-\hat u\|_{L^2} < \eta\}$ under the end-point map covers $\{y : \|y-\hat y\|_{\mathbb{R}^n} < \delta\}$. Let $\varepsilon > 0$ be given. We can also choose the positive number $\eta$ so small that, whenever $\|u-\hat u\|_{L^2} < \eta$, the inequality $|J_a(u) - J_a(\hat u)| < \varepsilon$ holds (see Proposition 32 in [3]). Since $\hat y \in \operatorname{int} A_a$, we can choose the positive number $\delta$ above so small that, for all $\Delta y \in \{\xi\in\mathbb{R}^n : \|\xi\|_{\mathbb{R}^n} \le \delta\}$, we have $\hat y + \Delta y \in A_a$ and $V(a) = S_a(\hat y) \le S_a(\hat y + \Delta y)$, noting that $\hat y$ is the minimizer of $S_a(y)$. In other words, for the given $\varepsilon > 0$, we have two positive numbers $\eta$ and $\delta$ such that:

1) the image of $\{u : \|u-\hat u\|_{L^2} < \eta\}$ under the end-point map covers $\{y : \|y-\hat y\|_{\mathbb{R}^n} < \delta\}$;

2) when $\|u-\hat u\|_{L^2} < \eta$, the inequality $|J_a(u) - J_a(\hat u)| < \varepsilon$ holds;

3) when $\|\Delta y\|_{\mathbb{R}^n} \le \delta$, we have $\hat y + \Delta y \in A_a$ and $V(a) = S_a(\hat y) \le S_a(\hat y + \Delta y)$.

Now, for $\|\Delta y\|_{\mathbb{R}^n} \le \delta$, there is an admissible control $v$ in $\{u : \|u-\hat u\|_{L^2} < \eta\}$ steering the primal affine system from $a$ to $\hat y + \Delta y \in A_a$. By the definition of $S_a(y)$, also noting the relationship $C_y(u) = J_a(u)$ and $J_a(\hat u) = V(a) = S_a(\hat y)$, we have the following inequalities:

$$V(a) = S_a(\hat y) \le S_a(\hat y + \Delta y) = \inf_u C_{\hat y+\Delta y}(u) \le C_{\hat y+\Delta y}(v) = J_a(v) < J_a(\hat u) + \varepsilon = S_a(\hat y) + \varepsilon. \qquad (3.1)$$

Consequently,

$$0 \le S_a(\hat y + \Delta y) - S_a(\hat y) \le J_a(v) - J_a(\hat u) < \varepsilon. \qquad (3.2)$$

The proof of Theorem 3.1 is completed.

Next we study the differentiability of the value function $S_a(y)$ at a minimizer point.

Theorem 3.2. If the minimizer point $\hat y \in \operatorname{int} A_a$ and the end-point map is a one-to-one open map at $\hat u$, then the value function $S_a(y)$ of $(P)$ is differentiable at the minimizer $\hat y$ of $S_a(y)$.

Proof. Since all the functions $Q(x)$, $f(x)$, $g(x)$ are smooth, the functional $J_a(u)$ is Fréchet differentiable at every $u\in L^2([0,T],\mathbb{R}^m)$. Noting that the end-point map is an open map at $\hat u$ and that $J_a(u) \ge J_a(\hat u)$ for all admissible controls near $\hat u$ in $L^2([0,T],\mathbb{R}^m)$ (so the Fréchet derivative of $J_a$ vanishes at $\hat u$), we have

$$\lim_{\|\Delta u\|_{L^2}\to 0}\ \frac{J_a(\hat u + \Delta u) - J_a(\hat u)}{\|\Delta u\|_{L^2}} = 0, \qquad (3.3)$$

which also implies that, for a small positive $r$, there exists $C_r > 0$ such that, for $0 < \|\Delta u\|_{L^2} \le r$,

$$J_a(\hat u + \Delta u) - J_a(\hat u) \le C_r \|\Delta u\|_{L^2}. \qquad (3.4)$$

Since the end-point map $E_a^T$ is an open map at $\hat u$, there are positive numbers $\eta$ and $\delta$ such that

$$E_a^T : \{u : \|u-\hat u\|_{L^2} < \eta\} \to \{y : \|y-\hat y\|_{\mathbb{R}^n} < \delta\}$$

is surjective. There exists a smooth right inverse $\Phi : \{y : \|y-\hat y\|_{\mathbb{R}^n} < \delta\} \to \{u : \|u-\hat u\|_{L^2} < \eta\}$ such that $E_a^T(\Phi(y)) = y$ for every $y$ in $\{y : \|y-\hat y\|_{\mathbb{R}^n} < \delta\}$. Let $B_{\hat y}(r) := \{y : \|y-\hat y\|_{\mathbb{R}^n} < r\}$ and $B_{\hat u}(r) := \{u : \|u-\hat u\|_{L^2} < r\}$ denote the balls of radius $r > 0$ centered at $\hat y$ and $\hat u$, respectively. Since $\Phi$ is smooth, there exist positive numbers $R_\Phi < \eta$ and $C_\Phi > 0$ such that, for every $0 \le r \le R_\Phi$,

$$B_{\hat y}(C_\Phi r) \subset E_a^T(B_{\hat u}(r)). \qquad (3.5)$$

Pick any point $y \in \operatorname{int} A_a$ with $\Delta y := y - \hat y$ and $\|\Delta y\|_{\mathbb{R}^n} = C_\Phi r$, where $0 \le r \le R_\Phi$. Then by (3.5) there exists $v \in B_{\hat u}(r)$, $v = \hat u + \Delta u$ with $\|\Delta u\|_{L^2} \le r$, such that $E_a^T(v) = y$. Noting that

$$\|\Delta y\|_{\mathbb{R}^n} = C_\Phi r \ge C_\Phi \|\Delta u\|_{L^2}, \qquad (3.6)$$

and that $J_a(\hat u) = V(a) = S_a(\hat y)$, by (3.1), (3.2), (3.4), (3.6) we have

$$0 \le S_a(\hat y + \Delta y) - S_a(\hat y) \le J_a(\hat u + \Delta u) - J_a(\hat u) \le C_r \|\Delta u\|_{L^2} \le \frac{C_r}{C_\Phi}\|\Delta y\|_{\mathbb{R}^n}$$

and

$$0 \le \frac{S_a(\hat y + \Delta y) - S_a(\hat y)}{\|\Delta y\|_{\mathbb{R}^n}} \le \frac{J_a(\hat u + \Delta u) - J_a(\hat u)}{\|\Delta y\|_{\mathbb{R}^n}} = \frac{J_a(\hat u + \Delta u) - J_a(\hat u)}{\|\Delta u\|_{L^2}} \cdot \frac{\|\Delta u\|_{L^2}}{\|\Delta y\|_{\mathbb{R}^n}}. \qquad (3.7)$$

By (3.5) we see that when $\|\Delta y\|_{\mathbb{R}^n} \to 0$, i.e. $C_\Phi r \to 0$ (with $C_\Phi > 0$ fixed above), we have $\|\Delta u\|_{L^2} \le r \to 0$. Thus, by (3.7), (3.6), (3.3), we have

$$0 \le \frac{S_a(\hat y + \Delta y) - S_a(\hat y)}{\|\Delta y\|_{\mathbb{R}^n}} \le \frac{J_a(\hat u + \Delta u) - J_a(\hat u)}{\|\Delta u\|_{L^2}} \cdot \frac{\|\Delta u\|_{L^2}}{\|\Delta y\|_{\mathbb{R}^n}} \le \frac{J_a(\hat u + \Delta u) - J_a(\hat u)}{\|\Delta u\|_{L^2}}\, C_\Phi^{-1} \to 0.$$

Consequently, the value function $S_a(y)$ of $(P)$ is differentiable at the minimizer $\hat y$ of $S_a(y)$, and the corresponding derivative is zero. The proof of Theorem 3.2 is completed.

Remark 3.1. For both theorems above we assume $\hat y \in \operatorname{int} A_a$, since a candidate minimizer of the value function should lie in the interior of the attainable set. Since $A_a$ is the image of $\Omega_a$ under the end-point map, it is reasonable to assume that the end-point map is an open map at the optimal control $\hat u$. In [1] the authors assume the end-point map to be open and a submersion at an optimal control and consider the regularity properties of the value function on a subset of $\operatorname{int} A_a$. In this paper we only need to assume that $\hat y$ is an image of the end-point map, which is open at the optimal control $\hat u$; we do not assume the end-point map to be a submersion.

4. An Extremal Flow for Minimizing the Value Function $S_a(y)$ of Affine Optimal Control Problems under Terminal Constraint

In this section, for the problem $(P)$, to minimize the value function $S_a(y)$ over the attainable set, we construct a so-called extremal flow for computing the optimal value $V(a)$ of the problem $(B)$ numerically. We focus on the following HJB equation, for $(t,x)\in\mathbb{R}\times\mathbb{R}^n$:

$$v_t(t,x) + v_x^{\mathrm T}(t,x)\, f(x) - Q(x) + \inf_{u\in\mathbb{R}^m}\left\{ v_x^{\mathrm T}(t,x)\, g(x)\, u + u^{\mathrm T} u \right\} = 0, \qquad (4.1)$$

with the boundary condition $v(T,x) = 0$.

By elementary optimization, for given $(t,x)$, we see that $u = -\frac12 g^{\mathrm T}(x)\, v_x(t,x)$ is the unique minimizer of $v_x^{\mathrm T}(t,x)\, g(x)\, u + u^{\mathrm T} u$ over $\mathbb{R}^m$. Then we have

$$\inf_{u\in\mathbb{R}^m}\left\{ v_x^{\mathrm T}(t,x)\, g(x)\, u + u^{\mathrm T} u \right\} = -\frac14 v_x^{\mathrm T}(t,x)\, g(x)\, g^{\mathrm T}(x)\, v_x(t,x). \qquad (4.2)$$
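For completeness, (4.2) follows by completing the square in $u$:

$$v_x^{\mathrm T}(t,x)\, g(x)\, u + u^{\mathrm T} u = \left\| u + \tfrac12 g^{\mathrm T}(x)\, v_x(t,x) \right\|^2 - \tfrac14 v_x^{\mathrm T}(t,x)\, g(x)\, g^{\mathrm T}(x)\, v_x(t,x),$$

so the infimum over $u\in\mathbb{R}^m$ is attained exactly at $u = -\tfrac12 g^{\mathrm T}(x)\, v_x(t,x)$ and equals the right-hand side of (4.2).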

Thus we can rewrite the equation in (4.1) as the following PDE:

$$v_t(t,x) + v_x^{\mathrm T}(t,x)\, f(x) - Q(x) - \frac14 v_x^{\mathrm T}(t,x)\, g(x)\, g^{\mathrm T}(x)\, v_x(t,x) = 0, \qquad v(T,x) = 0. \qquad (4.3)$$

Remark 4.1. By classical PDE theory [4] [5] [6], a viscosity solution of the PDE in (4.3) can be obtained from smooth solutions $v^\varepsilon$ of the family of convection-diffusion equations

$$\varepsilon\,\Delta_x v(t,x) = v_t(t,x) + v_x^{\mathrm T}(t,x)\, f(x) - Q(x) - \frac14 v_x^{\mathrm T}(t,x)\, g(x)\, g^{\mathrm T}(x)\, v_x(t,x), \qquad v(T,x) = 0 \qquad (4.4)$$

(parameterized by $\varepsilon > 0$) in the limit as $\varepsilon \to 0^+$, where $\Delta_x v(t,x) = \sum_{k=1}^n \frac{\partial^2 v(t,x)}{\partial x_k^2}$. The convergence of (4.4) to (4.3) as $\varepsilon \to 0^+$ has been established in the classical PDE literature on viscosity approximation (see [2]). In particular, in Equation (4.4) the diffusion term $\varepsilon\,\Delta_x v(t,x)$ converges to zero locally uniformly as $\varepsilon \to 0^+$.

For computing $V(a)$ numerically, we define extremal flows as follows.

Definition 4.1. Given $\varepsilon > 0$ and a solution $v(t,x)$ of the PDE problem (4.4), we call $x_\varepsilon(\cdot)$ an extremal flow if it is a solution of the Cauchy initial value problem

$$\begin{cases} \dot x_\varepsilon(t) = f(x_\varepsilon(t)) - \frac12 g(x_\varepsilon(t))\, g^{\mathrm T}(x_\varepsilon(t))\, v_x(t, x_\varepsilon(t)), & t\in[0,T],\\ x_\varepsilon(0) = a, \end{cases} \qquad (4.5)$$

with the feedback control

$$u_\varepsilon(t) := -\frac12 g^{\mathrm T}(x_\varepsilon(t))\, v_x(t, x_\varepsilon(t)). \qquad (4.6)$$
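As an illustration of Definition 4.1, the following is a minimal numerical sketch of computing an extremal flow and its cost, assuming a numerical approximation of the gradient $v_x$ of a solution of (4.4) is available as a callable; the names `f`, `g`, `Q`, `v_x` and the solver tolerance are illustrative assumptions, not prescriptions of this paper.

```python
# Minimal sketch: integrate the extremal flow (4.5) with the feedback (4.6)
# and evaluate the cost J_a(u_eps) of (1.4) by the trapezoidal rule.
# f(x), g(x), Q(x) are the problem data; v_x(t, x) is an assumed numerical
# approximation of the gradient of a solution of (4.4).
import numpy as np
from scipy.integrate import solve_ivp

def extremal_flow_cost(f, g, Q, v_x, a, T, num=200):
    ts = np.linspace(0.0, T, num)

    def rhs(t, x):
        u = -0.5 * g(x).T @ v_x(t, x)      # feedback (4.6)
        return f(x) + g(x) @ u             # dynamics (4.5)

    xs = solve_ivp(rhs, (0.0, T), a, t_eval=ts, rtol=1e-8).y.T
    us = np.array([-0.5 * g(x).T @ v_x(t, x) for t, x in zip(ts, xs)])
    # running cost u^T u - Q(x), cf. (1.4)
    integrand = np.einsum('im,im->i', us, us) - np.array([Q(x) for x in xs])
    return np.trapz(integrand, ts)         # approximates J_a(u_eps)
```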

The following theorem states that the optimal value $V(a)$ of the problem $(B)$ can be approximated by solving Equations (4.4) and (4.5).

Theorem 4.1. For $\varepsilon > 0$, let $v(t,x)$ denote a solution of the PDE in (4.4). If $x_\varepsilon(\cdot)$ is an extremal flow related to $v(t,x)$ and $u_\varepsilon(\cdot)$ is the corresponding feedback control (see (4.5), (4.6)), then we have

$$\lim_{\varepsilon\to 0^+} J_a(u_\varepsilon) = V(a) = \min_{y\in A_a} S_a(y). \qquad (4.7)$$

Proof. By (4.4), (4.5), (4.6), we have

$$\begin{aligned} \varepsilon\,\Delta_x v(t, x_\varepsilon(t)) &= v_t(t, x_\varepsilon(t)) + v_x^{\mathrm T}(t, x_\varepsilon(t))\, f(x_\varepsilon(t)) - Q(x_\varepsilon(t)) - \tfrac14 v_x^{\mathrm T}(t, x_\varepsilon(t))\, g(x_\varepsilon(t))\, g^{\mathrm T}(x_\varepsilon(t))\, v_x(t, x_\varepsilon(t)) \\ &= v_t(t, x_\varepsilon(t)) + v_x^{\mathrm T}(t, x_\varepsilon(t))\, f(x_\varepsilon(t)) - Q(x_\varepsilon(t)) - \tfrac12 v_x^{\mathrm T}\, g\, g^{\mathrm T}\, v_x(t, x_\varepsilon(t)) + \tfrac14 v_x^{\mathrm T}\, g\, g^{\mathrm T}\, v_x(t, x_\varepsilon(t)) \\ &= \frac{d\, v(t, x_\varepsilon(t))}{dt} - Q(x_\varepsilon(t)) + u_\varepsilon^{\mathrm T}(t)\, u_\varepsilon(t), \qquad \text{a.e. } t\in[0,T]. \end{aligned} \qquad (4.8)$$

Integrating the equality in (4.8) with respect to $t$ from $0$ to $T$, and noting that $v(T, x_\varepsilon(T)) = 0$, we have

$$\int_0^T \varepsilon\,\Delta_x v(t, x_\varepsilon(t))\,dt = J_a(u_\varepsilon) - v(0, a). \qquad (4.9)$$

On the other hand, if $x(\cdot)$ is another trajectory, corresponding to an admissible control $u(\cdot)$ for $(B)$, we have

$$\inf_{w\in\mathbb{R}^m}\left\{ v_x^{\mathrm T}(t,x(t))\, g(x(t))\, w + w^{\mathrm T} w \right\} \le v_x^{\mathrm T}(t,x(t))\, g(x(t))\, u(t) + u^{\mathrm T}(t)\, u(t); \qquad (4.10)$$

then, for each $t\in[0,T]$, we have

$$\begin{aligned} \varepsilon\,\Delta_x v(t,x(t)) &= v_t(t,x(t)) + v_x^{\mathrm T}(t,x(t))\, f(x(t)) - Q(x(t)) - \tfrac14 v_x^{\mathrm T}(t,x(t))\, g(x(t))\, g^{\mathrm T}(x(t))\, v_x(t,x(t)) \\ &= v_t(t,x(t)) + v_x^{\mathrm T}(t,x(t))\, f(x(t)) - Q(x(t)) + \inf_{w\in\mathbb{R}^m}\left\{ v_x^{\mathrm T}(t,x(t))\, g(x(t))\, w + w^{\mathrm T} w \right\} \\ &\le v_t(t,x(t)) + v_x^{\mathrm T}(t,x(t))\, f(x(t)) - Q(x(t)) + v_x^{\mathrm T}(t,x(t))\, g(x(t))\, u(t) + u^{\mathrm T}(t)\, u(t) \\ &= v_t(t,x(t)) + v_x^{\mathrm T}(t,x(t))\, \frac{dx(t)}{dt} - Q(x(t)) + u^{\mathrm T}(t)\, u(t). \end{aligned} \qquad (4.11)$$

Integrating the above inequality over $[0,T]$, and noting $v(T, x(T)) = 0$, we obtain

$$\int_0^T \varepsilon\,\Delta_x v(t,x(t))\,dt \le \int_0^T \frac{d}{dt} v(t,x(t))\,dt + J_a(u) = -v(0,a) + J_a(u). \qquad (4.12)$$

By (4.9) and (4.12), we have

$$J_a(u_\varepsilon) = \int_0^T \varepsilon\,\Delta_x v(t, x_\varepsilon(t))\,dt + v(0,a) \le J_a(u) + \int_0^T \varepsilon\,\Delta_x v(t, x_\varepsilon(t))\,dt - \int_0^T \varepsilon\,\Delta_x v(t, x(t))\,dt. \qquad (4.13)$$

Let $\hat u(\cdot)$ be an optimal control and $\hat x(\cdot)$ the corresponding optimal trajectory of $(B)$. Using (4.13) for the optimal pair $(\hat x(\cdot), \hat u(\cdot))$, and noting that $V(a) = J_a(\hat u)$, we have

$$J_a(u_\varepsilon) = \int_0^T \varepsilon\,\Delta_x v(t, x_\varepsilon(t))\,dt + v(0,a) \le V(a) + \int_0^T \varepsilon\,\Delta_x v(t, x_\varepsilon(t))\,dt - \int_0^T \varepsilon\,\Delta_x v(t, \hat x(t))\,dt. \qquad (4.14)$$

By (4.14), noting that $u_\varepsilon(\cdot)$ is an admissible feedback control, we have

$$V(a) \le J_a(u_\varepsilon) \le V(a) + \int_0^T \left[ \varepsilon\,\Delta_x v(t, x_\varepsilon(t)) - \varepsilon\,\Delta_x v(t, \hat x(t)) \right] dt,$$

which yields

$$0 \le J_a(u_\varepsilon) - V(a) \le \int_0^T \left[ \varepsilon\,\Delta_x v(t, x_\varepsilon(t)) - \varepsilon\,\Delta_x v(t, \hat x(t)) \right] dt. \qquad (4.15)$$

Noting that, in Equation (4.4), the diffusion term $\varepsilon\,\Delta_x v(t,x)$ converges to zero locally uniformly as $\varepsilon\to0^+$ (see Remark 4.1), we can show that, on a compact set $\Omega$ containing the optimal trajectory $\{\hat x(\cdot)\}$ and the flows $\{x_\varepsilon(\cdot)\}$,

$$\lim_{\varepsilon\to0^+}\ \sup_{(t,x)\in[0,T]\times\Omega} \left| \varepsilon\,\Delta_x v(t,x) \right| = 0. \qquad (4.16)$$

Thus, by the Lebesgue dominated convergence theorem, we have

$$\lim_{\varepsilon\to0^+} \int_0^T \varepsilon\,\Delta_x v(t, x_\varepsilon(t))\,dt = 0 \qquad (4.17)$$

and

$$\lim_{\varepsilon\to0^+} \int_0^T \varepsilon\,\Delta_x v(t, \hat x(t))\,dt = 0. \qquad (4.18)$$

Thus, by (4.15)–(4.18), we have, as $\varepsilon\to0^+$,

$$0 \le J_a(u_\varepsilon) - V(a) \le \int_0^T \left[ \varepsilon\,\Delta_x v(t, x_\varepsilon(t)) - \varepsilon\,\Delta_x v(t, \hat x(t)) \right] dt \to 0. \qquad (4.19)$$

It follows from Theorem 2.1 that

$$\lim_{\varepsilon\to0^+} J_a(u_\varepsilon) = V(a) = \min_{y\in A_a} S_a(y).$$

The theorem has been proved.

If, in the proof of Theorem 4.1, we replace $\Delta_x v(t, x_\varepsilon(t))$ by zero, the extremal flow no longer depends on $\varepsilon$, and the same argument proves the following result.

Theorem 4.2. If $v(t,x)$ satisfies the PDE

$$v_t(t,x) + v_x^{\mathrm T}(t,x)\, f(x) - Q(x) - \frac14 v_x^{\mathrm T}(t,x)\, g(x)\, g^{\mathrm T}(x)\, v_x(t,x) = 0, \qquad v(T,x) = 0, \qquad (4.20)$$

and $\hat x(\cdot)$ is the solution of the Cauchy initial value problem

$$\begin{cases} \dot x(t) = f(x(t)) - \frac12 g(x(t))\, g^{\mathrm T}(x(t))\, v_x(t,x(t)), & t\in[0,T],\\ x(0) = a, \end{cases} \qquad (4.21)$$

with the feedback control

$$\hat u(t) = -\frac12 g^{\mathrm T}(\hat x(t))\, v_x(t,\hat x(t)), \qquad (4.22)$$

then we have

$$J_a(\hat u) = V(a) = \min_{y\in A_a} S_a(y). \qquad (4.23)$$

5. Examples of Linear-Quadratic Optimal Control Problems under Terminal State Constraint Illustrating Theorem 4.2

Example 5.1. We consider the following linear-quadratic optimal control problem with terminal state constraint:

$$(P_L):\quad \begin{cases} \inf\ C_y(u) := \displaystyle\int_0^T \left[ u^{\mathrm T}(t)\,u(t) + x^{\mathrm T}(t)\,x(t) \right] dt,\\ \dot x(t) = A x(t) + B u(t), \quad x(0) = a\in\mathbb{R}^n, \quad x(T) = y\in\mathbb{R}^n,\\ u(t)\in\mathbb{R}^m,\ x(t)\in\mathbb{R}^n,\ t\in[0,T], \end{cases} \qquad (5.1)$$

and the corresponding linear-quadratic optimal control problem without terminal constraint:

$$(B_L):\quad \begin{cases} \inf\ J_a(u) := \displaystyle\int_0^T \left[ u^{\mathrm T}(t)\,u(t) + x^{\mathrm T}(t)\,x(t) \right] dt,\\ \dot x(t) = A x(t) + B u(t), \quad x(0) = a\in\mathbb{R}^n,\\ u(t)\in\mathbb{R}^m,\ x(t)\in\mathbb{R}^n,\ t\in[0,T], \end{cases} \qquad (5.2)$$

where, in (5.1) and (5.2), $A\in\mathbb{R}^{n\times n}$ and $B\in\mathbb{R}^{n\times m}$.

By classical LQ optimal control theory [3], there exists an absolutely continuous symmetric matrix function $S(t)$, defined for $t\in[0,T]$, which satisfies the matrix Riccati differential equation on $[0,T]$:

$$\dot S + S A + A^{\mathrm T} S + I - S B B^{\mathrm T} S = 0, \qquad S(T) = 0. \qquad (5.3)$$

Moreover, the LQ optimal control problem $(B_L)$ is solvable.

To apply Theorem 4.2, we observe that the function $v(t,x) = x^{\mathrm T} S(t)\, x$ satisfies the following HJB equation:

$$v_t(t,x) + v_x^{\mathrm T}(t,x)\, A x + x^{\mathrm T} x - \frac14 v_x^{\mathrm T}(t,x)\, B B^{\mathrm T}\, v_x(t,x) = 0, \qquad v(T,x) = 0. \qquad (5.4)$$

For $v(t,x) = x^{\mathrm T} S(t)\, x$, we have

$$v_x(t,x) = 2 S(t)\, x.$$

With the feedback

$$u = -\frac12 B^{\mathrm T} v_x(t,x) = -B^{\mathrm T} S(t)\, x,$$

to find an extremal flow $\hat x(\cdot)$, we solve the Cauchy initial value problem

$$\begin{cases} \dot x = A x + B u = \left(A - B B^{\mathrm T} S(t)\right) x, & t\in[0,T],\\ x(0) = a. \end{cases} \qquad (5.5)$$

Let $\Phi(t,0)$ be the solution of the matrix differential equation

$$\dot X(t) = \left(A - B B^{\mathrm T} S(t)\right) X(t), \qquad X(0) = I.$$

By classical ordinary differential equation theory, $\Phi(t,0)$ is the fundamental solution associated with $A - B B^{\mathrm T} S(\cdot)$, and the solution of (5.5) is given by

$$\hat x(t) = \Phi(t,0)\, a.$$

Then we have the feedback control

$$\hat u(t) := -B^{\mathrm T} S(t)\, \hat x(t). \qquad (5.6)$$

By Theorem 4.2 we have

$$\min_{y\in A_a} S_a(y) = J_a(\hat u) = \int_0^T \left[ \hat u^{\mathrm T}(t)\,\hat u(t) + \hat x^{\mathrm T}(t)\,\hat x(t) \right] dt. \qquad (5.7)$$
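For concreteness, the following minimal sketch carries out this construction numerically: it integrates the Riccati equation (5.3) backward in time, then the closed-loop flow (5.5) forward, and evaluates (5.7) by the trapezoidal rule. The data `A`, `B`, `a`, `T` (a double integrator) and the grid size are illustrative assumptions, not taken from the paper.

```python
# Minimal numerical sketch of Example 5.1.
import numpy as np
from scipy.integrate import solve_ivp

A = np.array([[0.0, 1.0], [0.0, 0.0]])   # illustrative double integrator
B = np.array([[0.0], [1.0]])
a = np.array([1.0, 0.0])
T, N = 1.0, 400
ts = np.linspace(0.0, T, N)
n = A.shape[0]

# Riccati (5.3): dS/dt = -(S A + A^T S + I - S B B^T S), S(T) = 0, backward.
def riccati_rhs(t, s):
    S = s.reshape(n, n)
    dS = -(S @ A + A.T @ S + np.eye(n) - S @ B @ B.T @ S)
    return dS.ravel()

sol_S = solve_ivp(riccati_rhs, (T, 0.0), np.zeros(n * n), dense_output=True)
S_of = lambda t: sol_S.sol(t).reshape(n, n)

# Closed-loop flow (5.5): x' = (A - B B^T S(t)) x, x(0) = a.
flow_rhs = lambda t, x: (A - B @ B.T @ S_of(t)) @ x
xs = solve_ivp(flow_rhs, (0.0, T), a, t_eval=ts).y.T
us = np.array([-(B.T @ S_of(t) @ x) for t, x in zip(ts, xs)])

cost = np.trapz([u @ u + x @ x for u, x in zip(us, xs)], ts)
print("min_y S_a(y) ~", cost)            # approximates (5.7)
```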

Remark 5.1. In the following example we provide an approximation approach to computing $\min_{y\in A_a} S_a(y)$.

Example 5.2. We consider the following linear-quadratic optimal control problem with terminal state constraint:

$$\begin{cases} \inf\ C_y(u) := \displaystyle\int_0^1 \left[ u^2(t) + x^2(t) \right] dt,\\ \dot x(t) = u(t), \quad x(0) = a\in\mathbb{R}, \quad x(1) = y\in\mathbb{R},\\ u(t)\in\mathbb{R},\ x(t)\in\mathbb{R},\ t\in[0,1], \end{cases} \qquad (5.8)$$

and the corresponding linear-quadratic optimal control problem without terminal constraint:

$$\begin{cases} \inf\ J_a(u) := \displaystyle\int_0^1 \left[ u^2(t) + x^2(t) \right] dt,\\ \dot x(t) = u(t), \quad x(0) = a\in\mathbb{R},\\ u(t)\in\mathbb{R},\ x(t)\in\mathbb{R},\ t\in[0,1]. \end{cases} \qquad (5.9)$$

As with the PDE in (5.4), the HJB equation for this example is

$$v_t(t,x) + x^2 - \frac14 v_x^2(t,x) = 0, \qquad v(1,x) = 0, \quad t\in[0,1],\ x\in\mathbb{R}. \qquad (5.10)$$

As in Example 5.1, we have

$$v(t,x) = S(t)\, x^2, \qquad v_x(t,x) = 2 S(t)\, x, \qquad (5.11)$$

where $S(t)$ satisfies the scalar Riccati differential equation

$$\dot S + 1 - S^2 = 0, \qquad S(1) = 0, \quad t\in[0,1],\ S(t)\in\mathbb{R}. \qquad (5.12)$$

(In this scalar case one checks directly that $S(t) = \tanh(1-t)$ solves (5.12).)

We solve the Cauchy initial value problem

$$\begin{cases} \dot x = -S(t)\, x, & t\in[0,1],\\ x(0) = a, \end{cases} \qquad (5.13)$$

to find an extremal flow $\hat x(\cdot)$ and the feedback control

$$\hat u = -\frac12 v_x(t,\hat x) = -S(t)\,\hat x.$$

By Theorem 4.2, for this example we have

$$\min_{y\in A_a} S_a(y) = J_a(\hat u) = \int_0^1 \left( S^2(t) + 1 \right) \hat x^2(t)\,dt. \qquad (5.14)$$
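Using the closed form $S(t) = \tanh(1-t)$ noted after (5.12), the value (5.14) can be computed explicitly; the following short computation is added here as a consistency check. Solving (5.13) gives

$$\hat x(t) = a\,\frac{\cosh(1-t)}{\cosh 1},$$

so

$$J_a(\hat u) = \frac{a^2}{\cosh^2 1} \int_0^1 \left[ \tanh^2(1-t) + 1 \right] \cosh^2(1-t)\,dt = \frac{a^2}{\cosh^2 1} \int_0^1 \cosh\bigl(2(1-t)\bigr)\,dt = \frac{a^2 \sinh 2}{2\cosh^2 1} = a^2 \tanh 1,$$

which agrees with $V(a) = v(0,a) = S(0)\,a^2 = a^2 \tanh 1$.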

For a numerical approach to computing $\min_{y\in A_a} S_a(y)$, we now present a sequence of flows converging to the extremal flow, together with the corresponding feedback controls, which yields an approximation of $\min_{y\in A_a} S_a(y)$.

By the iteration method given in [11], we have a sequence of differentiable functions $\{S_i(\cdot),\ i=1,2,\ldots\}$ satisfying

$$\dot S_i + 1 - \left(S_{i-1}\right)^2 = 0, \qquad S_i(1) = 0, \quad t\in[0,1], \qquad (5.15)$$

$$S_0(t) = 1 - t, \quad t\in[0,1], \qquad (5.16)$$

such that $\{S_i(\cdot)\}$ converges uniformly to the solution $S(t)$ of Equation (5.12). Then we have a sequence $\{x_i(\cdot),\ i=1,2,\ldots\}$ such that, for $i=1,2,\ldots$,

$$\dot x_i(t) = u_i(t) := -S_i(t)\, x_i(t), \quad t\in[0,1], \qquad x_i(0) = a. \qquad (5.17)$$

Noting that $S(t)$ is bounded and that $\{S_i(\cdot)\}$ converges uniformly to $S(t)$, we see that $\{S_i(\cdot)\}$ is uniformly bounded. Therefore, by the Bellman–Gronwall inequality, $\{x_i(\cdot)\}$ is uniformly bounded. Further, we show that $\{x_i(\cdot)\}$ converges uniformly to the solution $\hat x(\cdot)$ of Equation (5.13), as follows. For $t\in[0,1]$ and $i=1,2,\ldots$,

$$x_i(t) - \hat x(t) = -\int_0^t \left[ S_i(s)\, x_i(s) - S(s)\,\hat x(s) \right] ds = -\int_0^t \left[ S_i(s)\bigl(x_i(s) - \hat x(s)\bigr) + \bigl(S_i(s) - S(s)\bigr)\hat x(s) \right] ds.$$

Then, for $t\in[0,1]$ and $i=1,2,\ldots$, standard integral estimates give

$$\left| x_i(t) - \hat x(t) \right| \le \int_0^1 \left| S_i(s) - S(s) \right| \left| \hat x(s) \right| ds + \int_0^t \left| S_i(s) \right| \left| x_i(s) - \hat x(s) \right| ds.$$

By the Bellman–Gronwall inequality, we have

$$\left| x_i(t) - \hat x(t) \right| \le \left( \int_0^1 \left| S_i(s) - S(s) \right| \left| \hat x(s) \right| ds \right) \exp\left( \int_0^t \left| S_i(s) \right| ds \right). \qquad (5.18)$$

Noting that $\{x_i(\cdot)\}$ and $\{S_i(\cdot)\}$, $i=0,1,2,\ldots$, are uniformly bounded on $[0,1]$, that $\hat x(\cdot)$ is bounded on $[0,1]$, and that $S_i(\cdot) - S(\cdot)$ converges uniformly to zero, by (5.18) we conclude that $x_i(t)$ converges uniformly to $\hat x(t)$ on $[0,1]$. Meanwhile, with $u_i(t) = -S_i(t)\, x_i(t)$, the controls $u_i(\cdot)$ converge uniformly to the feedback control $\hat u(t) = -S(t)\,\hat x(t)$ on $[0,1]$. Noting that

$$J_a(u_i) = \int_0^1 \left( S_i^2(t) + 1 \right) x_i^2(t)\,dt,$$

we have

$$\lim_{i\to\infty} J_a(u_i) = J_a(\hat u) = \min_{y\in A_a} S_a(y). \qquad (5.19)$$
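A minimal sketch of this iteration on a uniform grid follows; the grid size, the number of iterations, and the explicit Euler marches are illustrative choices rather than the exact scheme of [11].

```python
# Minimal sketch of the iteration (5.15)-(5.17) for Example 5.2: S_i solves
# S_i' = (S_{i-1})^2 - 1 backward from S_i(1) = 0, the state x_i follows
# x_i' = -S_i x_i from x_i(0) = a, and J_a(u_i) is evaluated by the
# trapezoidal rule on the integrand (S_i^2 + 1) x_i^2.
import numpy as np

a, N, iters = 1.0, 1000, 8
ts = np.linspace(0.0, 1.0, N + 1)
dt = 1.0 / N
S = 1.0 - ts                       # S_0(t) = 1 - t, see (5.16)

for _ in range(iters):
    rhs = S**2 - 1.0               # S_i' = (S_{i-1})^2 - 1, see (5.15)
    S_new = np.zeros_like(S)       # terminal condition S_i(1) = 0
    for k in range(N, 0, -1):      # Euler march backward from t = 1
        S_new[k - 1] = S_new[k] - dt * rhs[k]
    S = S_new

x = np.empty_like(ts)              # forward Euler for x' = -S(t) x, (5.17)
x[0] = a
for k in range(N):
    x[k + 1] = x[k] - dt * S[k] * x[k]

J = np.trapz((S**2 + 1.0) * x**2, ts)
print("J_a(u_i) ~", J)             # compare with the exact S(t) = tanh(1-t)
```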

6. A Numerical Approach to Computing $\min_{y\in A_a} S_a(y)$ for the General Affine Optimal Control Problem $(P)$

In this section, we present an iteration of difference equations to implement the approximation of $\min_{y\in A_a} S_a(y)$ given by Theorem 4.1 for the affine optimal control problem $(P)$.

Given $\varepsilon > 0$, let $x_\varepsilon(\cdot)$ satisfy

$$\begin{cases} \dot x_\varepsilon(t) = f(x_\varepsilon(t)) - \frac12 g(x_\varepsilon(t))\, g^{\mathrm T}(x_\varepsilon(t))\, v_x(t, x_\varepsilon(t)), & t\in[0,T],\\ x_\varepsilon(0) = a, \end{cases} \qquad (6.1)$$

with the feedback control

$$u_\varepsilon(t) := -\frac12 g^{\mathrm T}(x_\varepsilon(t))\, v_x(t, x_\varepsilon(t)). \qquad (6.2)$$

By the result in Theorem 4.1, we need to compute $J_a(u_\varepsilon)$. Consider the function

$$H_\varepsilon(t) := \frac14 v_x^{\mathrm T}(t, x_\varepsilon(t))\, g(x_\varepsilon(t))\, g^{\mathrm T}(x_\varepsilon(t))\, v_x(t, x_\varepsilon(t)) - Q(x_\varepsilon(t)). \qquad (6.3)$$

Noting the expression of the cost functional of $(B)$ in (1.4), we will estimate

$$J_a(u_\varepsilon) = \int_0^T H_\varepsilon(t)\,dt. \qquad (6.4)$$

Let $L = \left[\frac{1}{\varepsilon^2}\right] + 1$. Divide the time interval $[0,T]$ evenly into $L$ subintervals $[t_i, t_{i+1}]$, $i=0,1,\ldots,L-1$, with $t_0 = 0$, $t_L = T$, and let $\tau := \frac{T}{L} = t_{i+1} - t_i$, $i=0,1,\ldots,L-1$. Define

$$F_L(H_\varepsilon) = \tau\left[ \frac12 H_\varepsilon(0) + \sum_{i=1}^{L-1} H_\varepsilon(i\tau) + \frac12 H_\varepsilon(T) \right]. \qquad (6.5)$$

By classical numerical analysis (the trapezoidal rule) [12], we have

$$\lim_{L\to\infty} F_L(H_\varepsilon) = J_a(u_\varepsilon). \qquad (6.6)$$

By (6.3), to compute $F_L(H_\varepsilon)$ we need to estimate $x_\varepsilon(t_i)$, $i=0,1,\ldots,L-1$, by the following difference equation:

$$\begin{cases} x_\varepsilon(t_{i+1}) - x_\varepsilon(t_i) = \tau\left[ f(x_\varepsilon(t_i)) - \frac12 g(x_\varepsilon(t_i))\, g^{\mathrm T}(x_\varepsilon(t_i))\, v_x(t_i, x_\varepsilon(t_i)) \right], & i=0,1,\ldots,L-1,\\ x_\varepsilon(t_0) = a, \end{cases} \qquad (6.7)$$

which involves the values $v_x(t_i, x_\varepsilon(t_i))$, $i=0,1,\ldots,L-1$; a sketch of this state march is given below. Therefore, for a given $(t,x)\in[0,T]\times\mathbb{R}^2$, we next need to estimate $v_x(t,x)$, and we present an iteration of difference equations to compute $v_x(t,x)$ numerically.
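The following minimal sketch combines the state march (6.7) with the trapezoidal sum (6.5), assuming a routine `v_x(t, x)` approximating the gradient (for instance, produced by the algorithm presented below) is available; all names are illustrative assumptions.

```python
# Minimal sketch of the Euler march (6.7) and the trapezoidal sum (6.5);
# f, g, Q are the problem data and v_x(t, x) is an assumed gradient estimate.
import numpy as np

def approx_cost(f, g, Q, v_x, a, T, eps):
    L = int(1.0 / eps**2) + 1
    tau = T / L
    x = np.asarray(a, dtype=float)
    H = []                           # values H_eps(t_i), see (6.3)
    for i in range(L + 1):
        t = i * tau
        vx = v_x(t, x)
        H.append(0.25 * vx @ g(x) @ g(x).T @ vx - Q(x))
        if i < L:                    # Euler step (6.7)
            x = x + tau * (f(x) - 0.5 * g(x) @ g(x).T @ vx)
    H = np.array(H)
    return tau * (0.5 * H[0] + H[1:-1].sum() + 0.5 * H[-1])   # F_L, (6.5)
```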

On $[0,T]\times\mathbb{R}^2$ we define

$$h(t,x) := -\frac12 g^{\mathrm T}(x)\, v_x(t,x). \qquad (6.8)$$

The equation in (4.4) can then be rewritten as

$$\varepsilon\,\Delta_x v(t,x) = v_t(t,x) + v_x^{\mathrm T}(t,x)\, f(x) + v_x^{\mathrm T}(t,x)\, g(x)\, h(t,x) - Q(x) + h^{\mathrm T}(t,x)\, h(t,x), \qquad (6.9)$$

with the boundary condition $v(T,x) = 0$.

For simplicity, we restrict our discussion to states in $\mathbb{R}^2$, as follows. For a given $(t,x)\in[0,T]\times\mathbb{R}^2$, along the characteristic direction $f(x) + g(x)\,h(t,x)$ we consider the line $\bar x(s) = x + [f(x) + g(x)\,h(t,x)]\,s$, $s\ge0$, $\bar x(0) = x$, such that

$$\varepsilon\,\Delta_x v(t,x) = \lim_{s\to0^+}\left[ v_t(t+s,\bar x(s)) + v_x^{\mathrm T}(t+s,\bar x(s))\,\frac{d\bar x(s)}{ds} - Q(\bar x(s)) + h^{\mathrm T}(t+s,\bar x(s))\, h(t+s,\bar x(s)) \right] = \left.\frac{d\, v(t+s,\bar x(s))}{ds}\right|_{s=0} - Q(x) + h^{\mathrm T}(t,x)\, h(t,x), \qquad (6.10)$$

noting that $\frac{d\bar x(s)}{ds} \equiv f(x) + g(x)\,h(t,x)$. Then, for $0 < \Delta t \ll 1$, at $(t,x)$ we have the following difference approximation:

$$\varepsilon\,\Delta_x v(t+\Delta t, x) \approx \frac{v(t+\Delta t, \bar x(\Delta t)) - v(t,x)}{\Delta t} - Q(x) + h^{\mathrm T}(t,x)\, h(t,x), \qquad (6.11)$$

$$\bar x(\Delta t) = x + \left[ f(x) + g(x)\, h(t,x) \right]\Delta t. \qquad (6.12)$$

Write the state as $x = (x_1, x_2)\in\mathbb{R}^2$ and denote $\Omega := [x_1-1, x_1+1]\times[x_2-1, x_2+1]$. We focus on difference equations on $[t,T]\times\Omega$. For positive integers $N, L$, let $i,j\in\{-N,\ldots,-1,0,1,\ldots,N\}$ and $k\in\{0,1,\ldots,L\}$. For $\zeta = 2/N$ and $\tau = 1/L$, denote $x_{i,j} = (x_1 + i\zeta,\ x_2 + j\zeta)$ and $s_k = T - k\tau(T-t)$, so that $s_0 = T$ and $s_L = t$. Let $w_{i,j}^k$ denote the approximate grid value of the solution $v(s_k, x_{i,j})$, and set

δ w i , j k : = ( 2 ζ ) 1 ( w i + 1, j k w i 1, j k , w i , j + 1 k w i , j 1 k ) v x ( s k , x i , j ) , (6.13)

δ 2 w i , j k : = ζ 2 ( w i + 1, j k + w i 1, j k 4 w i , j k + w i , j + 1 k + w i , j 1 k ) Δ x v ( s k , x i , j ) . (6.14)

We use $w^k(x)$, $k\in\{0,1,\ldots,L\}$, to denote the piecewise bilinear interpolant of the grid values $w_{i,j}^k$ together with the boundary values $P(x_1\pm1\pm\alpha,\ x_2+j\zeta)$, $P(x_1+i\zeta,\ x_2\pm1\pm\alpha)$, $P(x_1-1-\alpha,\ x_2\pm1\pm\alpha)$ and $P(x_1+1+\alpha,\ x_2\pm1\pm\alpha)$, for all $i,j$.

Using the difference iteration (6.11)–(6.14) and the interpolants $w^k(x)$ (piecewise bilinear interpolants of $w_{i,j}^k$), for a given $(t,x)\in[0,T]\times\mathbb{R}^2$ we can write the following algorithm for approximating $v_x(t,x)$.

Algorithm 4.1.

1) Set $w_{i,j}^0 = P(x_{i,j})$, $i,j\in\{-N,\ldots,-1,0,1,\ldots,N\}$;

2) Compute

$$u_{i,j}^0 = -\tfrac12 g^{\mathrm T}(x_{i,j})\, \delta w_{i,j}^0,$$

$$\bar x_{i,j}^0 = x_{i,j} + \left[ f(x_{i,j}) + g(x_{i,j})\, u_{i,j}^0 \right]\tau,$$

$$\bar w_{i,j}^0 = w^0(\bar x_{i,j}^0);$$

3) For $k = 1,2,\ldots,L$, compute $w_{i,j}^k$ from

$$\varepsilon\,\delta^2 w_{i,j}^{k-1} = \frac{\bar w_{i,j}^{k-1} - w_{i,j}^k}{\tau} - Q(x_{i,j}) + \left(u_{i,j}^{k-1}\right)^{\mathrm T} u_{i,j}^{k-1}, \qquad (6.15)$$

and then

$$u_{i,j}^k = -\tfrac12 g^{\mathrm T}(x_{i,j})\, \delta w_{i,j}^k, \qquad (6.16)$$

$$\bar x_{i,j}^k = x_{i,j} + \left[ f(x_{i,j}) + g(x_{i,j})\, u_{i,j}^k \right]\tau, \qquad (6.17)$$

$$\bar w_{i,j}^k = w^k(\bar x_{i,j}^k); \qquad (6.18)$$

4) Compute

$$\delta w_{0,0}^L = (2\zeta)^{-1}\left( w_{1,0}^L - w_{-1,0}^L,\ \, w_{0,1}^L - w_{0,-1}^L \right) \approx v_x(s_L, x_{0,0}) = v_x(t,x). \qquad (6.19)$$

Remark 6.1. The above discretization scheme is essentially an Euler method along the characteristic direction, which makes it stable in $t$ [12]. The system matrix associated with the third step of the algorithm is symmetric and positive definite provided $\varepsilon$ is sufficiently small compared with $\zeta^2/\tau$. One can take advantage of this symmetry and positive definiteness by using an efficient iterative solver.

Remark 6.2. In Algorithm 4.1, computing $w_{i,j}^k$ ($k=1,2,\ldots,L$) from (6.15) requires only the explicit linear update

$$w_{i,j}^k = \bar w_{i,j}^{k-1} - \tau\left[ \varepsilon\,\delta^2 w_{i,j}^{k-1} + Q(x_{i,j}) - \left(u_{i,j}^{k-1}\right)^{\mathrm T} u_{i,j}^{k-1} \right], \qquad (6.20)$$

noting that $w_{i,j}^{k-1}$, $\bar x_{i,j}^{k-1}$ and $\bar w_{i,j}^{k-1}$ have been obtained in the previous steps.
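As an illustration, one backward step of the update (6.20) on a two-dimensional grid might be organized as follows; the array layout and the names are assumptions made for this sketch.

```python
# Minimal sketch of one step (6.20) of Algorithm 4.1: `w` holds the grid
# values w^{k-1}, `w_bar` the interpolated values \bar w^{k-1}, `u` the
# feedbacks u^{k-1} (shape (M, M, m)), and `Qx` the values Q(x_{i,j});
# zeta and tau are the space and time steps.
import numpy as np

def step_6_20(w, w_bar, u, Qx, eps, zeta, tau):
    # five-point Laplacian (6.14) on interior nodes
    lap = np.zeros_like(w)
    lap[1:-1, 1:-1] = (w[2:, 1:-1] + w[:-2, 1:-1] - 4.0 * w[1:-1, 1:-1]
                       + w[1:-1, 2:] + w[1:-1, :-2]) / zeta**2
    uu = np.einsum('ijm,ijm->ij', u, u)          # (u^{k-1})^T u^{k-1}
    return w_bar - tau * (eps * lap + Qx - uu)   # explicit update (6.20)
```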

7. Conclusion

It is well known that, in general, the value function of an optimal control problem is non-smooth, and the regularity properties of a non-smooth function are hard to study. In this paper, we study the regularity properties of the value function of an affine optimal control problem by solving the global minimization problem for the value function over the attainable set. We also provide a computational approach to this global minimization via a convection-diffusion equation. In future work, we may consider global optimization of non-smooth functions with the help of optimal control methods.

Conflicts of Interest

The author declares no conflicts of interest.

References

[1] Barilari, D. and Boarotto, F. (2018) On the Set of Points of Smoothness for the Value Function of Affine Optimal Control Problems. SIAM Journal on Control and Optimization, 56, 649-671. https://doi.org/10.1137/17M1123948
[2] Pontryagin, L.S. (1964) The Mathematical Theory of Optimal Processes. Pergamon Press, Oxford.
[3] Sontag, E.D. (1998) Mathematical Control Theory: Deterministic Finite Dimensional Systems. 2nd Edition, Springer, New York.
[4] Bardi, M. and Capuzzo-Dolcetta, I. (1997) Optimal Control and Viscosity Solutions of Hamilton-Jacobi-Bellman Equations. Birkhauser, Boston. https://doi.org/10.1007/978-0-8176-4755-1
[5] Crandall, M.G. and Lions, P.L. (1983) Viscosity Solution of Hamilton-Jacobi Equations. Transactions of the American Mathematical Society, 277, 1-42. https://doi.org/10.1090/S0002-9947-1983-0690039-8
[6] Fleming, W.H. (1969) The Cauchy Problem for a Nonlinear First Order Partial Differential Equation. Journal of Differential Equations, 5, 515-530. https://doi.org/10.1016/0022-0396(69)90091-6
[7] Agrachev, A.A., Barilari, D. and Boscain, U. (2012) Introduction to Riemannian and Sub-Riemannian Geometry. https://people.sissa.it/~agrachev/agrachev_files/ABB-final-SRnotes.pdf
[8] Rifford, L. (2014) Sub-Riemannian Geometry and Optimal Transport. Springer, Cham. https://doi.org/10.1007/978-3-319-04804-8
[9] Trelat, E. (2000) Some Properties of the Value Function and Its Level Sets for Affine Control Systems with Quadratic Cost. Journal of Dynamical and Control Systems, 6, 511-541.
[10] Sussmann, H.J. and Jurdjevic, V. (1972) Controllability of Nonlinear Systems. Journal of Differential Equations, 12, 95-116. https://doi.org/10.1016/0022-0396(72)90007-1
[11] Zhu, J.H. (2005) On Stochastic Riccati Equations for the Stochastic LQR Problem. System and Control Letters, 54, 119-124. https://doi.org/10.1016/j.sysconle.2004.07.003
[12] Burden, R.L. and Faires, J.D. (1989) Numerical Analysis. 4th Edition, Prindle, Weber and Schmidt, Boston.
