The Solution Classical Feedback Optimal Control Problem for m-Persons Differential Game with Imperfect Information
1. Mathematical Challenge: Creating a Game Theory That Scales
What new scalable mathematics is needed to replace the traditional Partial Differential Equations (PDE) approach to differential games?
Consider a probability space. Any stochastic process on this space is a measurable mapping. Many stochastic optimal control problems essentially come down to constructing a function that has the properties 1), 2),
where the quantities involved are the termination payoff functional, a control, and a Markov process governed by a stochastic Ito’s equation driven by a Brownian motion, Equation (3). Traditionally this function has been computed by solving the associated Bellman equation, for which various numerical techniques, mostly variations of the finite difference scheme, have been developed. Another approach, which takes advantage of recent developments in computing technology, constructs the function by backward induction governed by Bellman’s principle, as described in [1]. In paper [1], Equation (3) is approximated by an equation with affine coefficients that admits an explicit solution in terms of integrals of the exponential Brownian motion. In the approach proposed in papers [2,3], Equation (3) is replaced by Colombeau-Ito’s Equation (4),
in which the driving terms are the white noise and the smoothed white noise,
the latter obtained from a model delta net [2,4]. Fortunately, in contrast with Equation (3), Equation (4) can be solved without any approximation by using the strong large deviations principle [4]. In this paper we consider only the quasi-stochastic case. The general case will be considered in forthcoming papers.
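The small-noise setting that large deviations arguments address can be illustrated numerically. The following is a minimal sketch and not the method of this paper: it applies the standard Euler-Maruyama scheme to a scalar Ito SDE with a hypothetical linear drift and a small constant noise intensity, and compares the resulting path with the noise-free one. The drift, the noise level, the step count, and all function names are assumptions chosen for illustration only.

```python
import math
import random


def euler_maruyama(drift, sigma, x0, t_end, n_steps, rng):
    """Simulate dx = drift(x) dt + sigma dW with the Euler-Maruyama scheme."""
    dt = t_end / n_steps
    x = x0
    path = [x]
    for _ in range(n_steps):
        # Brownian increment over one step: Gaussian with variance dt.
        dw = rng.gauss(0.0, math.sqrt(dt))
        x = x + drift(x) * dt + sigma * dw
        path.append(x)
    return path


def drift(x):
    # Hypothetical dissipative drift, chosen only for illustration.
    return -x


rng = random.Random(0)  # fixed seed so the run is reproducible

# sigma = 0 gives the plain Euler solution of the ODE dx/dt = -x.
ode_path = euler_maruyama(drift, 0.0, 1.0, 1.0, 1000, rng)
# A small noise intensity keeps the stochastic path close to the ODE path.
sde_path = euler_maruyama(drift, 0.01, 1.0, 1.0, 1000, rng)

max_gap = max(abs(a - b) for a, b in zip(ode_path, sde_path))
print("ODE endpoint:", ode_path[-1])
print("largest ODE/SDE gap along the path:", max_gap)
```

In this toy setting the gap shrinks as the noise intensity is reduced, which is the qualitative behaviour that a large deviations estimate quantifies.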
Statement of the novelty and uniqueness of the proposed idea: The new approach proposed in this paper allows one to construct the Bellman function and the optimal control directly, i.e., without any reference to the Bellman equation, by using the strong large deviations principle for the solutions of Colombeau-Ito’s SDE (CISDE).
2. Proposed Approach
Let us consider an m-persons Colombeau-Ito’s differential game with stochastic nonlinear dynamics:
;
(1)
and an m-persons Colombeau-Ito’s differential game with imperfect information about the system [5-8]:
;
(2)
Here the notation refers to the algebra of Colombeau generalized functions [9] and the ring of Colombeau’s generalized numbers [10-12]; each control is chosen by the i-th player within a set of admissible control values, and the payoff for the i-th player is:
. (3)
where the trajectory is that of Equation (1). The optimal control problem for the i-th player is:
. (4)
Let us now consider a family of solutions of Colombeau-Ito’s SDE:
(5)
where the driving noise is an n-dimensional Brownian motion
and the coefficient is a polynomial.
Definition 1. CISDE (5) is dissipative if there exist a Lyapunov candidate function and Colombeau constants such that:
1)
2).
Theorem 1. Main result (strong large deviations principle) [5,13]. For any solution of the dissipative CISDE (5) and any values of the parameters, there exists a Colombeau constant
such that:
. (6)
where the function is the solution of the master equation:
(7)
where the Jacobian is the matrix:
.
Remark 1. We note that
.
Example 1.
.
From the general master Equation (7) one obtains the following linear master equation:
. (8)
From the differential master Equation (8) one obtains the transcendental master equation
. (9)
Numerical simulation: Figures 1 and 2.
. (10)
Let us now consider an m-persons Colombeau stochastic differential game with nonlinear dynamics
(11)
Here each control is chosen by the i-th player within a set of admissible control values, and the payoff of the i-th player is
(12)
where the trajectory is that of Equation (11).
Figure 1. The solution of Equation (8) in comparison with the corresponding solution of the ODE (10).
Theorem 2. For any solution of the dissipative system and any values of the parameters, there exists a Colombeau constant such that:
. (13)
where the trajectory is that of the corresponding master game
(14)
Example 2.
1)
The optimal control problem for the first player is:
and the optimal control problem for the second player is:
From Equation (14) we obtain the corresponding master game:
2)
The optimal control problem for the first player is:
and the optimal control problem for the second player is:
Solving the linear master game (2) in the standard way [14,15], one obtains the optimal feedback control of the first player:
and the optimal feedback control of the second player [5]:
Here
where the integer part of a number is taken. Thus, for numerical simulation we obtain the ODE:
Numerical simulation: Figures 3-6
Figure 5. Optimal control of the first player.
Theorem 3. For any solution of the dissipative system and any values of the parameters, there exists a Colombeau constant such that:
. (15)
where the trajectory is that of the corresponding master game
(16)
Example 3. Game with imperfect measurements.
1)
From Equation (16) one obtains the corresponding master game:
2)
Solving the linear master game (2) in the standard way, one obtains the local optimal feedback control of the first player [5]:
and the local optimal feedback control of the second player:
Thus, we finally obtain the global optimal control of the following form [5]:
Here
where the integer part of a number is taken. Thus, for numerical simulation we obtain the ODE:
Numerical simulation: Figures 7-12. Game with imperfect measurements: red curves. Classical game: blue curves.
3. Homing Missile Guidance with Imperfect Measurements Capable to Defeat in Conditions of Hostile Active Radio-Electronic Jamming
Homing missile guidance strategies (guidance laws) dictate the manner in which the missile will guide to intercept, or rendezvous with, the target. The feedback nature of homing guidance allows the guided missile (or, more generally, the pursuer) to tolerate some level of (sensor) measurement uncertainties, errors in the assumptions used to model the engagement (e.g., unanticipated target maneuver), and errors in modeling missile capability (e.g., deviation of actual missile speed of response to guidance commands from the design assumptions). Nevertheless, the selection of a guidance strategy and its subsequent mechanization are crucial design factors that can have substantial impact on guided missile performance. Key drivers to guidance law design include the type of targeting sensor to be used (passive IR, active or semi-active RF, etc.), accuracy of the targeting and inertial measurement unit (IMU) sensors, missile maneuverability, and, last but not least, the types of targets to be engaged and their associated maneuverability levels.
Figure 13 shows the intercept geometry of a missile in planar pursuit of a target. Taking the origin of the reference frame to be the instantaneous position of the missile, the equations of motion in polar form are [16]:
(17)
1) The variable denotes the true target-to-missile range.
2) The variable denotes the measured target-to-missile range.
3) The variable denotes the true line-of-sight angle (LOST), i.e., the true angle between the constant reference direction and the target-to-missile direction.
4) The variable denotes the measured line-of-sight angle (LOSM), i.e., the measured angle between the constant reference direction and the target-to-missile direction.
5) The variable denotes the missile acceleration along the direction perpendicular to the line-of-sight direction.
6) The variable denotes the missile acceleration along the target-to-missile direction.
7) The variable denotes the target acceleration along the direction perpendicular to the line-of-sight direction.
8) The variable denotes the target acceleration along the target-to-missile direction.
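The planar geometry described by the list above can be sketched in code. The block below is a hedged illustration only: it uses the textbook polar form of planar relative motion (radial equation r'' - r θ'² = a_r, transverse equation r θ'' + 2 r' θ' = a_θ), which is assumed here and is not a reproduction of Equation (17). The names `polar_rhs`, `rk4_step`, `simulate`, `a_r` and `a_theta` are hypothetical, with `a_r` and `a_theta` standing for the net relative acceleration along and perpendicular to the line of sight, and only true (not measured) quantities are modelled.

```python
import math


def polar_rhs(state, a_r, a_theta):
    """Time derivative of (r, r_dot, theta, theta_dot) for planar relative motion."""
    r, r_dot, theta, theta_dot = state
    r_ddot = r * theta_dot ** 2 + a_r                      # radial equation
    theta_ddot = (a_theta - 2.0 * r_dot * theta_dot) / r   # transverse equation
    return (r_dot, r_ddot, theta_dot, theta_ddot)


def rk4_step(state, dt, a_r, a_theta):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(k, h):
        return tuple(s + h * ki for s, ki in zip(state, k))

    k1 = polar_rhs(state, a_r, a_theta)
    k2 = polar_rhs(shifted(k1, dt / 2.0), a_r, a_theta)
    k3 = polar_rhs(shifted(k2, dt / 2.0), a_r, a_theta)
    k4 = polar_rhs(shifted(k3, dt), a_r, a_theta)
    return tuple(
        s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate(state, t_end, n_steps, a_r=0.0, a_theta=0.0):
    """Integrate the polar equations with constant net accelerations."""
    dt = t_end / n_steps
    for _ in range(n_steps):
        state = rk4_step(state, dt, a_r, a_theta)
    return state


# Illustrative initial data (assumed values): range 1000, closing speed 100,
# line-of-sight angle 0, line-of-sight rate 0.05 rad per unit time.
final = simulate((1000.0, -100.0, 0.0, 0.05), 5.0, 5000)
print("range:", final[0], "line-of-sight angle:", final[2])
```

With zero net acceleration the relative motion is a straight line in Cartesian coordinates, which gives a simple consistency check on the integrator: the initial data above correspond to relative position (1000, 0) and relative velocity (-100, 50), so the relative position after 5 time units is (500, 250).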
Using the replacement in Equation (17), one obtains:
(18)
Figure 7. Uncertainty of speed measurements.
Figure 11. Optimal control of the first player.
Figure 12. Optimal control of the second player.
(18)
Suppose that:
Therefore
(20)
Let us consider an antagonistic Colombeau differential game with nonlinear dynamics and imperfect measurements [6]:
(21)
The optimal control problem of the first player is:
(22)
The optimal control problem of the second player is:
(23)
From Equations (21)-(23) one obtains the corresponding linear master game:
(24)
From Equation (24) we obtain the quasi-optimal solution for the antagonistic differential game given by Equations (21)-(23). The quasi-optimal control of the first player and the quasi-optimal control of the second player are:
(25)
Thus, for numerical simulation we obtain the ODE:
Example 4: Figures 14-24.
Figure 15. Uncertainty of measurements of a variable.
4. Conclusions
Supporting Technical Analysis: Let us consider the optimal control problem from Example 1. The corresponding Bellman-type equation is:
(27)
Figure 17. Missile-to-target closing speed.
Figure 21. Missile acceleration along the target-to-missile direction.
Figure 22. Missile acceleration along the direction perpendicular to the line-of-sight direction.
Figure 23. Target acceleration along the target-to-missile direction.
Figure 24. Target acceleration along the direction perpendicular to the line-of-sight direction.
Constructing the complete exact analytical solution of PDE (27) is a complicated unresolved classical problem, because PDE (27) is not amenable to analytical treatment. Even an existence theorem for classical solutions of boundary problems such as (27) has not been proved. Thus, even in simple cases, constructing the feedback optimal control from the associated Bellman equation requires complicated numerical techniques or substantial simplification [17]. However, as one can see, constructing the complete feedback optimal control from Theorems 1-2 is simple. In study [6], the generic imperfect dynamic models of air-to-surface missiles are given in addition to the related simple guidance law.