Polynomial Time Method for Solving Nash Equilibria of Zero-Sum Games ()
1. Introduction
It is well known that John von Neumann the existence of Nash equilibria of zero-sum games in the late 1920s [1].
Dantzig and Thapa [2] mentioned the linear programming formulation of a zero-sum game, and considered reducing linear primal and dual problems to a zero-sum game.
Khachian [3] and Karmarkar’s [4] interior-point methods for solving linear programming problems in polynomial time are significant discoveries.
Khachian’s ellipsoid method [3] is important in that it is a first polynomial time algorithm, however, computational results with it have been disappointing.
Karmarkar’s projective scaling method [4] is considered to be a practical polynomial algorithm. Numerous attempts have been made to refine a variety of interior point methods in the late 1980s [5] [6]. There are few studies concerning computational methods (e.g., [7] ) to solve a Nash equilibrium of a zero-sum game, in spite of those developments in linear programming.
We discuss the use of the Karmarkar’s method to solve a Nash equilibrium of a zero-sum game, which is expressed as linear programming problems derived from the original zero-sum game, and prove that it is without doubt a polynomial time algorithm. We implement the Karmarkar method, and apply it for solving Rock-Paper-Scissors, of which result seems to be promising.
Finally, we also mention an affine scaling method that would help us compute a Nash equilibrium effectively.
2. Formulation
Dantzig and Thapa [2] mentioned that maximin strategy of Player 1 in a zero-sum game can be formulated as the following linear programming problem:
(1)
where
,
is a payoff matrix, and
is a mixed strategy vector of Player 1.
Likewise, the minimax strategy of Player 2 can be formulated as the following linear programming problem:
(2)
where
is a mixed strategy of Player 2, which is, in fact, the linear programming dual problem of (1).
The following result which clarifies the relationship between (1) and (2).
Theorem 1. There exist solutions
and
for (1) and (2). Moreover,
.
Proof We let
,
,
, and rewrite (1) as
(3)
The dual linear program of (1) is
(4)
Note that (4) is equivalent to (2) by letting
.
There exists a feasible point for (3), such as
,
,
. We also observe that (3) has a finite solution, as
should be bounded because
, where
is an
simplex. Therefore,
because the strong duality theorem [8] holds. □
The above result shows that (2) is the linear programming dual problem of (1).
Next, we establish the validity of the above formulation. Let
and
be sets of pure strategies of Players 1 and 2, respectively. Let
denote the expected payoff value for Player 1 when Player 1 takes x and Player 2 takes y as a mixed strategy, that is,
(5)
Theorem 2.
. (6)
Proof Because
and
are the solutions for (1) and (2), respectively,
,
where
from Theorem 1.
Therefore,
and
namely,
. □
The result from Theorem 2 is known as the minimax theorem [1]; however, the direct derivation above would be simple and easy to comprehend.
3. Karmarkar Method
We can use the simplex method to find a Nash equilibrium from the results of Section 2.
However, the curse of dimensionality may occur if the running time required to solve a linear problem using the simplex method may increase rapidly as the number of variables increases.
Therefore, we employ a variation of Karmarkar’s method [9] which belongs to an interior point method and assures the polynomial time convergence property.
The Karmarkar method [9] we adopt deals with the following canonical form:
(7)
where
,
, and
.
If
, where
is the solution of (7), we refer to (7) as the canonical form.
It should be noted that the original Karmarkar’s method [4] utilizes
instead of a. In this study, we utilize a because we can formulate the part of components of x for mixed strategies as
in (14) and (15) later.
The key projective transformation sends x to
where
. We notice that
, where
is
-simplex. The corresponding inverse transformation is
As seen in [9], the problem in (7) is transformed as
(8)
where
and
. We denote the constraint matrix using
,
and the corresponding orthogonal projection matrix is
.
The projected steepest-descent direction is
(9)
At each step, the next point
is given by
,
,
more precisely,
, (10)
so that
will be an inner point in
.
The next point
in x-space is obtained by
. (11)
Finally, we define the optimality criterion as
, (12)
where
is a prescribed precision.
Next, we describe the Karmarkar method as follows:
Algorithm 1.
Input:
,
, a feasible point
for (7)
Output:
such that
,
,
,
.
1: Compute
by (9).
2: Determine
by (10).
3: Obtain
using (11). Check the optimality criterion using (12).
4:
. Go to Step 1.
Karmarkar defined a potential function as follows:
. (13)
He proved that Algorithm 1 generates,
, which reduces
by
> 0 at each iteration.
In fact, the following result follows for original Karmarkar’s algorithm [4].
Theorem 3. Let
be the sequence generated by Karmarkar’s algorithm [4] applied to (3). Then, (3) can be solved in polynomial time.
Proof We can transform (3) to original Karmarkar’s original canonical form [4] in pol-ynomial time in the manner of [4]. Then the result follows since the original canonical form can be solved in at most (𝑛(𝑞 + log2 𝑛)) iterations from Theorem 1 [4]. □
We can establish the following result.
Theorem 4. A Nash equilibrium of zero-sum games can be found in polynomial time.
Proof Nash equilibria of zero-sum games can be formulated as (1) and (2), even if the value of the game is not necessarily 0. We can rewrite (1) and (2) as (3) and (4) respectively. (3) (and (4)) can be solved in polynomial time from Theorem 3. □
We should observe that the transformation in the above proof is used only for the proof itself. When
, we can employ the dual problem scheme [9, Chapter 17.5] to update an estimated lower bound to,
, which is used to transform (3) into the canonical form (7).
4. Computational Results
In this section, we employ the Karmarkar method described in the previous section.
The zero-sum game (1) and (2) can be rewritten as:
(14)
and
(15)
where
,
,
,
.
These problems belong to the canonical form (7), which can be solved by Algorithm 1.
Rock-Paper-Scissors is used as a test problem of zero-sum games. Its payoff matrix for Player 1 (gain) and Player 2 (loss) is shown in Table 1.
The Nash equilibrium of Rock-Paper-Scissors is
and the value of the game is 0.
It should be noted that (14) and (15) are exactly the same because
in this case. Thus, we have only the results of (14) to demonstrate, namely, the problem is:
(16)
The Karmarkar method was coded in Python, and run on a Windows PC (Core i7 8550U, RAM 16GB).
Table 2 summarizes the sequence generated by the Karmarkar method applied to (16) from
,
,
,
. We can see that the objective value decreases by half in each iteration, which with (12) validates Theorem 3.
Figure 1 shows the sequences generated by the Karmarkar method from three different starting points
Figure 1. Sequences for different starting points.
Table 1. Payoff matrix of rock-paper-scissors for player 1 (P1) and player 2 (P2).
Table 2. Details for x0 = (0.8, 0.1, 0.1).
.
It seems that each sequence tends linearly to the Nash equilibrium from an arbitrary interior point.
Table 2 and Figure 1 show that the Karmarkar method is effective and efficient even for small-size problems.
5. Concluding Remarks
Linear programming formulations can represent the minimax principle for zero-sum games.
We have proved that the Karmarkar’s algorithm solves the linear programming problems in polynomial time.
The Karmarkar method performs well when the value of the game is 0 . If the value of the game is not 0 , we can utilize the dual problem scheme ( [9], Chapter 17.5).
We can resort to an affine scaling algorithm [10] because, either (1) or (2) is the linear programming standard form generated by introducing slack variables, regardless of whether or not the value of the game is 0 . The affine scaling algorithm for solving equilibria of zero-sum games would be practically effective; however, there is no proof that it has a polynomial time convergence property [11].