Experimental Study of Methods of Scenario Lattice Construction for Stochastic Dual Dynamic Programming
1. Introduction
This article discusses one of the most powerful modern algorithms for solving stochastic optimization problems: the stochastic dual dynamic programming (SDDP) algorithm, first described in [1]. There are several ways to implement the algorithm; an exhaustive survey can be found in [2]. We will discuss the most commonly used version, the SDDP algorithm with a scenario lattice. The main goal of our study is to test methods of lattice construction. To compare the scenario lattice construction methods, we considered "The problem of production, sales and product storage", which is an extension of the newsvendor problem [3].
There are many articles about scenario generation methods. The approach of [4] is based on variance reduction techniques, and several works survey the different methods [5] [6] [7]. Multiple scenario lattice construction methods exist, but we consider only three of them: k-means, competitive learning, and Voronoi cell sampling. In this study, we use these scenario generation methods for stochastic variables that follow a Markovian process, focusing on comparing the different techniques. There have been works with a similar goal, for example [8]. However, in this work, we go further and perform an empirical analysis in the stage-dependent case: we model several days, and each day depends on the previous one (unlike [8], where stage-independent processes are considered).
The rest of the paper is organized as follows: Section 2 formulates the problem, Sections 3 and 4 describe the lattice construction methods we use, Section 5 presents the case problem, Section 6 discusses the numerical experiments, and Section 7 summarizes the results.
2. Problem Formulation
The original problem of stochastic programming in general terms looks like the following [8]:

$\min_{x} \int c(x, z)\, dF(z)$ (1.1)

$\text{s.t. } x \in X,$ (1.2)

where x is a controlling variable with domain X; z is a vector of realizations of the random variable Z with distribution F; c(x, z) is a cost function.
To resolve this problem, we need to move from the integral, whose computation is too hard, to a sum, so the problem takes the discrete form:

$\min_{x} \sum_{i=1}^{N} p_i\, c(x, z_i)$ (1.3)

$\text{s.t. } x \in X,$ (1.4)

where $p_i$ are the probabilities of the realizations $z_i$.
In the literature, this transition is known as sample average approximation (SAA); it is formally described, for example, in [9].
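As a minimal illustration of the SAA idea, consider the following Python sketch (the cost function and distribution here are hypothetical examples, not the case problem of Section 5):

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def cost(x, z):
    # Hypothetical convex cost function c(x, z).
    return (x - z) ** 2

# True problem: minimize E[c(x, Z)] with Z ~ N(10, 2^2).
# SAA replaces the expectation with an average over N samples,
# each taken with probability p_i = 1/N.
z_samples = rng.normal(loc=10.0, scale=2.0, size=1000)

def saa_objective(x):
    return np.mean(cost(x, z_samples))

# One-dimensional decision, so a simple grid search suffices here.
grid = np.linspace(0.0, 20.0, 2001)
x_star = grid[np.argmin([saa_objective(x) for x in grid])]
print(x_star)  # close to 10, the minimizer of the true expectation
```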
There are many ways to implement the transition from (1.1)-(1.2) to (1.3)-(1.4), for example Monte Carlo [10] [11] or quasi-Monte Carlo methods [12]. We will use another method, based on the direct reduction of the error of approximating F by a discrete distribution $\hat{F}$, which was described in [12]. We will briefly describe this method below.
3. Methods and Algorithms of Probability Distribution Discretization
Our approach is based on the established work [4], which we briefly discuss here. Our task is to approximate the distribution F with a discrete distribution $\hat{F}$. To do so, we generate realizations of F. The Monte Carlo method guarantees that the approximate cost function converges to the real cost function with probability 1, but we want to reduce the variance of the estimate to speed up the convergence of the error bounds. To do so, we address the approximation error directly. Let us define the error of SAA as the absolute difference between the cost functions:

$e(F, \hat{F}) = \left| \int c(x, z)\, dF(z) - \sum_{j=1}^{M} q_j\, c(x, \hat{z}_j) \right|$ (1.5)
Since it is nearly impossible to compute $e(F, \hat{F})$ via (1.5), it is usual to work with an upper bound of the error. To estimate the upper bound, we first introduce some additional notation. Denote by $L_r$ the Lipschitz constant of order r of the function c:

$L_r = \sup_{z \neq z'} \dfrac{|c(x, z) - c(x, z')|}{\|z - z'\|^r},$ (1.6)

where sup is the least upper bound. Further, $W_r(F, \hat{F})$ is the Wasserstein distance of order r between the distributions F and $\hat{F}$:

$W_r(F, \hat{F}) = \left( \min_{\pi} \int \|z - z'\|^r \, d\pi(z, z') \right)^{1/r},$ (1.7)

where the minimum is taken over the space of joint distribution functions $\pi(z, z')$ with marginal distributions F and $\hat{F}$, and the cost function c is Lipschitz, which means it satisfies (1.6). From [12] we know that:
$e(F, \hat{F}) \le L_r\, W_r(F, \hat{F}),$ (1.8)

so, since $L_r$ is a constant for given r and c, it is clear that reducing $W_r(F, \hat{F})$ also reduces the approximation error.
It is impractical to find $W_r(F, \hat{F})$ using (1.7) because of the integral, so we will use the distance between two discrete distributions:

$W_r(F, \hat{F})^r = \min_{\pi \ge 0} \sum_{i=1}^{N} \sum_{j=1}^{M} \pi_{ij}\, \|z_i - \hat{z}_j\|^r, \quad \text{s.t. } \sum_{j} \pi_{ij} = p_i, \; \sum_{i} \pi_{ij} = q_j,$ (1.9)

where $p_i$ and $q_j$ are the probabilities of the corresponding discrete distribution values, and $\sum_i p_i = \sum_j q_j = 1$. Thus, according to (1.8), we need to find the discrete distribution $\hat{F}$ that best approximates F, thereby reducing the SAA error.
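Problem (1.9) is a standard transportation linear program; for reference, a sketch of how it could be solved with scipy (illustrative code, not part of our experimental pipeline):

```python
import numpy as np
from scipy.optimize import linprog

def wasserstein_discrete(z, p, z_hat, q, r=2):
    """Solve (1.9) as a transportation LP: returns W_r^r between the
    discrete distributions (z, p) and (z_hat, q)."""
    N, M = len(z), len(z_hat)
    # Cost c_ij = ||z_i - z_hat_j||^r, transport plan pi flattened row-major.
    cost = np.linalg.norm(z[:, None, :] - z_hat[None, :, :], axis=2) ** r
    A_eq = np.zeros((N + M, N * M))
    for i in range(N):
        A_eq[i, i * M:(i + 1) * M] = 1.0   # row sums equal p_i
    for j in range(M):
        A_eq[N + j, j::M] = 1.0            # column sums equal q_j
    b_eq = np.concatenate([p, q])
    res = linprog(cost.ravel(), A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
    return res.fun
```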
Since we consider continuous distributions, we need to generate a sample of size N, so (1.9) takes the following form:

$W_r(F, \hat{F})^r = \min \dfrac{1}{N} \sum_{i=1}^{N} \sum_{j=1}^{M} \delta_{ij}\, \|z_i - \hat{z}_j\|^r,$ (1.10)

where $q_j = \frac{1}{N} \sum_{i=1}^{N} \delta_{ij}$ are the probabilities of the discrete distribution values; $\delta_{ij} = 1$ if element $z_i$ of the original sample is attributed to element $\hat{z}_j$ of the new distribution, and $\delta_{ij} = 0$ otherwise; M is the number of values of the new distribution, $M < N$.
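For a fixed set of representative points, the assignment in (1.10) is a nearest-neighbour rule, so the objective is easy to evaluate; a small numpy sketch (again illustrative, not our experiment code):

```python
import numpy as np

def discretization_error(z, z_hat, r=2):
    """Value of (1.10): each of the N sample points z[i] (weight 1/N)
    is assigned to its nearest representative z_hat[j]."""
    # Pairwise distances between sample points and representatives.
    d = np.linalg.norm(z[:, None, :] - z_hat[None, :, :], axis=2)
    nearest = d.argmin(axis=1)  # the j with delta_ij = 1 for each i
    w = np.mean(d[np.arange(len(z)), nearest] ** r)
    # Probabilities q_j of the new discrete distribution.
    q = np.bincount(nearest, minlength=len(z_hat)) / len(z)
    return w, q
```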
Now we briefly describe the algorithms we used to solve (1.10) and obtain a new approximate probability distribution. They are discussed more comprehensively in [8].
k-means algorithm:
1) Randomly choose M elements $\hat{z}_1, \ldots, \hat{z}_M$ from the source sample $z_1, \ldots, z_N$.
2) Attribute element $z_i$ of the sample to element j of $\hat{z}_1, \ldots, \hat{z}_M$ if j is the solution of $\min_{j} \|z_i - \hat{z}_j\|$.
3) Recalculate the cluster centers $\hat{z}_j$: $\hat{z}_j = \frac{1}{N_j} \sum_{i\,:\, \delta_{ij} = 1} z_i$, where $N_j$ is the number of elements of the sample attributed to cluster j at step 2.
4) If the stopping condition is met, then stop; otherwise return to step 2.
Note that we are using the k-means modification from [13].
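A compact Python version of steps 1-4 (in practice one would typically use an optimized implementation such as sklearn.cluster.KMeans; the stopping condition here, unchanged centers, is one common choice):

```python
import numpy as np

def kmeans_discretize(z, M, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    # Step 1: choose M initial centers from the sample.
    centers = z[rng.choice(len(z), size=M, replace=False)]
    for _ in range(n_iter):
        # Step 2: assign each sample point to the nearest center.
        d = np.linalg.norm(z[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Step 3: recalculate cluster centers as cell means.
        new_centers = np.array([
            z[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(M)
        ])
        # Step 4: stop once the centers no longer move.
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    probs = np.bincount(labels, minlength=M) / len(z)
    return centers, probs
```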
Competitive learning:
The sample $z_1, \ldots, z_N$ is obtained from the distribution F and initial approximations $\hat{z}_1^{(0)}, \ldots, \hat{z}_M^{(0)}$ are chosen. Then the following update is performed for $i = 1, \ldots, N$:

$\hat{z}_{j^*}^{(i)} = \hat{z}_{j^*}^{(i-1)} + \alpha_i \bigl( z_i - \hat{z}_{j^*}^{(i-1)} \bigr), \quad j^* = \arg\min_{j} \| z_i - \hat{z}_j^{(i-1)} \|,$ (1.11)

where $\alpha_i$ is the step size, $\alpha_i \to 0$, and the values $\hat{z}_1^{(N)}, \ldots, \hat{z}_M^{(N)}$ give the wanted discrete distribution.
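A minimal sketch of update (1.11); the step-size schedule $\alpha_i = a/(b+i)$ is an illustrative assumption, as any decreasing sequence of the usual stochastic-approximation type would do:

```python
import numpy as np

def competitive_learning(z, M, seed=0, a=1.0, b=10.0):
    rng = np.random.default_rng(seed)
    centers = z[rng.choice(len(z), size=M, replace=False)].astype(float)
    for i, zi in enumerate(z, start=1):
        alpha = a / (b + i)  # decreasing step size (illustrative schedule)
        j = np.linalg.norm(centers - zi, axis=1).argmin()  # winning center
        centers[j] += alpha * (zi - centers[j])            # update (1.11)
    # Probabilities from a final nearest-neighbour pass over the sample.
    labels = np.linalg.norm(
        z[:, None, :] - centers[None, :, :], axis=2).argmin(axis=1)
    probs = np.bincount(labels, minlength=M) / len(z)
    return centers, probs
```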
Voronoi cell sampling:
The sample $z_1, \ldots, z_N$ is obtained from the distribution F and initial approximations $\hat{z}_1, \ldots, \hat{z}_M$ are chosen. Then the following assignment is made for $i = 1, \ldots, N$:

$j^*(i) = \arg\min_{j} \| z_i - \hat{z}_j \|,$ (1.12)

i.e., each sample point is attributed to the Voronoi cell of its nearest center. Further, put $N_j = |\{ i : j^*(i) = j \}|$ and:

$\hat{z}_j = \dfrac{1}{N_j} \sum_{i\,:\, j^*(i) = j} z_i, \quad q_j = \dfrac{N_j}{N}.$ (1.13)
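A sketch of one pass of this procedure as written above (in practice the pass can be repeated with fresh samples until the centers stabilize):

```python
import numpy as np

def voronoi_cell_sampling(z, centers):
    """One pass of Voronoi cell sampling: assign the sample to the
    current Voronoi cells (1.12), then move each center to the mean
    of its cell and estimate the cell probabilities (1.13)."""
    M = len(centers)
    labels = np.linalg.norm(
        z[:, None, :] - centers[None, :, :], axis=2).argmin(axis=1)
    new_centers = np.array([
        z[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
        for j in range(M)
    ])
    probs = np.bincount(labels, minlength=M) / len(z)
    return new_centers, probs
```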
4. Scenario Lattice Construction Algorithm
Now consider the scenario lattice construction algorithm that was used.
1) First of all, the lattice parameters are selected: the number of stages, the number of nodes at each stage, and the number of scenarios generated from each node.
2) Then, from each vertex of stage t, we generate M scenarios according to the relevant random processes.
3) From the M × N scenarios, where N is the number of nodes at stage t, N nodes of stage t + 1 are selected using one of the methods described in Section 3.
4) The probabilities of transition from vertex i of stage t to vertex j of stage t + 1 are calculated as the ratio of the number of scenarios generated from vertex i and attributed to vertex j of stage t + 1 to the total number of scenarios generated from vertex i.
5) If t + 1 is not equal to T, where T is the number of stages, we return to step 2; otherwise, the lattice is constructed.
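A minimal Python sketch of steps 1-5; `simulate_step` is a hypothetical placeholder for the demand process of Section 6, and `discretize` can be any of the three methods of Section 3 (e.g., the `kmeans_discretize` sketch above). It assumes M ≥ N so that the clustering step is feasible:

```python
import numpy as np

def build_lattice(x0, T, N, M, simulate_step, discretize, seed=0):
    """Builds a scenario lattice: nodes[t] holds the node values of
    stage t, P[t][i, j] the transition probability from node i of
    stage t to node j of stage t + 1."""
    rng = np.random.default_rng(seed)
    nodes = [np.array([x0])]  # stage 1: the chosen initial values
    P = []
    for t in range(T - 1):
        # Step 2: simulate M successors from every node of stage t.
        samples, parents = [], []
        for i, x in enumerate(nodes[t]):
            for _ in range(M):
                samples.append(simulate_step(x, rng))
                parents.append(i)
        samples = np.array(samples)
        # Step 3: reduce the M*N scenarios to N nodes of stage t + 1.
        centers, _ = discretize(samples, N)
        labels = np.linalg.norm(
            samples[:, None, :] - centers[None, :, :], axis=2).argmin(axis=1)
        # Step 4: empirical transition probabilities p_ij.
        trans = np.zeros((len(nodes[t]), N))
        for i, j in zip(parents, labels):
            trans[i, j] += 1
        P.append(trans / trans.sum(axis=1, keepdims=True))
        nodes.append(centers)
    return nodes, P
```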
The lattice obtained with this algorithm can be illustrated as follows (Figure 1):
In Figure 1, each vertex represents a vector of a given dimension corresponding to the random variables. The probability of transition from node i of stage t to node j of stage t + 1 is denoted by $p_{ij}^{t}$. Some probabilities can be zero, and since we use clustering algorithms to group scenarios, our lattice can be illustrated by Figure 2. The vertex of stage 1 corresponds to the initial values, which are selected before the lattice construction algorithm starts. We use the standard method [1] to choose scenarios during the forward pass of the SDDP algorithm.
Figure 2. Scenario lattice construction.
5. The Case Problem
To compare the k-means, competitive learning, and Voronoi cell sampling algorithms, we use them to solve "The problem of commodities production, storage and selling", which can be formulated as follows.
Let T be the time interval on which we consider the problem; k the number of types of goods; $c_i$ the cost of producing a single item of type i; $P_{\max,i}$ the maximum number of items of type i that can be produced in one day; $u_{t,i}$ the number of goods of type i produced on day t; $r_i$ the selling price for goods of type i; $d_{t,i}$ the demand for goods of type i on day t; $l_{t,i}$ the number of undelivered goods of type i on day t; and $h_i$ the cost of storing goods of type i.
The goal is to find a strategy (the amounts of goods to produce) that maximizes income.
Income on day t: $R_t = \sum_{i=1}^{k} \bigl( r_i\, s_{t,i} - c_i\, u_{t,i} - h_i\, v_{t,i} \bigr)$.

Amount of sold goods on day t: $s_{t,i} = \min\bigl( d_{t,i},\; u_{t,i} + v_{t-1,i} \bigr)$, so the number of undelivered goods is $l_{t,i} = d_{t,i} - s_{t,i}$.

Amount of goods in storage at the end of day t: $v_{t,i} = v_{t-1,i} + u_{t,i} - s_{t,i}$.

CVaR is used as a risk measure; the optimization criterion is the weighted sum of CVaR and profit expectation:

$(1 - \lambda)\, \mathbb{E}\Bigl[ \textstyle\sum_{t=1}^{T} R_t \Bigr] + \lambda\, \mathrm{CVaR}_{\alpha}\Bigl[ \textstyle\sum_{t=1}^{T} R_t \Bigr] \to \max$.
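A sketch of these one-day dynamics and of the weighted criterion in Python (the empirical lower-tail estimator of CVaR is used; the values of λ and α below are illustrative placeholders):

```python
import numpy as np

def day_step(produced, stored_prev, demand, price, prod_cost, store_cost):
    """One day's dynamics for all k goods (all arguments are length-k vectors)."""
    sold = np.minimum(demand, produced + stored_prev)  # sold goods s_t
    stored = stored_prev + produced - sold             # end-of-day storage v_t
    income = price @ sold - prod_cost @ produced - store_cost @ stored
    return income, stored

def criterion(profits, lam=0.5, alpha=0.95):
    """Weighted sum of expected profit and empirical CVaR over simulated
    total-profit scenarios; lam and alpha are illustrative values."""
    profits = np.asarray(profits)
    tail = np.quantile(profits, 1 - alpha)
    cvar = profits[profits <= tail].mean()  # lower-tail (worst-case) average
    return (1 - lam) * profits.mean() + lam * cvar
```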
Formally, the dynamic programming equations look like the following:

$V_t(v_{t-1}, d_t) = \max_{0 \le u_t \le P_{\max}} \Bigl( R_t + \rho\bigl[ V_{t+1}(v_t, d_{t+1}) \mid d_t \bigr] \Bigr)$

for $t = 1, \ldots, T - 1$, where $v_t = v_{t-1} + u_t - s_t$ ($v_0 = 0$ by definition), $\rho$ is the weighted combination of expectation and CVaR introduced above, and

$V_T(v_{T-1}, d_T) = \max_{0 \le u_T \le P_{\max}} R_T$

for $t = T$.
6. Numerical Experiments
For numerical experiments, we considered four different stochastic processes for the demand for goods.
1) Autoregressive model (AR):
$X_t = c + \sum_{i=1}^{p} \varphi_i X_{t-i} + \varepsilon_t$,
where c is a constant, $\varphi_1, \ldots, \varphi_p$ are the model parameters, and $\varepsilon_t$ is white noise.
2) Autoregressive moving-average model (ARMA):
$X_t = c + \varepsilon_t + \sum_{i=1}^{p} \varphi_i X_{t-i} + \sum_{j=1}^{q} \theta_j \varepsilon_{t-j}$,
where c is a constant, $\varphi_i$ and $\theta_j$ are the model parameters, and $\varepsilon_t$ is white noise.
3) Geometric Brownian motion (GBM):
$dS_t = \mu S_t\, dt + \sigma S_t\, dW_t$,
where $\mu$ and $\sigma$ are the process parameters and $W_t$ is the Wiener process.
4) Stage-independent normal distribution (SIND): $X_t \sim N(\mu, \sigma^2)$, independent across stages.
We use these processes with fixed parameters (the values are listed in Table 1), so in our case the demand processes look as follows:
1) GBM: $S_{t+1} = S_t \exp\bigl( (\mu - \sigma^2/2)\, \Delta t + \sigma\, \Delta W_t \bigr)$;
2) AR(1): $X_t = c + \varphi_1 X_{t-1} + \varepsilon_t$;
3) ARMA(1,1): $X_t = c + \varphi_1 X_{t-1} + \theta_1 \varepsilon_{t-1} + \varepsilon_t$;
4) SIND: $X_t \sim N(\mu, \sigma^2)$.
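The four demand processes can be simulated along the following lines (parameter values are placeholders; those used in the experiments are listed in Table 1):

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def ar1(x0, c, phi, sigma, T):
    x = [x0]
    for _ in range(T - 1):
        x.append(c + phi * x[-1] + rng.normal(0.0, sigma))
    return np.array(x)

def arma11(x0, c, phi, theta, sigma, T):
    x, eps_prev = [x0], 0.0
    for _ in range(T - 1):
        eps = rng.normal(0.0, sigma)
        x.append(c + phi * x[-1] + theta * eps_prev + eps)
        eps_prev = eps
    return np.array(x)

def gbm(s0, mu, sigma, T, dt=1.0):
    # Exact discretization of geometric Brownian motion.
    dw = rng.normal(0.0, np.sqrt(dt), size=T - 1)
    increments = (mu - 0.5 * sigma**2) * dt + sigma * dw
    return s0 * np.exp(np.concatenate([[0.0], np.cumsum(increments)]))

def sind(mu, sigma, T):
    return rng.normal(mu, sigma, size=T)  # stage-independent demand
```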
The experiments were organized as follows. First, for every combination of process (AR, ARMA, GBM, SIND) and scenario lattice construction algorithm, we simulated the stochastic process several times. We then checked whether the results came from a normal distribution using the Shapiro-Wilk test and quantile-quantile (Q-Q) plots, which show the relationship between the observed data and the theoretical quantiles, and then compared the results using a one-sided t-test. The normality check is necessary because we use the t-test to compare the average profits obtained with the different lattice construction methods. All the parameters of our case problem are given in Table 1; we chose them to simulate a real-world situation.
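The statistical part of this pipeline can be expressed with scipy.stats as follows (a sketch: the profit arrays are placeholders for the simulated results, and the alternative argument of ttest_ind requires scipy ≥ 1.6):

```python
import numpy as np
from scipy import stats

# profits_a, profits_b: average profits over simulations for two lattice
# construction methods (placeholders for the real experimental results).
profits_a = np.random.default_rng(1).normal(100.0, 5.0, size=50)
profits_b = np.random.default_rng(2).normal(102.0, 5.0, size=50)

# Shapiro-Wilk: the null hypothesis is that the sample is normal.
w_a, p_a = stats.shapiro(profits_a)
w_b, p_b = stats.shapiro(profits_b)
print(p_a, p_b)

# One-sided two-sample t-test: is the mean profit of method b greater?
t_stat, p_value = stats.ttest_ind(profits_b, profits_a, alternative="greater")
print(t_stat, p_value)
```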
For the SDDP algorithm, we used the optimization criterion introduced in Section 5, the weighted sum of profit expectation and CVaR.
Our lattice construction parameters are as follows: the number of nodes at each stage is 10, and the number of scenarios generated from each node during lattice construction is 100.
To perform the experiments, we used a computer with an Intel Core i5-6300HQ processor running at 2.30 GHz.
We used Python 3.7.3 to generate the scenario lattice and Julia 1.3.1 to run the SDDP algorithm; we implemented SDDP ourselves. For details of the SDDP implementation, see [14].
The full list and versions of used packages are in Table 2 for Python and in Table 3 for Julia.
Table 1. Parameters of experiments.
6.1. Lattice Construction Time
First of all, it is necessary to compare the lattice construction time for every process and lattice construction algorithm, because running time plays a crucial role in computing.
As we can see (Figure 3), k-means works much faster than the Voronoi and Competitive learning methods, whose running times are almost the same.
6.2. AR Demand Results
As we can see from the tables and plots, our results come from a normal distribution (Figures 4-6 and Table 4), and the mean profit for scenarios generated by the Voronoi method is higher than for scenarios generated by k-means and Competitive learning (Table 4 and Table 5). (Element (i, j) of Table 5 is the p-value of the t-test comparing the method in row i with the method in column j.)
6.3. ARMA Demand Results
In this case, our results are also normal (see Figures 7-9 and Table 6), but there is no difference between k-means and Voronoi, while the Competitive learning results are markedly worse (Table 6 and Table 7).
6.4. GBM Demand Results
For the GBM process, our results deviate slightly from the normal distribution, but not significantly (Figures 10-12 and Table 8), and all three methods show almost the same result (Table 8 and Table 9).
6.5. Stage-Independent Normal Distribution Demand Results
For the Competitive learning method, our results deviate slightly from the normal distribution, but not significantly (Figures 13-15 and Table 10). The k-means and Voronoi methods show almost identical results, while Competitive learning performs noticeably worse (Table 10 and Table 11).
Figure 5. Competitive learning Q-Q plot.
Figure 8. Competitive learning Q-Q plot.
Table 11. Stage-independent t-test p-values.
7. Conclusions
As the numerical results show, in our experimental problem the Voronoi method slightly outperforms the others, while the k-means method is much faster.
1) With the AR process, the Voronoi method performs better than k-means and Competitive learning, which both show almost the same result.
2) With the ARMA process, the Voronoi method is comparable to k-means, while Competitive learning underperforms considerably.
3) With the GBM process, the results of the methods are close to each other.
4) The scenario lattice construction time was the same for the Competitive learning and Voronoi methods but much lower for k-means.