Long Term Load Forecasting and Recommendations for China Based on Support Vector Regression
1. Introduction
The time horizon for long-term load forecasting (LTLF) ranges from a few weeks to several years. For example, horizons of 5, 10 or 20 years ahead are used for planning power systems, scheduling the construction of new generating capacity and purchasing generating units [1]. It is difficult to forecast load demand accurately over a planning period of this length because a large number of factors affect load, directly or indirectly, through the underlying forecasting process. Moreover, all of these factors are uncertain and uncontrollable. Therefore, any LTLF is by nature inaccurate [2].
In the last few decades, many methods have been applied to improve the accuracy of LTLF. These can be classified into parametric and artificial intelligence methods. Parametric methods construct a statistical model of load by mining the qualitative relationships between load and the factors affecting it. These methods, such as multiple linear regression and autoregressive and moving average models, require assumed parameters estimated from historical data [3-5]. They usually cannot deal with nonlinear or random relationships between load and the factors affecting it. Recently, artificial intelligence methods, including fuzzy logic and artificial neural networks (ANNs), have been implemented as substitutes for parametric methods because of the robustness of the resulting load prediction systems and their ability to fit nonlinear relationships [6-10]. Other artificial intelligence methods, such as dynamic simulation theory (DS) [1], particle swarm optimization (PSO) [11], fuzzy logic [12,13] and grey relational grade [14], have also been applied. In addition, various hybrid methods [8,10,12,13,15] combine the strengths of individual methods to improve the accuracy of LTLF.
However, these methods have some problems. Fuzzy logic has difficulty in inheriting the knowledge of previous mathematical models and is poor at solving logic problems [1]. A multilayer feed-forward network with one hidden layer has the drawback that its black-box description does not clarify the relationship between input and output variables [1]. Learning algorithms such as DS and PSO are able to search for globally optimal solutions, but the search needs a large amount of information and data, which are limited in LTLF [1].
Recently, a soft computing algorithm based on support vector machines (SVMs), namely support vector regression (SVR), has been proposed for load forecasting because it follows the structural risk minimization (SRM) principle, which minimizes an upper bound on the generalization error. Both SVMs and SVR have already been applied successfully to short-term load forecasting. Ao, Wang and Zhang [16] proposed a hybrid model based on dual support vector machines for short-term load forecasting. The first SVM takes recent samples in the vicinity of the demand day as training samples, and the second takes load samples from the same season in historic years to reflect the season-period rule [16]. Hong applied an immune algorithm and a chaotic particle swarm optimization algorithm separately to determine the parameters of an SVR model for short-term load forecasting [17,18]. SVR has also been integrated with other algorithms, such as regression, to improve the accuracy of the resulting hybrid methods [19]. However, compared with its application in short-term load forecasting, SVR has been applied far less often in LTLF.
In this paper, SVR is proposed for five-year load forecasting within China. An economic factor, GDP, is included, in contrast to methods that employ only the historical load data [4,6,7-9,11]. It is important to note that total load consumption is a key indicator of economic growth, especially during recovery from a global economic recession. Thus, it is necessary to include economic factors in load forecasting, and doing so also helps to assess the effect of the forecasting results on economic growth. Weather factors are omitted from our study because economic factors are more influential than weather conditions in LTLF [13]. We focus on the relationship between load and GDP, and discuss the effect of this relationship on the expected economic growth of China in order to provide recommendations. SVR is applied in this paper because it follows the SRM principle rather than the minimization of training errors used by ANNs [18], which makes it feasible to perform LTLF with better accuracy. In addition, SVR provides a mapping that transfers the actual time series into a multi-dimensional space to depict the nonlinear relationship [17] between GDP and load, where load is represented by load output, load imports and load exports. Moreover, SVR does not require a large amount of historical data for learning. Thus, it is easy to implement LTLF with limited data using SVR.
2. Support Vector Regression
Support vector machines (SVMs), proposed by Vapnik [20], are one of the significant developments in overcoming the shortcomings of ANNs mentioned above. Rather than implementing the empirical risk minimization (ERM) principle to minimize the training error, SVMs apply the structural risk minimization (SRM) principle to minimize an upper bound on the generalization error. SVMs are theoretically guaranteed to reach the global optimum, rather than becoming trapped in local optima as ANN models can. In addition, the solution of a nonlinear problem in the original lower-dimensional input space can be found as a linear solution in a higher-dimensional feature space. In particular, with the introduction of Vapnik’s ε-insensitive loss function, SVMs have been extended to solve nonlinear regression estimation problems; this extension is called support vector regression (SVR).
2.1. SVR Function
The key idea of SVMs for regression is the nonlinear mapping. A nonlinear mapping $\varphi(\cdot)$ is defined to map the input data $x$ into a high-dimensional feature space (Figures 1(a) and (b)). Then, in the high-dimensional feature space, there theoretically exists a linear function, $g$, that formulates the nonlinear relationship between the input data and the output data. This linear function, namely the SVR function, is given by Equation (1),
$$g(x) = w^{T}\varphi(x) + b \tag{1}$$
where $g(x)$ denotes the numerical forecasting results, and the coefficients $w$ and $b$ are adjustable parameters.
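The parameter vector $w$ in Equation (1) can be obtained as
$$w = \sum_{i=1}^{N} (\beta_i^{*} - \beta_i)\,\varphi(x_i) \tag{2}$$
where $\beta_i^{*}$ and $\beta_i$ are the Lagrangian multipliers, obtained by solving a quadratic program, $\varphi(x_i)$ is the image of the $i$-th training input in the feature space, and $N$ is the number of training samples.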
2.2. Minimization of the Empirical Risk
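In SVR, the training error is measured by the empirical risk,
$$R_{emp}(g) = \frac{1}{N}\sum_{i=1}^{N}\Theta_{\varepsilon}\bigl(y_i, g(x_i)\bigr) \tag{3}$$
where $N$ is the number of training samples, $y_i$ is the observed output for input $x_i$, and $\Theta_{\varepsilon}(y, g(x))$ is the ε-insensitive loss function (the thick line in Figure 1(c)) defined as Equation (4):
$$\Theta_{\varepsilon}(y, g(x)) = \begin{cases} |g(x) - y| - \varepsilon, & \text{if } |g(x) - y| \ge \varepsilon \\ 0, & \text{otherwise} \end{cases} \tag{4}$$
In addition, $\Theta_{\varepsilon}(y, g(x))$ is employed to find the optimum hyperplane in the high-dimensional feature space (Figure 1(b)) that maximizes the distance separating the training data into two subsets. Thus, SVR focuses on finding the optimum hyperplane and on minimizing the training error between the training data and the SVR function, as measured by the ε-insensitive loss function.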
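The ε-insensitive loss and the empirical risk described above can be illustrated with a minimal Python sketch. The function names and the sample numbers are hypothetical, chosen only for illustration. The sketch assumes the standard form of the loss: zero when the absolute error is within ε, and the excess over ε otherwise.

```python
def eps_insensitive_loss(y_true, y_pred, epsilon):
    """Loss is zero inside the epsilon tube, and grows linearly outside it."""
    return max(0.0, abs(y_pred - y_true) - epsilon)


def empirical_risk(targets, predictions, epsilon):
    """Average epsilon-insensitive loss over all training samples."""
    losses = [
        eps_insensitive_loss(y, p, epsilon)
        for y, p in zip(targets, predictions)
    ]
    return sum(losses) / len(losses)


# Hypothetical values: the first error (0.05) is inside the tube,
# the second error (0.5) exceeds epsilon = 0.1 by 0.4.
risk = empirical_risk([1.0, 1.0], [1.05, 1.5], epsilon=0.1)
print(risk)
```

Errors smaller than ε contribute nothing to the risk, which is why only samples on or outside the tube influence the fitted function.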
2.3. SVR Model
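SVR minimizes the overall errors, thus:
$$\min_{w,\,b,\,\xi,\,\xi^{*}} \; R(w, \xi, \xi^{*}) = \frac{1}{2}\|w\|^{2} + C\sum_{i=1}^{N}(\xi_i + \xi_i^{*}) \tag{5}$$
with the constraints
$$y_i - w^{T}\varphi(x_i) - b \le \varepsilon + \xi_i^{*}, \qquad -y_i + w^{T}\varphi(x_i) + b \le \varepsilon + \xi_i, \qquad \xi_i,\ \xi_i^{*} \ge 0, \qquad i = 1, \dots, N,$$
where $w^{T}\varphi(x_i) + b$ is the SVR function $g$ evaluated at the training input $x_i$ and $N$ is the number of training samples.
The first term of Equation (5) regularizes weight sizes, penalizes large weights and maintains the flatness of the regression function. The second term penalizes training errors between $g(x)$ and $y$ by using the ε-insensitive loss function. $C$ is a parameter that trades off these two terms. Training errors above $\varepsilon$ are denoted as $\xi_i^{*}$, whereas training errors below $-\varepsilon$ are denoted as $\xi_i$ (Figure 1(b)).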
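To make the trade-off between the two terms concrete, the following minimal Python sketch evaluates the objective for a given candidate solution. It assumes the standard SVR form, in which the first term is half the squared norm of the weight vector and the slack for each sample is the amount by which its absolute error exceeds ε. The function name and all numbers are hypothetical.

```python
def svr_primal_objective(weights, predictions, targets, C, epsilon):
    """Return 0.5 * ||w||^2 + C * (total slack) for a candidate solution.

    For fixed predictions, the smallest feasible slack of a sample is
    max(0, |prediction - target| - epsilon).
    """
    # First term: penalizes large weights (keeps the function flat).
    regularizer = 0.5 * sum(w * w for w in weights)
    # Second term: penalizes errors that leave the epsilon tube.
    total_slack = 0.0
    for pred, y in zip(predictions, targets):
        total_slack += max(0.0, abs(pred - y) - epsilon)
    return regularizer + C * total_slack


# Hypothetical candidate: ||w||^2 = 25, one sample 0.9 outside the tube.
value = svr_primal_objective(
    weights=[3.0, 4.0],
    predictions=[1.0, 2.0],
    targets=[1.05, 3.0],
    C=10.0,
    epsilon=0.1,
)
print(value)
```

A larger C makes the same tube violation more expensive relative to the weight penalty, so the fit follows the training data more closely at the cost of a less flat function.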
2.4. Kernel Function
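Finally, the SVR function is obtained in the dual space as Equation (6):
$$g(x) = \sum_{i=1}^{N}(\beta_i^{*} - \beta_i)\,K(x_i, x) + b \tag{6}$$
where $\beta_i^{*}$ and $\beta_i$ are the Lagrangian multipliers and $K(x_i, x_j)$ is called the kernel function. The value of the kernel equals the inner product of two vectors, $x_i$ and $x_j$, in the feature space, $\varphi(x_i)$ and $\varphi(x_j)$, respectively; that is, $K(x_i, x_j) = \varphi(x_i)\cdot\varphi(x_j)$. There are several types of kernel function, but no general rule has yet been established for choosing the kernel type for a specific data pattern. The most widely used kernel function is the Gaussian radial basis function (RBF) with a width $\sigma$, as follows:
$$K(x_i, x_j) = \exp\left(-\frac{\|x_i - x_j\|^{2}}{2\sigma^{2}}\right) \tag{7}$$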
The Gaussian RBF kernel is not only easy to implement but also capable of nonlinearly mapping the training data into an infinite-dimensional space. It is therefore suitable for nonlinear relationship problems, and it is the kernel function used in this paper.
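As an illustration, the Gaussian RBF kernel and the dual-form SVR function can be sketched in a few lines of Python. This is a minimal sketch, not the toolbox implementation used for the results in this paper. It assumes the common width convention exp(-||x_i - x_j||^2 / (2σ^2)). The function names, the support vectors and the coefficient values are hypothetical.

```python
import math


def gaussian_rbf(x_i, x_j, sigma):
    """Gaussian RBF kernel, assuming the exp(-d^2 / (2 * sigma^2)) convention."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x_i, x_j))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def svr_predict(x, support_vectors, dual_coeffs, bias, sigma):
    """Dual-form prediction: bias plus a kernel-weighted sum over support vectors.

    Each entry of dual_coeffs stands for the difference of the two
    Lagrangian multipliers of one support vector.
    """
    return bias + sum(
        coeff * gaussian_rbf(sv, x, sigma)
        for sv, coeff in zip(support_vectors, dual_coeffs)
    )


# Hypothetical one-dimensional example with two support vectors.
prediction = svr_predict(
    x=[0.0],
    support_vectors=[[0.0], [1.0]],
    dual_coeffs=[2.0, -1.0],
    bias=0.5,
    sigma=1.0,
)
print(prediction)
```

The feature map never has to be computed explicitly: the prediction depends on the inputs only through kernel evaluations, which is what makes the infinite-dimensional feature space usable in practice.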
3. LTLF Based on SVR
In Section 2, we set up a forecasting method based on SVR with the Gaussian RBF kernel function. In this section, we use the historical data from 1995 to 2008 to learn the parameters of SVR. SVR is then applied to five-year load forecasting within China. Finally, several recommendations are provided on load output, load imports and load exports according to the expected annual GDP growth of 7.5% stated in China’s 11th Five-Year Plan (2006-2010). The data for load output and GDP are found to be consistent with the rapid economic growth and the intensive pattern of power consumption in China from 1995, the beginning of the 9th Five-Year Plan (1995-2000). It is also observed that the growth rates of load output and GDP in 2008 are lower than those at the beginning of the Five-Year Plan, which can be taken as a reflection of the global economic recession that started at the end of 2007. Although the trend of load output is similar to that of GDP, there is no parallel relationship among GDP, load imports and load exports. In this paper, the load is calculated by (8). Thus, the relationship between GDP and load is a nonlinear one from 1995 to 2008.
(8)
Taking the historical data above as the learning samples, we implement the learning of SVR. In detail, the SVR parameters, ε and the width σ of the Gaussian RBF kernel function, are optimized with GDP as the independent variable and with load output, load imports and load exports, respectively, as the dependent variables. After the learning of SVR, we perform the load forecasting for China from 2009 to 2013. According to China’s 11th Five-Year Plan, the expected growth rate of GDP from 2006 to 2010 is 7.5%. We therefore use a growth rate of 7.5% in this paper (Figure 2). Finally, Table 1 shows the numerical results of the predicted load output, load imports and load exports for China from 2009 to 2013 using SVR.
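The GDP scenario that serves as the input for the forecast years amounts to compound growth at a fixed annual rate. A minimal Python sketch follows, assuming a hypothetical base value of 100.0 (an index, not the actual GDP data used in this paper) and a hypothetical function name.

```python
def project_gdp(base_gdp, annual_growth, years):
    """Project GDP forward by compounding a fixed annual growth rate.

    Returns one value per forecast year, starting with the year
    after the base year.
    """
    projections = []
    value = base_gdp
    for _ in range(years):
        value *= 1.0 + annual_growth
        projections.append(value)
    return projections


# Hypothetical base index of 100.0 for 2008, 7.5% growth, five years (2009-2013).
scenario = project_gdp(base_gdp=100.0, annual_growth=0.075, years=5)
for year, gdp in zip(range(2009, 2014), scenario):
    print(year, round(gdp, 2))
```

Each projected GDP value would then be passed to the trained SVR models to obtain the corresponding load output, load imports and load exports.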
Figure 2. GDP of China from 1995 to 2013.
Table 1. Load output, load imports and load exports forecasting using SVR.
3.1. Recommendations
In order to show the superiority of the proposed model and the reliability of the recommendations made according to China’s 11th Five-Year Plan (2006-2010), we also apply other methods, namely linear regression (LR) and a back-propagation neural network (BPNN), for comparison. In China’s 11th Five-Year Plan, the supply of power load is vital to the functioning of economic activity. Because power infrastructure construction has a lead time of 3-4 years, proper recommendations and active strategies for electric power supply should be drawn up in advance. The policy of “electricity supply leads economic growth” has been implemented by the Chinese Government in the current period of power shortage and should be continued for a long period. That is, China should meet more and more of its domestic power load demand by itself, especially to support the expected economic growth during the recession.
3.1.1. Recommendations on Load Output
In Table 2, the growth rate of load output was more than 10% from 2001 to 2007, falling to 5% in 2008 due to the global economic recession. It will take time for the growth rate to return to a level of 10%. The growth rate for 2009 obtained from LR or BPNN is more than 15%, after which it falls to 7% and stays there for the remaining years. In contrast, SVR provides a rising growth-rate curve for load output, starting at 8% in 2009 and reaching 13% in 2013, which coincides with the expected GDP of China. Thus, it is recommended that China take action to develop the domestic electricity market and increase the load supply to industry in support of the expected economic growth. It is also suggested that attention be paid to the diversity of energy supply, with a preference for renewable energy such as wind and solar power.
3.1.2. Recommendations on Load Imports
In Table 3, all methods provide rising curves of load imports. The starting points of the curves obtained from LR and BPNN are higher than the amount of load imports in 2008. However, these are unreliable given the economic challenges China is currently facing. Power load imports have always been a matter of international dispute and are sensitive to disruption by international political affairs, geostrategic rifts and even subtle changes in sentiment. Up to now, a national-level strategic load reserve is yet to be established in China. Without sufficient preparation, the Chinese domestic economy will be vulnerable to changes in the international power load trade. For example, the amount of quarterly load imports from Russia was more than 1.0 (in units of 100 million kWh) in 2007. Since the onset of the global economic recession in 2008, it has declined to 0.2 (100 million kWh). In the first quarter of 2009, it even reached 0. The load imports estimated by SVR are therefore more consistent with the actual situation of load imports in China. It is thus important for China to prepare for this challenge, that is, to achieve the expected GDP by adjusting the structure of industrial output and developing renewable energy rather than by relying on load imports.
Table 2. The results of forecasting methods for growth rate of load output.
Table 3. The results of forecasting methods for load imports.
3.1.3. Recommendations on Load Exports
Based on the forecasting results in Table 4, it will be a challenge for China to achieve the expected growth of GDP by increasing load exports. However, increased load exports would help countries and regions in Asia to recover from the economic recession. For example, the amounts of load exports to Hong Kong, Macao and Vietnam all kept rising in 2008, and then dropped a little in the first quarter of 2009. Moreover, the curve provided by SVR lies above those provided by LR and BPNN, indicating that SVR is more reliable with respect to the expected growth of GDP.
Table 4. The results of forecasting methods for load exports.
4. Conclusions
By nature, LTLF is a complex problem. Among the factors affecting load, the accuracy of forecasting is more influenced by economic factors than by weather conditions. In developing countries, where demand for power is increasing more dramatically, LTLF is even more important. Thus, it is necessary to find a nonlinear relationship between load and economic factors such as GDP when implementing LTLF.
In this paper, we propose SVR for LTLF within China and provide recommendations on future load output, load imports and load exports. Using mapping to transfer the nonlinear relationship in the original lower dimensional input space into a higher dimensional feature space, SVR can identify the relationship between load and GDP. SVR is also suitable for small-sample cases like LTLF. We first took the historical annual data of load output, load imports, load exports and GDP from 1995 to 2008 as learning samples to perform the learning of SVR. We then applied SVR for LTLF within China from 2009 to 2013. According to the expected GDP in China’s 11th Five-Year Plan (2006-2010), we assessed the performance of SVR in comparison with LR and BPNN. We found that the recommendations based on the forecasting results obtained from SVR are more reliable and feasible for the current role China plays in international trade in the global economic recession.
It would be interesting to see long-term load forecasting for more types of load and for more factors affecting load within China. A comprehensive study that employs more than one economic factor, together with other factors, according to the characteristics of China's economic growth would also be of value. In our current work, we are investigating the relationship between load and economic factors using soft computing methods or models such as SVR. The benefit of applying soft computing methods is that complicated and indiscernible relationships can be modeled with high accuracy.
5. Acknowledgements
The numerical results for support vector regression and the back-propagation neural network were obtained using LS-SVMlab Toolbox 1.5 and the Neural Network Toolbox of MATLAB 6.5, respectively. The language of this paper was edited by International Science Editing.