Linear Regression and Gradient Descent Method for Electricity Output Power Prediction
1. Introduction
The power output of a power plant typically depends in a complicated way on physical parameters such as temperature, vacuum level, relative humidity, and exhaust steam pressure [1] [2] [3] [4] [5]. Attempts have been made to resolve these relations using methods such as the Bagging algorithm [2] and neural networks [3]. However, either the algorithm itself is complicated or it relies on other non-intuitive algorithms such as particle swarm optimization. A simple prediction method that does not consume much computing resources is therefore highly desirable.
With advances in computer science, specifically machine learning, methods have been established that build mathematical models from training data in order to make predictions or decisions. These methods do not rely on explicitly programmed rules for the task. Using different mathematical algorithms, computers are able to make accurate predictions. Applications are widespread in daily life, for example pattern recognition [6] [7], speech recognition [8] [9], text categorization [10] [11], autonomous driving [12] [13], medical diagnosis [14] [15], and computational biology [16] [17].
In machine learning, gradient descent is a very popular method for fitting regression models. It is an optimization algorithm used to find the values of the coefficients of a function that minimize a cost function. The cost function measures how far a model's estimates are from the recorded data, and gradient descent searches for the coefficients that make it small [18] [19]. Combining the two enables us to estimate values based on previous records. In this work, we developed a predictive model that predicts the output power of a power plant given the temperature and vacuum level. This is an attempt at applying the cost function and gradient descent after taking courses in machine learning.
2. Methods
Obtaining the predicted value requires the coefficients (theta) of the linear equation. To validate the result, we use ninety percent of the data to fit the model, and the remaining ten percent is used to verify the validity of the predictions. Graphing reflects the range of the predicted values. Finally, we calculate the rate of deviation, which is the percent error between the predicted value and the actual value.
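The split and the rate of deviation described here can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names `split_train_test` and `mean_percent_deviation` are hypothetical, the split simply takes the first 90% of the rows in order, and the numbers are made up rather than taken from the plant data.

```python
def split_train_test(rows, train_fraction=0.9):
    """Split rows in order: the first train_fraction for fitting, the rest for validation."""
    n_train = int(len(rows) * train_fraction)
    return rows[:n_train], rows[n_train:]


def mean_percent_deviation(predicted, actual):
    """Average of |predicted - actual| / |actual|, expressed in percent."""
    ratios = [abs(p - a) / abs(a) for p, a in zip(predicted, actual)]
    return 100.0 * sum(ratios) / len(ratios)


# Made-up example: 20 samples give 18 for fitting and 2 for validation.
rows = list(range(20))
train_rows, test_rows = split_train_test(rows)

# Made-up predictions that are each off by 1% of the actual value.
deviation = mean_percent_deviation([101.0, 198.0], [100.0, 200.0])
```

If the rows of the data file are ordered in some systematic way, shuffling them before the split would be a reasonable precaution.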
Linear regression is a linear approach to modeling the relationship between a scalar response (the dependent variable) and one or more explanatory variables (the independent variables). Simply put, we have a set of data in which each value of the dependent variable corresponds to values of the independent variables, and our purpose is to find the relationship between them. Unlike the “pretty” functions we are most familiar with, however, the variables do not follow an exact relationship such as a linear or exponential one. This is easily explained: the numbers are authentic data from real life, so most points will not match any curve perfectly. Because many other uncertainties in real life make the dependent variable fluctuate without an exact pattern, no single equation reproduces every data point. We can, however, plot the data on a graph and draw a line of best fit. This looks easy when we obtain the line directly by plugging the data into Excel, but the method of actually finding the line is not so simple. The word “draw” is not very appropriate, because the line results from rigorous calculation. Assume the linear function is expressed as follows.
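For a single explanatory variable $x$, such a linear function is conventionally written as below. The notation is an assumption chosen to match the "theta" used in the text: $h_\theta$ is the predicted value, $\theta_0$ the intercept, and $\theta_1$ the slope. With several inputs, such as temperature and vacuum level, one term $\theta_j x_j$ would be added per input.

$$
h_\theta(x) = \theta_0 + \theta_1 x
$$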
We want it to fit the data we gathered. That means that when we plug in an x value (for example, the temperature), the output of the function is as close as possible to the real value in the dataset. There are many data points, so we need the line whose average deviation from the real values is the smallest. Therefore, we write the following cost function.
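A common form of such a cost function, consistent with the factor $1/(2m)$ and the square mentioned in the text, is the halved mean squared error. The symbols are assumed notation: $m$ is the number of training samples, $(x^{(i)}, y^{(i)})$ is the $i$-th sample, and $h_\theta(x) = \theta_0 + \theta_1 x$ is the linear function.

$$
J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta\!\left(x^{(i)}\right) - y^{(i)} \right)^2
$$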
This is simply the sum of the squared differences between the estimated values and the real values. In order to make it as small as possible, we take the derivative of the cost function and then search for the critical point (the factor 1/(2m) and the square are included to simplify the calculation).
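Assuming the halved mean squared error $J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} (\theta_0 + \theta_1 x^{(i)} - y^{(i)})^2$ as the cost, the derivatives with respect to the two coefficients work out as follows. The factor 2 produced by differentiating the square cancels the 2 in $1/(2m)$, which is why that factor simplifies the calculation.

$$
\frac{\partial J}{\partial \theta_0} = \frac{1}{m} \sum_{i=1}^{m} \left( \theta_0 + \theta_1 x^{(i)} - y^{(i)} \right),
\qquad
\frac{\partial J}{\partial \theta_1} = \frac{1}{m} \sum_{i=1}^{m} \left( \theta_0 + \theta_1 x^{(i)} - y^{(i)} \right) x^{(i)}
$$

The critical point is where both derivatives are zero.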
In addition, we need gradient descent to calculate the minimum of the function. Gradient descent is an iterative rule for finding a minimum of a function.
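In its usual textbook form, the rule repeatedly moves each coefficient a small step against the gradient of the cost $J$. The learning rate $\alpha > 0$ is an assumed symbol for the step size, and all coefficients are updated simultaneously in each iteration.

$$
\theta_j := \theta_j - \alpha \, \frac{\partial J}{\partial \theta_j}, \qquad j = 0, 1
$$

If $\alpha$ is too large the iteration can diverge, and if it is too small convergence is slow.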
For the linear function, this finally yields the slope and the intercept of the line.
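As a minimal sketch of how the slope and intercept come out of the iteration, the Python function below runs batch gradient descent on a halved mean squared error cost for a single input variable. The function name `gradient_descent`, the learning rate, the iteration count, and the toy data (points lying exactly on y = 1 + 2x) are all assumptions for illustration, not the settings used in this work.

```python
def gradient_descent(xs, ys, alpha=0.1, iterations=2000):
    """Fit y ~ theta0 + theta1 * x by batch gradient descent on the squared-error cost."""
    m = len(xs)
    theta0, theta1 = 0.0, 0.0
    for _ in range(iterations):
        # Prediction error for every sample under the current coefficients.
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        # Partial derivatives of the cost with respect to theta0 and theta1.
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Both gradients were computed before either coefficient changes,
        # so this is a simultaneous update.
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
    return theta0, theta1


# Toy data generated from y = 1 + 2x, so the fit should recover those values.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0 + 2.0 * x for x in xs]
intercept, slope = gradient_descent(xs, ys)
```

On real, noisy data the result is the least-squares line rather than an exact fit, and the learning rate usually needs tuning or the features need scaling first.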
3. Results
The code starts by loading the given data file with csvread, which reads comma-separated value (CSV) files. The distribution of the power output with respect to vacuum level and temperature is shown in Figure 1. The absolute value of the power output is reflected in the size and color of each plotted point. We then declare the amount of data used for fitting, which is 90%, and scale the features to zero mean. The first step sets up the matrix required by the gradient descent formula. The second step plugs this matrix into gradient descent, which provides the constants (theta) of the linear function that are essential for prediction. The cost function is plotted versus the number of iterations in Figure 2. It clearly shows that the cost function decreases by four orders of magnitude under the gradient descent algorithm. We then use the held-out data to validate the predicted values. Theta holds the coefficients of the linear function, so we multiply the input data by theta to obtain the prediction. Finally, we use the remaining ten percent of the data to find the error of the prediction, shown in Figure 3. The ten percent serves as the actual data, and we compare it with our prediction to find the error. Results show that we can get less than 1% average error with this method.
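The feature scaling step can be illustrated with a short Python sketch (Python is used here only for illustration and is not the environment of the original program). The text specifies zero mean; dividing by the standard deviation as well is an added assumption, as are the helper names `fit_scaler` and `apply_scaler` and the made-up temperature values. The statistics are computed on the fitting data only and then reused for the held-out data, so that both are transformed identically.

```python
def fit_scaler(column):
    """Return the mean and (population) standard deviation of one feature column."""
    mean = sum(column) / len(column)
    variance = sum((v - mean) ** 2 for v in column) / len(column)
    return mean, variance ** 0.5


def apply_scaler(column, mean, std):
    """Shift a column to zero mean and scale it to unit standard deviation.

    Assumes std is non-zero, i.e. the feature is not constant.
    """
    return [(v - mean) / std for v in column]


# Made-up temperature values, not taken from the plant data.
train_temperature = [10.0, 20.0, 30.0]
test_temperature = [25.0]

mean, std = fit_scaler(train_temperature)
scaled_train = apply_scaler(train_temperature, mean, std)
# The held-out value is scaled with the statistics of the fitting data.
scaled_test = apply_scaler(test_temperature, mean, std)
```

Scaling matters for gradient descent because features with very different ranges make the cost surface elongated, which slows convergence for any single learning rate.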
Figure 1. Power output versus vacuum level and temperature.
Figure 2. Cost function J decreases with number of iterations.
Figure 3. Validation with the 10% samples. Results show that less than 1% error is obtained.
The main difficulty of the program lies in declaring the data and setting up the matrix, because gradient descent and the validation step are only formulas. The input to the formulas comes from the matrix, which requires some manipulation so that it fits the requirements of the formulas.
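One common way to set up such a matrix, sketched here in Python under assumptions, is to give each sample one row of the form [1, temperature, vacuum level]. The leading 1 multiplies the constant term, so a prediction is simply the dot product of a row with theta. The function names and all numbers below are hypothetical and only show the shape of the data.

```python
def design_matrix(temperature, vacuum):
    """One row per sample: [1, temperature, vacuum]; the 1 multiplies theta[0]."""
    return [[1.0, t, v] for t, v in zip(temperature, vacuum)]


def predict(X, theta):
    """Dot product of every row of X with the coefficient list theta."""
    return [sum(x_j * th_j for x_j, th_j in zip(row, theta)) for row in X]


def cost(X, y, theta):
    """Halved mean squared error between predictions and actual outputs."""
    m = len(y)
    return sum((p - a) ** 2 for p, a in zip(predict(X, theta), y)) / (2 * m)


# Made-up values for two samples and an arbitrary coefficient list.
X = design_matrix([1.0, 2.0], [3.0, 4.0])
theta = [0.5, 1.0, -1.0]
predictions = predict(X, theta)
current_cost = cost(X, [-1.5, -0.5], theta)
```

With the data in this form, the same `predict` function serves both the fitting loop and the validation on held-out samples.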
4. Conclusion
In this paper, we employed the gradient descent method combined with a cost function to predict the power output of a power plant from the vacuum level and temperature. A prediction error of less than 1% has been achieved. Although this is a preliminary study, more accurate results could be anticipated from a more sophisticated gradient method that incorporates more physical parameters. Moreover, we believe this method could be extended to other areas.