Fuzzy Varying Coefficient Bilinear Regression of Yield Series

Abstract

We construct a fuzzy varying coefficient bilinear regression model to deal with interval financial data and estimate it by the least-squares method on the space of symmetric fuzzy numbers. Firstly, we propose a varying coefficient model on the basis of the fuzzy bilinear regression model. Secondly, we develop a least-squares method based on the complete distance between fuzzy numbers to estimate the coefficients, and we test the adequacy of the proposed model on the SSE Composite Index by means of the generalized likelihood ratio test. Finally, mean square errors and mean absolute errors are employed to evaluate and compare the fitting and the forecasting performance of the fuzzy auto regression, fuzzy bilinear regression and fuzzy varying coefficient bilinear regression models. The empirical analysis shows that the proposed model fits and forecasts capital market data more accurately than the other two regression models.

Share and Cite:

He, T. and Lu, Q. (2015) Fuzzy Varying Coefficient Bilinear Regression of Yield Series. Journal of Data Analysis and Information Processing, 3, 43-54. doi: 10.4236/jdaip.2015.33006.

1. Introduction

When studying changes in financial asset prices, researchers usually seek more reliable estimates of dynamic market indicators through the rational design and adjustment of flexible, widely applicable models. Actual financial data, which are often given in the form of intervals, are not only random but also fuzzy. Therefore, a central problem in analyzing asset price changes is to build an analytical model that starts from interval-valued financial observations, takes their innate fuzziness into account, and reflects the features of interval financial series.

Since Zadeh [1] put forward the fuzzy set in 1965, fuzzy set theory has been widely used in social science research, especially in economic construction, financial investment, and capital market operation and management. For interval financial observations, Li Zhuyu et al. [2] defined the stationarity of fuzzy financial time series and of fuzzy financial asset yield series, constructed a p-th order fuzzy auto regression (FAR) model, and estimated the unknown coefficients by fuzzy linear programming (FLP) under the constraints of a minimum fuzzy index and the real meaning of financial yields. The estimation results in [2] show that the convergence and the fluctuation of yields follow the same trends; in addition, the model applies only to centralized fuzzy data. Both features deviate from reality. In order to reflect the dynamic changes of financial yields over a period of time, Li Zhuyu et al. [4] adopted the idea of D’Urso and Gastaldi [3] that, in a dynamic process, fluctuations depend on the centers to some extent. They built the fuzzy bilinear regression (FBR) model from the centers and the fluctuations of fuzzy financial yields respectively, and estimated the unknown coefficients by the fuzzy least-squares (FLS) method. Wang Donghua [5] suggested that the fuzzy linear programming (FLP) method, although relatively simple, tends to produce overly wide estimated ranges and is therefore less suitable for applications.

Under the influence of various social factors, financial asset prices exhibit non-linear dynamic characteristics, so traditional estimation methods tend to produce larger errors when modelling such non-linear problems. The yield prediction model in [4] still leaves room for improvement in combining explanatory ability, the description of relationships among dynamic data, fitting precision and prediction precision. Varying coefficient models (also called functional coefficient models) have a flexible structure that effectively avoids the curse of dimensionality, and they have a distinct advantage in exploring non-linear dynamic characteristics, reducing model specification errors, describing the features of data and forecasting. This article generalizes the fuzzy model in [4] to a varying coefficient model, called the fuzzy varying coefficient bilinear regression (FVCBR) model, using the correspondence between financial yields and the symmetric fuzzy numbers that depict their fuzziness, and uses the fuzzy least-squares (FLS) method to derive the estimators. Additionally, we test the adequacy of the proposed model by means of the generalized likelihood ratio test. Mean square errors and mean absolute errors are employed to evaluate and compare the fitting of the fuzzy auto regression (FAR), fuzzy bilinear regression (FBR) and FVCBR models, as well as their forecasting in and out of the sample period.

2. The Distance of Fuzzy Number Space

In order to introduce the distance on the space of fuzzy numbers, we first introduce symmetric fuzzy numbers and their operational properties.

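Definition 1 [6] . A fuzzy number $A$ is a fuzzy set of the real line $\mathbb{R}$ satisfying the following conditions:

1) there exists an $x_0 \in \mathbb{R}$ such that $A(x_0) = 1$;

2) for any $\lambda \in (0, 1]$, the $\lambda$-level set $A_\lambda = \{x : A(x) \ge \lambda\}$ is an interval number.

The set of all the fuzzy numbers is denoted by $F(\mathbb{R})$.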

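Definition 2 [6] . A fuzzy number $A$ with the following membership function:

$$A(x) = \begin{cases} L\left(\dfrac{|x - a|}{\alpha}\right), & |x - a| \le \alpha,\ \alpha > 0, \\ 0, & \text{otherwise} \end{cases} \tag{1}$$

is called a symmetric fuzzy number and usually denoted by $A = (a, \alpha)_L$, where $a$ and $\alpha$ are the center and spread of $A$ respectively. Besides, $L$ is a strictly decreasing function on $[0, 1]$ with $L(0) = 1$ and $L(1) = 0$. When $\alpha = 0$, the symmetric fuzzy number becomes an ordinary real number and is denoted by $(a, 0)_L = a$.

The set of all the symmetric fuzzy numbers is denoted by $F_L(\mathbb{R})$. A symmetric fuzzy number is called a symmetric triangular fuzzy number if $L(x) = 1 - x$.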

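Proposition 1 [6] . Suppose $A = (a, \alpha)_L$, $B = (b, \beta)_L$ and $k \in \mathbb{R}$; then:

1) $A + B = (a + b, \alpha + \beta)_L$;

2) $kA = (ka, |k|\alpha)_L$.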

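Let $A$ and $B$ be two fuzzy numbers. Xu [7] proposed a formula that defines the complete distance between fuzzy numbers by generalizing the distance between interval numbers:

$$D^2(A, B) = \int_0^1 f(\lambda)\, d^2(A_\lambda, B_\lambda)\, d\lambda \tag{2}$$

where $f(\lambda)$ is an increasing function on $[0,1]$ satisfying $f(0) = 0$ and $\int_0^1 f(\lambda)\, d\lambda = \frac{1}{2}$, $A_\lambda = [a_\lambda^-, a_\lambda^+]$ and $B_\lambda = [b_\lambda^-, b_\lambda^+]$ are the $\lambda$-level sets of $A$ and $B$ respectively, and

$$d^2(A_\lambda, B_\lambda) = (a_\lambda^- - b_\lambda^-)^2 + (a_\lambda^+ - b_\lambda^+)^2.$$

As pointed out in [7] , the monotonically increasing function $f(\lambda)$ emphasizes the contribution of higher values of $\lambda$ to the distance between $A$ and $B$. Furthermore, $f(0) = 0$ and $\int_0^1 f(\lambda)\, d\lambda = \frac{1}{2}$ ensure that $D$ generalizes the ordinary distance between real numbers. In fact, we usually let $f(\lambda) = \lambda$.

When $A = (a, \alpha)_L$ and $B = (b, \beta)_L$ are two symmetric fuzzy numbers, the $\lambda$-level sets of $A$ and $B$ are respectively $[a - \alpha L^{-1}(\lambda),\ a + \alpha L^{-1}(\lambda)]$ and $[b - \beta L^{-1}(\lambda),\ b + \beta L^{-1}(\lambda)]$, therefore

$$d^2(A_\lambda, B_\lambda) = 2(a - b)^2 + 2(\alpha - \beta)^2 \left[L^{-1}(\lambda)\right]^2$$

and the distance (2) becomes

$$D^2(A, B) = (a - b)^2 + l(\alpha - \beta)^2 \tag{3}$$

where $l = 2\int_0^1 f(\lambda) \left[L^{-1}(\lambda)\right]^2 d\lambda$.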
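As a numerical illustration of the distance between symmetric fuzzy numbers, the following minimal sketch assumes that the squared distance takes the form (a − b)² + l(α − β)², with l = 2∫₀¹ f(λ)[L⁻¹(λ)]² dλ, which is what the level sets of symmetric fuzzy numbers give under the weighting f. The function names `spread_weight` and `fuzzy_distance_sq` and the sample numbers are hypothetical and not taken from the paper.

```python
def spread_weight(f, l_inv, n=10_000):
    """Approximate l = 2 * integral_0^1 f(lam) * l_inv(lam)**2 dlam (midpoint rule)."""
    step = 1.0 / n
    total = 0.0
    for i in range(n):
        lam = (i + 0.5) * step
        total += f(lam) * l_inv(lam) ** 2
    return 2.0 * step * total


def fuzzy_distance_sq(a, alpha, b, beta, l):
    """Squared distance between symmetric fuzzy numbers (a, alpha) and (b, beta).

    Assumed form: (a - b)^2 + l * (alpha - beta)^2.
    """
    return (a - b) ** 2 + l * (alpha - beta) ** 2


# Triangular case (assumption): L(x) = 1 - x, so L^{-1}(lam) = 1 - lam, and f(lam) = lam.
l_tri = spread_weight(lambda lam: lam, lambda lam: 1.0 - lam)

# Made-up centers and spreads of two fuzzy yields.
d2 = fuzzy_distance_sq(0.010, 0.004, 0.012, 0.007, l_tri)
```

In the triangular case with f(λ) = λ the integral can be evaluated by hand, 2∫₀¹ λ(1 − λ)² dλ = 1/6, so `l_tri` should agree with 1/6 up to quadrature error. For two crisp numbers (zero spreads) the distance reduces to the ordinary squared difference.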

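Proposition 2 (see Note 1). If $A, B, C \in F_L(\mathbb{R})$ and $k \in \mathbb{R}$, then:

1) $D(A + C, B + C) = D(A, B)$;

2) $D(kA, kB) = |k|\, D(A, B)$.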

Proposition 3 (see Note 2). The space of symmetric fuzzy numbers equipped with the distance (3) is a separable metric space.

3. The Estimation and Test of Model

Suppose that an interval series is given in which the lower and upper endpoints on the t-th day are respectively the lowest and highest prices of a financial product. The fuzzy financial time series is the fuzzy depiction of this interval series by symmetric fuzzy numbers: the center represents the average price of the financial product on the t-th day, and the series of centers depicts the trend of convergence of the price; the spread is the radius of the price fluctuations, and the series of spreads depicts the uncertainty of the price on the t-th day, namely the magnitude of non-random fluctuations. The series of yields derived from the fuzzy financial time series is called the fuzzy financial yields series. It can also be written as a series of symmetric fuzzy numbers, whose series of centers reflects the convergence of yields and is given by the following formula:

(4)

The series of spreads reflects the magnitude of non-random fluctuations and is expressed as follows:

(5)

3.1. Fuzzy Auto Regression (FAR) Model

If the fuzzy financial yields series is conditionally stationary, a p-th order fuzzy auto regression model [2] can be established from its lagged values:

(6)

where the coefficients are the autoregressive parameters and the error terms are the fuzzy errors on the t-th day.

The p-th order fuzzy auto regression model has two obvious limitations. First, the fuzzy auto regression (6) forces the centers and the spreads to follow the same trends. Second, formula (6) is equivalent to a combination of two ordinary autoregressive models, one for the time series of centers and one for the time series of spreads, both without constant terms. However, the constant term of a traditional autoregressive model can be omitted only for centralized data; in the common case it usually cannot be ignored.
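The first limitation can be seen in a small sketch. It assumes that model (6) is a linear combination of lagged fuzzy yields with real coefficients and uses the standard arithmetic of symmetric fuzzy numbers: sums add centers and spreads, and multiplying by a real k scales the center by k and the spread by |k|. The name `far_predict` and the numbers are hypothetical.

```python
def far_predict(phi, centers, spreads):
    """One-step prediction of an assumed FAR(p) model, ignoring the error term.

    phi      -- real autoregressive coefficients phi_1..phi_p
    centers  -- the p most recent centers, newest first
    spreads  -- the p most recent spreads, newest first
    """
    # Centers combine with the coefficients themselves ...
    c_hat = sum(p * c for p, c in zip(phi, centers))
    # ... while spreads combine with their absolute values, so both series
    # are driven by the same coefficients and neither has a constant term.
    s_hat = sum(abs(p) * s for p, s in zip(phi, spreads))
    return c_hat, s_hat


c_hat, s_hat = far_predict([0.5, -0.2], [0.01, 0.02], [0.004, 0.006])
```

Because one coefficient vector governs both recursions, the centers and the spreads cannot have separate dynamics, and neither equation has an intercept.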

3.2. Fuzzy Bilinear Regression (FBR) Model

The fuzzy bilinear regression (FBR) model in [4], set up separately for the centers and the spreads of the fuzzy financial yields series, resolves both problems:

(7)

where the coefficients are collected in unknown coefficient vectors and the two error terms belong to the series of centers and the series of spreads respectively on the t-th day.

Model (7) describes the autoregressive relationship between the convergence of fuzzy financial yields on the t-th day and the lagged values of the yields up to order p, as well as the autoregressive relationship between the fluctuations and their lagged values up to order p. Meanwhile, the model also expresses the interdependence between the fluctuations and the convergence of fuzzy financial yields. Under suitable restrictions on its coefficients, model (7) reduces in form to model (6). Therefore, on the one hand, model (7) formally generalizes model (6), and on the other hand, it improves the explanatory ability for fuzzy financial yields.

3.3. Fuzzy Varying Coefficient Bilinear Regression (FVCBR) Model

In the FAR and FBR models, the regression coefficients are constants, which implies that the explanatory variables affect the explained variables in a constant way throughout the sample period. In the study of financial asset yield prediction, however, the yields form a time series, and the level of influence of various factors changes across intervals. Consequently, linear models with constant coefficients are ill-suited to prediction, and varying coefficient models should be used. The FVCBR model takes the following form:

(8)

in which the model coefficients are functions of time t.

3.3.1. Coefficients Estimate of Fuzzy Varying Coefficient Bilinear Regression (FVCBR) Model

Because the number of unknown coefficients of a varying coefficient model grows with the sample size, fitting the model with traditional estimation methods is not appropriate. For this reason, restricted weighted least-squares estimation is widely used at present. Suppose that t0 is a given point in the domain of the variable t. The existing observations all provide information about the model coefficients around t0, but different observations contribute differently. Their importance is expressed by assigning each observation a weight, with the t-th weight corresponding to the t-th observation.

Kernel estimation is used to estimate the unknown coefficient functions. On the basis of the distance (3) and the principle of kernel smoothing in statistics, we formulate the following restricted weighted least-squares problem. That is, the objective function is

and it is minimized with respect to the coefficients at t0, where the weights are computed from a given kernel function and h is the smoothing parameter.

The restricted weighted least-squares problem is equivalent to minimizing the following equations:

(9)

Let, , , ,

, , ,

, ,

.

We assume here that the required inverse matrices exist for each t0. Then the solution of the weighted least-squares problem (9), that is, the estimates of the coefficient vectors of the fuzzy coefficients, can be written in matrix notation as

(10)

From (10) we can see the factor is independent of the unknown parameters.

If the observed values of the explanatory variables are known, we can obtain the fitted values of the explained variables at t0. Furthermore, by performing the above estimation procedure at every time point in turn, we can obtain the estimates of the explained variables over the whole study period.

(11)
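The local estimation step can be sketched as follows. This is only an illustration under stated assumptions: the FVCBR(1,1) form is taken to be c_t = a0(t) + a1(t) c_{t−1} + error for the centers and s_t = b0(t) + b1(t) s_{t−1} + d(t) c_t + error for the spreads, the fit is local-constant with Gaussian kernel weights, and because the distance (3) separates into a center term and a spread term, the two equations are fitted separately. All function names are hypothetical, and the small linear solver is a plain Gaussian elimination included to keep the sketch self-contained.

```python
import math


def gaussian_kernel(u):
    """Standard Gaussian kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def weighted_ls(X, y, w):
    """Minimise sum_t w[t] * (y[t] - X[t] . beta)^2 via the normal equations."""
    p, n = len(X[0]), len(y)
    xtwx = [[sum(w[t] * X[t][i] * X[t][j] for t in range(n)) for j in range(p)]
            for i in range(p)]
    xtwy = [sum(w[t] * X[t][i] * y[t] for t in range(n)) for i in range(p)]
    return solve(xtwx, xtwy)


def fit_fvcbr11_at(t0, centers, spreads, h, kernel=gaussian_kernel):
    """Local-constant coefficient estimates at time t0 (assumed model form).

    Returns ([a0, a1], [b0, b1, d]) for the center and spread equations.
    """
    ts = range(1, len(centers))
    w = [kernel((t - t0) / h) / h for t in ts]        # kernel weights around t0
    Xc = [[1.0, centers[t - 1]] for t in ts]          # intercept, lagged center
    Xs = [[1.0, spreads[t - 1], centers[t]] for t in ts]  # intercept, lagged spread, current center
    a = weighted_ls(Xc, [centers[t] for t in ts], w)
    b = weighted_ls(Xs, [spreads[t] for t in ts], w)
    return a, b
```

Calling `fit_fvcbr11_at` at each time point in turn gives coefficient paths of the kind plotted against time in the empirical section. A library least-squares routine could replace the hand-written solver; it is kept here only so the sketch has no dependencies.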

As in statistical nonparametric regression, two kinds of kernel functions are commonly used and one of them is Gaussian kernel:

(12)

and the other is Beta kernel [8] :

(13)
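For concreteness, the two kernel families can be written down as below. The sketch assumes the usual textbook forms: the standard Gaussian density, and the symmetric Beta family (1 − u²)₊^γ / Beta(1/2, γ + 1), whose members for γ = 1, 2, 3 are the Epanechnikov, biweight and triweight kernels. The function names are hypothetical.

```python
import math


def gaussian(u):
    """Gaussian kernel: the standard normal density."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def beta_family(u, gamma):
    """Symmetric Beta kernel (1 - u^2)_+^gamma / Beta(1/2, gamma + 1).

    gamma = 0: uniform, 1: Epanechnikov, 2: biweight, 3: triweight.
    """
    if abs(u) >= 1.0:
        return 0.0  # compact support on (-1, 1)
    # Beta(1/2, gamma + 1) expressed through Gamma functions.
    norm = math.gamma(0.5) * math.gamma(gamma + 1.0) / math.gamma(gamma + 1.5)
    return (1.0 - u * u) ** gamma / norm
```

The normalising constants reduce to the familiar 3/4, 15/16 and 35/32 for γ = 1, 2, 3. The Beta kernels vanish outside (−1, 1), whereas the Gaussian kernel gives every observation a positive weight.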

Here, we use the distance (3) to fuzzify the cross-validation procedure [8] from statistics for selecting the optimal value of the smoothing parameter h, that is, let

(14)

where the estimates of the centers and spreads of the fuzzy coefficients for a given h are obtained by deleting the t-th observation and computing the estimates according to the restricted weighted least-squares procedure described above. Thus,

(15)

Then select as the optimal value of the smoothing parameter the h that minimizes the cross-validation score (15).
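The bandwidth selection can be sketched as a leave-one-out loop. To stay short, the sketch swaps in a simple stand-in predictor, a Gaussian kernel average over time that omits the t-th observation, instead of the restricted weighted least-squares fit; only the structure of the score is meant to carry over. The score assumes that the squared distance has the form (c − ĉ)² + l(s − ŝ)², and the default l = 1/6 corresponds to triangular fuzzy numbers with f(λ) = λ. All names are hypothetical.

```python
import math


def loo_kernel_mean(values, t, h):
    """Gaussian-kernel average of values[j] for j != t, centred at time t.

    Stand-in predictor only; h should not be so small that all weights underflow.
    """
    num = den = 0.0
    for j, v in enumerate(values):
        if j == t:
            continue  # leave the t-th observation out
        w = math.exp(-0.5 * ((j - t) / h) ** 2)
        num += w * v
        den += w
    return num / den


def cv_score(h, centers, spreads, l=1.0 / 6.0):
    """Sum over t of the squared fuzzy distance between observation and LOO prediction."""
    total = 0.0
    for t in range(len(centers)):
        c_hat = loo_kernel_mean(centers, t, h)
        s_hat = loo_kernel_mean(spreads, t, h)
        total += (centers[t] - c_hat) ** 2 + l * (spreads[t] - s_hat) ** 2
    return total


def select_bandwidth(grid, centers, spreads, l=1.0 / 6.0):
    """Return the grid value of h with the smallest cross-validation score."""
    return min(grid, key=lambda h: cv_score(h, centers, spreads, l))
```

A coarse grid followed by a finer grid around the coarse minimum, as in the empirical section, amounts to calling `select_bandwidth` twice with different grids.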

3.3.2. The Test of the FVCBR Model

As pointed out in [4] , the FBR model can be used to analyze fuzzy financial yields series, but it is worth asking whether a model with constant coefficients is enough to capture the dynamic changes of yields, that is, whether the effects of the explanatory variables on the explained variables change significantly over time. For that reason, we need to test the hypothesis that the coefficients are constant, in which the hypothesized values are the coefficient estimates of the constant coefficient model.

Here we use the generalized likelihood ratio (GLR) test [9] . Let

(16)

(17)

Here the two quantities defined in (16) and (17) are the residual sums of squares under the null hypothesis and over the whole space respectively. The GLR statistic is

(18)

Then, the null distribution of the statistic can be approximated by the bootstrap method. The specific steps are as follows:

Step 1. Let, and a series of random numbers obeyed will be generated, that is,

where, and let;

Let, and a series of random numbers obeyed will be generated, that is,

where, and let;

Step 2. Use the sample data to construct GLR statistics;

Step 3. Repeat Step 1 and Step 2 m times to obtain m GLR statistics;

Step 4. The asymptotic distribution of the GLR statistic under the null hypothesis is approximated by the empirical distribution of the m bootstrap statistics, that is

(19)
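Once the m bootstrap statistics are available, the test decision only needs the empirical distribution from Step 4. The sketch below assumes that large values of the GLR statistic count against the null hypothesis, so the p-value is the share of bootstrap statistics at least as large as the observed one. It takes the statistics as given and does not reproduce Steps 1 and 2. The name `bootstrap_p_value` and the numbers are hypothetical.

```python
def bootstrap_p_value(t_obs, t_boot):
    """Share of bootstrap GLR statistics that are >= the observed statistic."""
    exceed = sum(1 for t in t_boot if t >= t_obs)
    return exceed / len(t_boot)


# Made-up bootstrap statistics for illustration only.
p = bootstrap_p_value(2.0, [1.0, 2.5, 0.5, 3.0])
```

With m replications the smallest non-zero p-value this estimator can return is 1/m, which is why a larger m is needed to support a smaller significance level.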

4. Empirical Analysis

Now we fit and forecast the fuzzy financial yields with the FVCBR model. The data set consists of 119 observations of the SSE Composite Index from July 17, 2014 to January 9, 2015. The observations form a series of intervals whose endpoints are the maximum and minimum of the SSE Composite Index at each time point, as shown in Figure 1. In order to obtain the yields series shown in Figure 2, we apply a logarithmic transformation to the fuzzy series and then take first-order differences. Further, we divide the data into two parts, the fitted samples and the forecast samples, and we test the empirical results and prediction abilities of the models in and out of the sample period.

4.1. The Varying Coefficient Bilinear Regression Model of Fuzzy Financial Yields Series

In Figure 1 we can see that the observations of the SSE Composite Index show an obvious increase over time. We then apply the runs test to the series of centers of the index. The results in column 2 of Table 1 reject the null hypothesis at the 5% significance level, which suggests that the series of centers is non-stationary; in other words, the sequence of the SSE Composite Index is non-stationary.

As can be seen in Figure 2, the sequence of yields has basically been detrended and ranges from about −0.04 to 0.04. We also apply the runs test to the centers of the yields series. The results in column 3 of Table 1 cannot reject the null hypothesis at the 5% significance level, that is, the sequence of centers is stationary, so we can infer that the fuzzy yields series is conditionally stationary.

Figure 1. The series of interval observations of the SSE Composite index (2014.7.17-2015.01.09).

Figure 2. The fuzzy yields series of the SSE Composite index (2014.7.17-2015.01.09).

Table 1. The fuzzy time series of the SSE Composite index/the run test of centers of fuzzy yields series.

Note: the test values are the averages of each sequence.
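A runs test with the series mean as the cut point, as indicated in the note to Table 1, can be sketched as follows. The sketch assumes the usual Wald-Wolfowitz form with a normal approximation and a two-sided p-value; the paper does not say which variant or software was used. The function name `runs_test` is hypothetical.

```python
import math


def runs_test(x):
    """Wald-Wolfowitz runs test about the mean (normal approximation).

    Returns (number of runs, z statistic, two-sided p-value).
    Assumes the series has values on both sides of its mean.
    """
    mean = sum(x) / len(x)
    above = [v >= mean for v in x]
    n1 = sum(above)               # observations at or above the mean
    n2 = len(above) - n1          # observations below the mean
    runs = 1 + sum(1 for u, v in zip(above, above[1:]) if u != v)
    mu = 2.0 * n1 * n2 / (n1 + n2) + 1.0
    var = (2.0 * n1 * n2 * (2.0 * n1 * n2 - n1 - n2)
           / ((n1 + n2) ** 2 * (n1 + n2 - 1.0)))
    z = (runs - mu) / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return runs, z, p
```

A trending series stays on one side of its mean for long stretches, so it produces few runs and a small p-value, while a series that fluctuates around its mean produces a number of runs close to the expected value.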

We can establish the FVCBR model for the conditionally stationary financial yields. Here we do not consider order determination and only model the fuzzy yields sequence of the SSE Composite Index with FVCBR (1,1), and then discuss the fitting effect.

(20)

We select the Epanechnikov kernel, biweight kernel, triweight kernel and Gaussian kernel for the analysis. For each kernel function, we use the cross-validation procedure to find the appropriate h.

Suppose that h is selected from 1 to 20 in steps of 0.2. Figure 3(a) illustrates how the CV values computed with each kernel function change with h. From Figure 3(a) we can see that the CV values of all kernel functions first decrease and then increase as h increases. Therefore, when h is chosen from 1 to 20, the optimal h is 2 for the Gaussian kernel and 4 for the Epanechnikov, biweight and triweight kernels. In order to refine the choice, we further let h vary from 1 to 5 in steps of 0.05; the relationship between the CV values and h is shown in Figure 3(b). The resulting optimal values of h for the four kernel functions are respectively 3.18, 3.82, 4.36 and 1.82.

For every kernel function, we select the optimal bandwidth and then obtain the estimates of the regression coefficients of model (20) at every point in time. Figure 4 shows the estimated regression coefficients for the different kernel functions and indicates that they vary similarly over time regardless of the kernel, that is, the choice of kernel function has little effect on the coefficient estimates. Based on this result, we use the Gaussian kernel as the kernel function in the rest of this article. As we can see from the estimates of model (20) and Figure 4, the spreads of fuzzy yields are positively correlated with their first-order lagged values, while the relationship between the centers of fuzzy yields and their first-order lagged values, as well as the relationship between the spreads and the current centers, is sometimes positive and sometimes negative as the coefficients change over time.

Figure 3. The changes of CV values of every kernel function with h increasing (a) h is from 1 to 20; (b) h is from 1 to 5.

Figure 4. The estimates of regression coefficients of model (20). (a); (b); (c); (d); (e).

Next, we test the hypothesis that the regression coefficients are constants, using the Gaussian kernel. Based on the data of the SSE Composite Index and formula (18), we obtain the generalized likelihood ratio (GLR) statistic T = 42.6781. When Step 1 and Step 2 of the bootstrap method are repeated 100 times, that is, m = 100, the resulting distribution of the GLR statistic is as shown in Figure 5(a); the p-value is lower than 0.01, so the null hypothesis is rejected at the 0.01 significance level. Similarly, when m = 1000, the distribution of the GLR statistic is as shown in Figure 5(b), and the p-value is lower than 0.001, so the null hypothesis is rejected at the 0.001 significance level. In conclusion, we reject the null hypothesis and hold that the regression coefficients change with time. As a result, the fuzzy varying coefficient regression model is a better choice.

4.2. Forecasting and Evaluation of Simulation

4.2.1. Forecasting

We forecast the real data from the 105th to the 119th observation by one-step-ahead prediction, using model (20) and formula (10). The results are shown in Table 2.

4.2.2. Evaluation of Simulation of the Model

In order to compare with the predictions of the FAR and FBR models, we calculate the absolute prediction errors of the FAR, FBR and FVCBR models and plot them as curves (see Figure 6).

As we can see from Figure 6(a), when predicting the centers, the absolute errors of the FVCBR model are slightly greater than those of the FAR and FBR models only in the 1st, 2nd, 5th and 9th periods; meanwhile, when predicting the spreads, the predicted values are very close to the observed values in every period except the 1st, 12th and 14th. So we preliminarily infer that the predictions of the FVCBR model are more precise than those of the FAR and FBR models.

To judge the fitting and forecasting performance of the three models more accurately, we use the mean square error (MSE) and the mean absolute error (MAE), which are often used in regression analysis, to evaluate the fitting effects and prediction accuracy. Specifically, we calculate the MSE and MAE of the financial yields in and out of the sample period respectively; the results are shown in Table 3.
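The two error measures are the standard ones and can be computed as in the sketch below. Whether the paper applies them to centers and spreads separately or to a combined quantity is not stated here, so the functions simply take one observed series and one predicted series. The names `mse` and `mae` and the numbers are illustrative.

```python
def mse(observed, predicted):
    """Mean square error between two equally long series."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def mae(observed, predicted):
    """Mean absolute error between two equally long series."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


# Made-up observed and predicted values for illustration only.
obs = [1.0, 2.0, 3.0]
pred = [1.0, 2.0, 5.0]
errors = (mse(obs, pred), mae(obs, pred))
```

The MSE penalises large individual errors more heavily than the MAE, so reporting both shows whether a model's advantage comes from avoiding a few large misses or from being closer on average.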


Figure 5. The asymptotic distribution of GLR statistic. (a) m = 100; (b) m = 1000.


Figure 6. The comparisons of predictions of FAR model, FBR model and FVCBR model. (a) The sequence of centers; (b) The sequence of spreads.

Table 2. The results of one-step-ahead prediction of fuzzy varying coefficient bilinear regression model (see Note 3).

Table 3. The errors measurements of fitting and forecasting of SSE composite index.

From Table 3 we can see that, both in and out of the sample period, the MSE and MAE of the FVCBR model are all lower than those of the FAR and FBR models. This result indicates that the fitting effects and prediction accuracy of the FVCBR model are superior to those of the FAR and FBR models.

5. Conclusion

This article introduces fuzzy financial yields series to deal with interval-valued observations in financial markets and constructs the fuzzy varying coefficient bilinear regression (FVCBR) model in a way that respects the practical meaning of financial yields. Based on the complete distance between fuzzy numbers, we develop the fuzzy least-squares method to estimate the unknown coefficients. The empirical analysis shows that, compared with the constant coefficient bilinear regression model and the fuzzy auto regression model, the varying coefficient model improves both fitting and prediction. The fuzzy auto regression model has certain limitations in fitting and forecasting because the autoregressions of the centers and of the spreads are considered independently, without taking into account the effect of the centers on the spreads. The FVCBR model handles financial fuzzy time series more flexibly, and its fitted values and predictions are intervals, which is more consistent with the description of real financial markets. Finally, this article only discusses the uncertainty of changes in financial asset prices, that is, it reflects the changes of yields only through fuzzy data and ignores the probability distribution of the parameters. Later research will therefore focus on coefficient estimation, statistical tests, model fitting and forecasting after introducing random errors, so as to give decision makers more perspectives for recognizing and explaining the changes of financial markets.

NOTES

Note 1. It is proved by Proposition 1.

Note 2. It is proved by the density of the set of rational numbers in the set of real numbers.

Note 3. The regression coefficients in one-step-ahead prediction are previous estimates.

Conflicts of Interest

The authors declare no conflicts of interest.

References

[1] Zadeh, L.A. (1965) Fuzzy Sets. Information and Control, 8, 338-353.
http://dx.doi.org/10.1016/S0019-9958(65)90241-X
[2] Li, Z.-Y., Zhang, C. and Wang, T.-J. (2010) Studying on the Interval Financial Times Series and Evaluating on the Forecast. Journal of Applied Statistics and Management, 29, 129-136.
[3] D’Urso, P. and Gastaldi, T. (2000) A Least-Squares Approach to Fuzzy Linear Regression Analysis. Computational Statistics and Data Analysis, 34, 427-440.
http://dx.doi.org/10.1016/S0167-9473(99)00109-7
[4] Li, Z.-Y., Liu, W.-Y. and Wang, T.-J. (2009) Fuzzy Bilinear Regression of Yields Series. Statistical Research, 26, 68-73.
[5] Wang, H.-D., Guo, S.-C. and Yue, L.-Z. (2014) An Approach to Fuzzy Multiple Linear Regression Mode Based on the Structured Element Theory. Systems Engineering—Theory & Practice, 34, 2628-2636.
[6] Hu, B.-Q. (2010) Foundations of Fuzzy Theory. Wuhan University Press, Wuhan, 103-114.
[7] Xu, R.N. (1991) A Linear Regression Model in Fuzzy Environment. Advances in Modelling Simulation, 27, 31-40.
[8] Fan, J.Q. and Yao, Q.W. (2003) Nonlinear Time Series. Springer, Berlin, 243-245.
[9] Cai, Z.W., Fan, J.Q. and Yao, Q.W. (2000) Functional-Coefficient Regression Models for Nonlinear Time Series. Journal of the American Statistical Association, 95, 941-956.
http://dx.doi.org/10.1080/01621459.2000.10474284

Copyright © 2024 by authors and Scientific Research Publishing Inc.

Creative Commons License

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.