Study on the Volatility of CSI 300 Index Returns Based on GARCH Modeling
1. Introduction
The CSI 300 Index was jointly launched by the Shanghai and Shenzhen stock exchanges on April 8, 2005, and is designed to reflect the overall performance of the A-share market (Liu, Wang, & Wu, 2017). It aims to reflect the overall pattern of stock price fluctuations in China’s securities market, to serve as a benchmark for evaluating investment performance, and to provide a foundation for index-based investing and index derivative innovation. The purpose of this study is to establish a GARCH model of CSI 300 Index returns over the past decade. Using CSI 300 return data from January 4, 2010, to December 31, 2020, excluding non-trading days, this study employs Eviews 10.0 to analyze the volatility of returns on trading days with a generalized autoregressive conditional heteroskedasticity (GARCH) model. Through iterative data preprocessing and model refinement, the model is fitted to the data so that errors are kept within a controllable range. Once established, the model can be used to forecast potential future trends of the CSI 300 Index. This makes it possible to provide investment recommendations to investors (Chen, 2020), helps government and banking regulators formulate corresponding policies in advance, and offers scholars a research methodology for analyzing the CSI 300 Index.
2. Empirical Analysis
2.1. Theoretical Analysis
The probability distribution of a stationary time series process does not depend on shifts in time. If an arbitrary set of random variables is taken from the sequence and shifted forward by h periods, its joint probability distribution remains unchanged (Yu, 2022).
The first-order autoregressive model, AR (1), is as follows:
$$y_t = \phi_0 + \phi_1 y_{t-1} + \varepsilon_t \quad (1)$$
When $|\phi_1| < 1$, the effect of the previous moment’s fluctuation on the present moment’s fluctuation becomes smaller and smaller. In this case the covariances and autocorrelation coefficients of the model are fixed values that do not depend on t, i.e., the series is stationary.
For higher-order AR models, this paper uses a similar analytical approach (Guan, 2021). For example, the AR (p) model is as follows:
$$y_t = \phi_0 + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \cdots + \phi_p y_{t-p} + \varepsilon_t \quad (2)$$
For this model, when the moduli $|\lambda_1|, |\lambda_2|, \dots, |\lambda_p|$ of the characteristic roots are all less than 1, the series is considered stationary; if any $|\lambda_i|$ is greater than or equal to 1, the series is considered non-stationary.
To determine whether the model is stationary, that is, whether all $|\lambda_i|$ are less than 1, the characteristic equation of the AR model is used:
$$\lambda^p - \phi_1 \lambda^{p-1} - \phi_2 \lambda^{p-2} - \cdots - \phi_p = 0 \quad (3)$$
This equation has p roots, $\lambda_1, \dots, \lambda_p$.
Testing whether a particular AR series is stationary therefore means testing whether any root of this equation has modulus greater than or equal to 1.
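The root check described above can be sketched for the AR (2) case. This is a minimal illustration rather than the procedure used in the paper: the function names and the coefficient values are assumptions, and the characteristic equation is taken in the form λ² − φ₁λ − φ₂ = 0, so that stationarity requires both roots to have modulus below 1.

```python
import cmath


def ar2_characteristic_roots(phi1, phi2):
    """Roots of lambda^2 - phi1*lambda - phi2 = 0 (AR(2) characteristic equation)."""
    disc = cmath.sqrt(phi1 ** 2 + 4 * phi2)
    return (phi1 + disc) / 2, (phi1 - disc) / 2


def ar2_is_stationary(phi1, phi2):
    """Stationary when every characteristic root lies strictly inside the unit circle."""
    return all(abs(root) < 1 for root in ar2_characteristic_roots(phi1, phi2))


# Illustrative coefficients (assumed, not estimated from the CSI 300 data).
print(ar2_is_stationary(0.5, 0.3))  # both roots have modulus below 1
print(ar2_is_stationary(0.6, 0.5))  # one root has modulus above 1
```

Using `cmath` keeps the check valid when the roots are complex, in which case the modulus rather than the real part decides stationarity.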
2.2. Empirical Design
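The ARCH model takes the following form.
Mean equation:
$$y_t = x_t'\gamma + \varepsilon_t, \qquad \varepsilon_t \mid I_{t-1} \sim N(0, \sigma_t^2).$$
Variance equation:
$$\sigma_t^2 = \alpha_0 + \alpha_1 \varepsilon_{t-1}^2.$$
The variance of $\varepsilon_t$ in the above equation depends only on the disturbance term of the immediately preceding moment, which is the ARCH (1) process. If the dependence is generalized to the disturbance terms of the previous q moments, the ARCH (q) process is obtained (Ji & Yang, 2021):
$$\sigma_t^2 = \alpha_0 + \sum_{i=1}^{q} \alpha_i \varepsilon_{t-i}^2 \quad (4)$$
In these equations, $y_t$ represents the explained variable at moment t, $x_t$ the explanatory variables with coefficient vector $\gamma$, $\varepsilon_t$ is the disturbance term at moment t, $I_{t-1}$ is the information set available up to moment t − 1, and $\sigma_t^2$ is the conditional variance of $\varepsilon_t$. It is generally accepted that when $\alpha_0$ is greater than 0, each $\alpha_i$ is non-negative, and $\sum_{i=1}^{q} \alpha_i$ is less than 1 ($i = 1, \dots, q$), the ARCH process is stationary. In this model, $\varepsilon_t$ follows an ARCH (q) process. The ARCH model can reflect the volatility clustering of the stock market (Tang & Ju, 2020).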
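The generalized autoregressive conditional heteroskedasticity (GARCH) model introduces lagged values of the conditional variance itself into the equation that determines the current-period conditional variance $\sigma_t^2$ of $\varepsilon_t$. The GARCH (p, q) model is defined as follows, where $x_t'\gamma$ is the conditional mean of $y_t$:
$$y_t = x_t'\gamma + \varepsilon_t \quad (5)$$
$$\sigma_t^2 = \alpha_0 + \sum_{i=1}^{q} \alpha_i \varepsilon_{t-i}^2 + \sum_{j=1}^{p} \beta_j \sigma_{t-j}^2 \quad (6)$$
The closer the sum of the model coefficients, i.e., $\sum_{i=1}^{q} \alpha_i + \sum_{j=1}^{p} \beta_j$, is to 1, the more persistent the volatility of the entire series. When the sum is greater than 1, the effects of a shock persist and spread. When the sum is less than 1, the impact of a shock gradually dissipates (Hu & Xing, 2022). The simplest GARCH model is the standard GARCH (1, 1) model:
$$y_t = x_t'\gamma + \varepsilon_t \quad (7)$$
$$\sigma_t^2 = \alpha_0 + \alpha_1 \varepsilon_{t-1}^2 + \beta_1 \sigma_{t-1}^2 \quad (8)$$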
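To make the GARCH (1, 1) variance recursion concrete, the following sketch simulates shocks whose conditional variance follows σ²ₜ = α₀ + α₁ε²ₜ₋₁ + β₁σ²ₜ₋₁. It is only an illustration under stated assumptions: the function name `simulate_garch11`, the parameter values, the standard normal innovations and the choice to start the recursion at the unconditional variance α₀ / (1 − α₁ − β₁) are not taken from the paper.

```python
import math
import random


def simulate_garch11(alpha0, alpha1, beta1, n, seed=0):
    """Simulate n shocks from a GARCH(1,1) process with standard normal innovations."""
    if alpha1 + beta1 >= 1:
        raise ValueError("alpha1 + beta1 must be below 1 for a finite unconditional variance")
    rng = random.Random(seed)
    # Start the recursion at the unconditional variance alpha0 / (1 - alpha1 - beta1).
    sigma2 = alpha0 / (1 - alpha1 - beta1)
    shocks, variances = [], []
    for _ in range(n):
        eps = math.sqrt(sigma2) * rng.gauss(0.0, 1.0)
        shocks.append(eps)
        variances.append(sigma2)
        # Variance recursion: sigma2_t = alpha0 + alpha1 * eps_{t-1}^2 + beta1 * sigma2_{t-1}
        sigma2 = alpha0 + alpha1 * eps ** 2 + beta1 * sigma2
    return shocks, variances


# Illustrative parameter values (assumed for this sketch).
shocks, variances = simulate_garch11(alpha0=2e-6, alpha1=0.08, beta1=0.90, n=1000)
print(len(shocks), min(variances) > 0)
```

Because α₁ + β₁ is close to 1 in this example, a large shock raises the conditional variance for many subsequent periods, which is the volatility clustering the model is meant to capture.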
2.3. Data Sources and Pre-Processing
The sample consists of the daily closing prices of the CSI 300 Index from January 4, 2010 to December 31, 2020, and the source of the data is Invesco.com. Returns are obtained by taking the logarithmic difference of the daily closing prices:
$$R_t = \ln P_t - \ln P_{t-1} \quad (9)$$
where $R_t$ represents the return of the CSI 300, with 2676 data points, and $P_t$ is the corresponding closing price.
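The logarithmic-difference transformation can be sketched in a few lines. The helper name `log_returns` and the closing prices below are hypothetical and serve only to show the calculation; note that n closing prices yield n − 1 returns.

```python
import math


def log_returns(prices):
    """Return ln(P_t) - ln(P_{t-1}) for consecutive closing prices."""
    if any(p <= 0 for p in prices):
        raise ValueError("closing prices must be positive")
    return [math.log(curr) - math.log(prev) for prev, curr in zip(prices, prices[1:])]


# Hypothetical closing prices, not actual CSI 300 quotes.
closes = [3500.0, 3535.0, 3499.65]
print(log_returns(closes))
```

Log returns are additive over time, so the sum of the daily values equals the log return over the whole window.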
2.4. Descriptive Statistics Analysis
The mean of the logarithmic return of the CSI 300 Index in the sample is 0.000145, the median is 0.000345, the maximum is 0.064989, the minimum is −0.091544, and the standard deviation is 0.014608. The skewness is −0.693074, so the series is left-skewed, and the kurtosis is 7.918043, which exceeds the value of 3 for the normal distribution. The distribution of the data therefore shows “sharp peaks and thick tails” with left skewness. The Jarque-Bera (JB) normality test gives a statistic of 2910.015, which is significant, indicating that the distribution of CSI 300 Index returns differs significantly from the normal distribution. On this basis, this paper proceeds to the stationarity test and the subsequent steps.
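The JB statistic can be checked against the reported skewness and kurtosis with the standard formula JB = n/6 · (S² + (K − 3)²/4). In the sketch below the function name is hypothetical and the sample size n = 2675 is an assumption (one fewer than the number of closing prices); with that value the formula gives a result close to the reported statistic.

```python
def jarque_bera(n, skewness, kurtosis):
    """Jarque-Bera statistic: n/6 * (S^2 + (K - 3)^2 / 4)."""
    return n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)


# Skewness and kurtosis as reported above; n = 2675 is an assumption.
jb = jarque_bera(n=2675, skewness=-0.693074, kurtosis=7.918043)
print(round(jb, 1))
```

Most of the statistic comes from the excess-kurtosis term, which matches the emphasis on thick tails in the text.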
2.5. The Process of Empirical Analysis
2.5.1. Stationarity and Unit Root Tests
The results of the unit root test for the return series Rt, obtained with Eviews 10, are shown in the table below. The null hypothesis of a unit root is rejected at the 5% level of significance, i.e., the series is stationary, with no time trend or random trend, so mean modeling can be carried out (Liao, 2023). The specific results are shown in Table 1:
Table 1. Unit root test.
| | ADF value | Coefficient P-value | Significance of the coefficient | Overall P* |
|---|---|---|---|---|
| Rt (c, t) | −50.36075 | 0.1739 (t) | insignificant | 0.0000 |
| Rt (c) | −50.33441 | 0.6241 (c) | insignificant | 0.0001 |
| Rt (none) | −50.33918 | 0.0000 | significant | 0.0000 |
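The idea behind the “none” specification in Table 1 can be illustrated with a simplified Dickey-Fuller regression, Δyₜ = ρ·yₜ₋₁ + eₜ, whose t-statistic is compared with a critical value. This sketch is a simplification and not the Eviews procedure: it omits the lagged difference terms of the augmented test, the function name is hypothetical, the input is synthetic white noise rather than the CSI 300 data, and −1.95 is used as the approximate 5% critical value for the no-constant case.

```python
import math
import random


def dickey_fuller_t(series):
    """t-statistic of rho in diff(y)_t = rho * y_{t-1} + e_t (no constant, no lagged differences)."""
    lagged = series[:-1]
    diffs = [b - a for a, b in zip(series, series[1:])]
    sxx = sum(x * x for x in lagged)
    rho = sum(x * d for x, d in zip(lagged, diffs)) / sxx
    ssr = sum((d - rho * x) ** 2 for x, d in zip(lagged, diffs))
    sigma2 = ssr / (len(diffs) - 1)  # residual variance with one estimated parameter
    return rho / math.sqrt(sigma2 / sxx)


# Synthetic white noise standing in for a return series (not the CSI 300 data).
rng = random.Random(1)
noise = [rng.gauss(0.0, 0.0146) for _ in range(500)]
t_stat = dickey_fuller_t(noise)
# Approximate 5% Dickey-Fuller critical value for the no-constant case.
print(t_stat < -1.95)
```

A strongly negative statistic, as in Table 1, leads to rejection of the unit root hypothesis.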
2.5.2. Autocorrelation Test
Table 2. Autocorrelation test.
| Lag | AC | PAC | Q-Stat | Prob |
|---|---|---|---|---|
| 1 | 0.026 | 0.026 | 1.8562 | 0.173 |
| 2 | −0.034 | −0.034 | 4.8925 | 0.087 |
| 3 | 0.021 | 0.022 | 6.0219 | 0.111 |
| 4 | 0.021 | 0.019 | 7.2489 | 0.123 |
| 5 | 0.014 | 0.014 | 7.7685 | 0.169 |
| 6 | −0.067 | −0.067 | 19.827 | 0.003 |
| 7 | 0.047 | 0.051 | 25.754 | 0.001 |
| 8 | 0.031 | 0.023 | 28.311 | 0.000 |
| 9 | 0.033 | 0.038 | 31.256 | 0.000 |
| 10 | −0.019 | −0.019 | 32.191 | 0.000 |
| 11 | −0.024 | −0.022 | 33.767 | 0.000 |
| 12 | 0.011 | 0.003 | 34.113 | 0.000 |
| 13 | 0.051 | 0.056 | 41.219 | 0.000 |
| 14 | −0.053 | −0.055 | 48.803 | 0.000 |
| 15 | −0.008 | 0.002 | 48.975 | 0.000 |
| 16 | 0.035 | 0.023 | 52.264 | 0.000 |
| 17 | 0.010 | 0.006 | 52.515 | 0.000 |
| 18 | −0.007 | −0.003 | 52.646 | 0.000 |
| 19 | 0.006 | 0.017 | 52.759 | 0.000 |
| 20 | 0.062 | 0.049 | 63.199 | 0.000 |
| 21 | 0.025 | 0.023 | 64.878 | 0.000 |
| 22 | −0.029 | −0.026 | 67.217 | 0.000 |
| 23 | −0.059 | −0.056 | 76.533 | 0.000 |
| 24 | −0.013 | −0.016 | 76.968 | 0.000 |
| 25 | 0.027 | 0.020 | 78.886 | 0.000 |
| 26 | −0.038 | −0.034 | 82.801 | 0.000 |
| 27 | −0.035 | −0.026 | 86.201 | 0.000 |
| 28 | 0.057 | 0.047 | 94.992 | 0.000 |
| 29 | 0.019 | 0.003 | 95.924 | 0.000 |
| 30 | −0.013 | −0.001 | 96.43 | 0.000 |
| 31 | −0.047 | −0.035 | 102.53 | 0.000 |
| 32 | −0.038 | −0.043 | 106.49 | 0.000 |
| 33 | 0.046 | 0.039 | 112.34 | 0.000 |
| 34 | 0.005 | 0.014 | 112.42 | 0.000 |
| 35 | −0.004 | 0.004 | 112.47 | 0.000 |
| 36 | −0.007 | −0.008 | 112.59 | 0.000 |
From Table 2, it can be seen that from lag 6 onward all p-values are less than 0.05, so this paper concludes that the series is serially autocorrelated, which is a precondition for the subsequent modeling. Neither the ACF nor the PACF shows an obvious cut-off; both tail off, which is consistent with the characteristics of an ARMA model (Liu & Qian, 2023).
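The Q-Stat column of Table 2 is the cumulative Ljung-Box statistic, Q(m) = n(n + 2) Σₖ ρₖ² / (n − k), summed over lags k = 1, …, m. A minimal sketch of that calculation follows; the function names and the toy series are assumptions used only for illustration.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation at the given lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    num = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    return num / denom


def ljung_box_q(series, max_lag):
    """Cumulative Ljung-Box Q statistics for lags 1..max_lag."""
    n = len(series)
    q_values, running = [], 0.0
    for k in range(1, max_lag + 1):
        running += autocorrelation(series, k) ** 2 / (n - k)
        q_values.append(n * (n + 2) * running)
    return q_values


# Toy series (assumed for illustration only).
print(ljung_box_q([1.0, 2.0, 3.0, 4.0, 5.0], max_lag=2))
```

Because Q accumulates over lags, a single sizeable autocorrelation (such as the one at lag 6 in Table 2) keeps the later p-values small even when the individual coefficients at those lags are minor.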
2.6. ARMA Modeling
In this paper, an ARMA (p, q) model is constructed for the CSI 300 Index, using the first-order difference of the logarithm of the index. To make efficient use of the data, eight models are fitted: ARMA (2, 2), ARMA (2, 1), ARMA (1, 2), ARMA (1, 1), AR (2), AR (1), MA (2) and MA (1). Only the ARMA (1, 1) and ARMA (2, 2) models pass the significance test for all coefficients. This paper then compares the AIC, SC, and HQC values to determine the final optimal model.
Table 3. Comparison of AIC, SC, HQC.
| Model | AIC | SC | HQC |
|---|---|---|---|
| ARMA (1, 1) | −5.618596 | −5.614191 | −5.617002 |
| ARMA (2, 2) | −5.616713 | −5.607903 | −5.613525 |
As shown in Table 3, the ARMA (1, 1) model has an AIC value of −5.618596 and an SC value of −5.614191, while the ARMA (2, 2) model has an AIC value of −5.616713 and an SC value of −5.607903. Based on the information criteria, this paper selects the model with the smaller values as the optimal model.
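The selection rule can be written out explicitly. The sketch below uses the values reported in Table 3 and picks, for each criterion, the model with the smallest value; the dictionary layout and the function name are assumptions made for illustration.

```python
# Information criteria as reported in Table 3 (smaller is better).
criteria = {
    "ARMA (1, 1)": {"AIC": -5.618596, "SC": -5.614191, "HQC": -5.617002},
    "ARMA (2, 2)": {"AIC": -5.616713, "SC": -5.607903, "HQC": -5.613525},
}


def best_by_each_criterion(table):
    """Map each criterion name to the model with the smallest value."""
    names = next(iter(table.values())).keys()
    return {name: min(table, key=lambda model: table[model][name]) for name in names}


winners = best_by_each_criterion(criteria)
print(winners)
```

When the criteria disagree, SC penalizes additional parameters more heavily than AIC, so reporting the winner per criterion keeps any such disagreement visible.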
From the above table, the ARMA (1, 1) model attains the minimum on all three information criteria, so ARMA (1, 1) is finally chosen as the mean model. An LM autocorrelation test is then performed on the residual series of the ARMA (1, 1) model, as shown in Table 4:
Table 4. Model autocorrelation test.
Breusch-Godfrey Serial Correlation LM Test
| F-statistic | 0.072418 | Prob. F (1, 2672) | 0.7879 |
| Obs*R-squared | 0.072498 | Prob. Chi-Square (1) | 0.7877 |
From Table 4, it can be seen that the fitted ARMA (1, 1) model passes the LM test: the null hypothesis of no serial correlation is not rejected at the 10% level of significance, and the same conclusion is obtained at other lag orders. ARMA (1, 1) is therefore identified as the final mean model.
2.7. Test for ARCH Effect
The test for the ARCH effect examines whether the residual series is heteroskedastic, i.e., whether the squared residual series is autocorrelated. The autocorrelation and partial autocorrelation plots of the squared residual series show strong autocorrelation (Liu, Yang, & Shen, 2019).
The ARCH-LM test is then carried out, and the results are shown in Table 5.
From Table 5, the null hypothesis is rejected at the 5% level of significance, so the model has an ARCH effect.
Table 5. ARCH effect test.
Heteroskedasticity Test: ARCH
| F-statistic | 91.689280 | Prob. F (1, 2672) | 0.0000 |
| Obs*R-squared | 88.713710 | Prob. Chi-Square (1) | 0.0000 |
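The one-lag ARCH-LM statistic reported as Obs*R-squared can be sketched as follows: the squared residuals are regressed on a constant and their own first lag, and T·R² is compared with a chi-square distribution with one degree of freedom. The function names are hypothetical, and the p-value uses the identity P(χ²₁ > x) = erfc(√(x/2)), which holds only for one degree of freedom.

```python
import math


def arch_lm_lag1(residuals):
    """LM statistic T * R^2 from regressing e_t^2 on a constant and e_{t-1}^2."""
    sq = [e * e for e in residuals]
    x, y = sq[:-1], sq[1:]
    t = len(y)
    mx, my = sum(x) / t, sum(y) / t
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    r_squared = sxy ** 2 / (sxx * syy)  # with one regressor, R^2 is the squared correlation
    return t * r_squared


def chi2_df1_pvalue(stat):
    """Upper-tail probability of a chi-square(1) variable: erfc(sqrt(stat / 2))."""
    return math.erfc(math.sqrt(stat / 2.0))


# Obs*R-squared value as reported in Table 5.
print(chi2_df1_pvalue(88.713710) < 0.05)
```

Tests with more lags need the general chi-square distribution, which this shortcut does not cover.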
2.8. GARCH Modeling
In this paper, models of each order are fitted, starting from p = 1, q = 1; because the lag order should not be too large, the highest lag order considered is 2. After several attempts with GARCH (p, q) models, namely GARCH (1, 1), GARCH (1, 2), GARCH (2, 1) and GARCH (2, 2) under the normal distribution, the t-distribution and the generalized error distribution, the GARCH (1, 1) model under the generalized error distribution was ultimately chosen.
The inspection process is as follows:
In the GARCH (1, 1) model:
Table 6. GARCH (1, 1) model.
GARCH = C(3) + C(4) * RESID(−1)^2 + C(5) * GARCH(−1)
| Variable | Coefficient | Std. Error | z-Statistic | Prob. |
|---|---|---|---|---|
| AR (1) | 0.0183 | 0.9964 | 0.018439 | 0.9853 |
| MA (1) | 0.019647 | 0.994726 | 0.19752 | 0.9842 |
| Variance Equation | | | | |
| C | 0.0001 | 3.9E−05 | 2.5948 | 0.0095 |
| RESID(−1)^2 | 0.0550 | 0.008038 | 3.500223 | 0.0005 |
| GARCH(−1) | 0.5999 | 0.105036 | 5.711617 | 0.0000 |

| R-squared | 0.0005 | Mean dependent var | 0.0001 |
| Adjusted R-squared | 0.0001 | S.D. dependent var | 0.0146 |
| S.E. of regression | 0.0146 | Akaike info criterion | −5.4804 |
| Sum squared resid | 0.5702 | Schwarz criterion | −5.4693 |
| Log likelihood | 7332.315 | Hannan-Quinn criter. | −5.4764 |
| Durbin-Watson stat | 2.019693 | | |
| Inverted AR Roots | 0.02 | | |
| Inverted MA Roots | −0.02 | | |

Heteroskedasticity Test: ARCH
| F-statistic | 0.6484 | Prob. F (1, 2671) | 0.4207 |
| Obs*R-squared | 0.6487 | Prob. Chi-Square (1) | 0.4205 |

Heteroskedasticity Test: ARCH
| F-statistic | 8.2310 | Prob. F (3, 2671) | 0.0000 |
| Obs*R-squared | 40.6203 | Prob. Chi-Square (3) | 0.0000 |

Heteroskedasticity Test: ARCH
| F-statistic | 7.8838 | Prob. F (20, 2633) | 0.0000 |
| Obs*R-squared | 76.8805 | Prob. Chi-Square (20) | 0.0000 |
In the GARCH (1, 1) model, the coefficients of the variance equation passed the significance test; however, higher-order ARCH effects remained (the ARCH tests at lags 3 and 20 reject the null hypothesis), so this model was discarded.
In the T-GARCH (1, 1) model, estimated under the t-distribution:
Table 7. T-GARCH (1, 1) model.
GARCH = C(3) + C(4) * RESID(−1)^2 + C(5) * GARCH(−1)
| Variable | Coefficient | Std. Error | z-Statistic | Prob. |
|---|---|---|---|---|
| AR (1) | 0.01499 | 1.5449 | 0.0096 | 0.9923 |
| MA (1) | 0.01598 | 1.5436 | 0.010354 | 0.9917 |
| Variance Equation | | | | |
| C | 0.0021 | 0.000103 | 2.0647 | 0.0390 |
| RESID(−1)^2 | 0.1500 | 0.070559 | 2.1258 | 0.0335 |
| GARCH(−1) | 0.5998 | 0.182038 | 3.2952 | 0.0010 |
| T-DIST. DOF | 19.79064 | 14.32650 | 1.3814 | 0.1627 |

| R-squared | 0.0006 | Mean dependent var | 0.0001 |
| Adjusted R-squared | 0.0002 | S.D. dependent var | 0.0146 |
| S.E. of regression | 0.0146 | Akaike info criterion | −5.2907 |
| Sum squared resid | 0.5701 | Schwarz criterion | −5.2774 |
| Log likelihood | 7079.686 | Hannan-Quinn criter. | −5.2859 |
| Durbin-Watson stat | 2.0059 | | |
| Inverted AR Roots | 0.01 | | |
| Inverted MA Roots | −0.02 | | |
As shown in Table 7, some of the coefficients are not significant (AR (1), MA (1) and the t-distribution degrees-of-freedom parameter), so this model is discarded.
In the E-GARCH (1, 1) model, estimated under the generalized error distribution:
Table 8. E-GARCH (1, 1) model.
GARCH = C(3) + C(4) * RESID(−1)^2 + C(5) * GARCH(−1)
| Variable | Coefficient | Std. Error | z-Statistic | Prob. |
|---|---|---|---|---|
| AR (1) | −0.8838 | 0.0643 | −13.7450 | 0.0000 |
| MA (1) | 0.9067 | 0.0577 | 15.7080 | 0.0000 |
| Variance Equation | | | | |
| C | 0.0000 | 4.77E−07 | 2.5948 | 0.0095 |
| RESID(−1)^2 | 0.0550 | 0.008038 | 6.8535 | 0.0000 |
| GARCH(−1) | 0.9401 | 0.007974 | 117.9038 | 0.0000 |
| GED PARAMETER | 1.1639 | 0.039375 | 29.5616 | 0.0000 |

| R-squared | 0.0048 | Mean dependent var | 0.0001 |
| Adjusted R-squared | 0.0044 | S.D. dependent var | 0.0146 |
| S.E. of regression | 0.0145 | Akaike info criterion | −5.9342 |
| Sum squared resid | 0.5678 | Schwarz criterion | −5.9210 |
| Log likelihood | 7940.0840 | Hannan-Quinn criter. | −5.9294 |
| Durbin-Watson stat | 1.9828 | | |
| Inverted AR Roots | −0.88 | | |
| Inverted MA Roots | −0.91 | | |

Heteroskedasticity Test: ARCH
| F-statistic | 0.6723 | Prob. F (1, 2671) | 0.4123 |
| Obs*R-squared | 0.6726 | Prob. Chi-Square (1) | 0.4121 |

Heteroskedasticity Test: ARCH
| F-statistic | 0.6918 | Prob. F (3, 2671) | 0.5570 |
| Obs*R-squared | 2.0769 | Prob. Chi-Square (3) | 0.5566 |

Heteroskedasticity Test: ARCH
| F-statistic | 0.5255 | Prob. F (20, 2633) | 0.9576 |
| Obs*R-squared | 10.553 | Prob. Chi-Square (20) | 0.9570 |

Heteroskedasticity Test: ARCH
| F-statistic | 0.7248 | Prob. F (25, 2633) | 0.8364 |
| Obs*R-squared | 18.1760 | Prob. Chi-Square (25) | 0.8348 |
As shown in Table 8, all coefficients are statistically significant, and the ARCH tests no longer reject the null hypothesis of no remaining ARCH effect at any of the lag orders examined. This paper concludes that the model can be used to fit the return volatility of the CSI 300 Index.
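The final result obtained is:
$$R_t = -0.8838\,R_{t-1} + \varepsilon_t + 0.9067\,\varepsilon_{t-1}$$
$$\sigma_t^2 = \hat{\alpha}_0 + 0.0550\,\varepsilon_{t-1}^2 + 0.9401\,\sigma_{t-1}^2$$
where the coefficients are those of Table 8 and the constant $\hat{\alpha}_0$ is reported there as 0.0000 to four decimal places.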
To check the fit of the model, the residuals are tested for remaining ARCH effects, as shown in Table 9.
Table 9. Heteroscedasticity test.
Heteroskedasticity Test: ARCH
| F-statistic | 0.6723 | Prob. F (1, 2671) | 0.4123 |
| Obs*R-squared | 0.6726 | Prob. Chi-Square (1) | 0.4121 |
As shown in Table 9, and consistent with the ARCH tests in Table 8, the model no longer exhibits heteroskedasticity.
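The persistence implied by the selected model can be read off the variance-equation coefficients reported in Table 8. The sketch below computes α₁ + β₁ and, as an additional illustrative measure not used in the paper, the half-life of a variance shock, ln(0.5) / ln(α₁ + β₁); the function names are hypothetical.

```python
import math


def persistence(alpha1, beta1):
    """Sum of the ARCH and GARCH coefficients of a GARCH(1,1) variance equation."""
    return alpha1 + beta1


def shock_half_life(alpha1, beta1):
    """Periods needed for a variance shock to decay to half its size (requires 0 < persistence < 1)."""
    rho = persistence(alpha1, beta1)
    if not 0 < rho < 1:
        raise ValueError("half-life is defined only for 0 < alpha1 + beta1 < 1")
    return math.log(0.5) / math.log(rho)


# Coefficients as reported in Table 8: RESID(-1)^2 = 0.0550, GARCH(-1) = 0.9401.
rho = persistence(0.0550, 0.9401)
print(round(rho, 4), round(shock_half_life(0.0550, 0.9401)))
```

A sum below but close to 1 means that shocks to volatility die out, though only slowly, which agrees with the interpretation of the coefficient sum given in Section 2.2.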
3. Conclusion and Recommendations
This empirical study reveals the following findings: 1) Characteristics of the return distribution: The mean of the index returns is close to zero, and the returns exhibit pronounced volatility. The skewness is non-zero and the kurtosis exceeds 3, indicating that the return series is non-normally distributed, with sharp peaks and fat tails. The ADF test confirms the stationarity of the series, while the LM test and the GARCH (1, 1) model demonstrate significant ARCH effects, verifying the phenomenon of volatility clustering. 2) Asymmetry effect analysis: Model comparisons reveal that both the EGARCH (1, 1) and TGARCH (1, 1) models exhibit significant leverage effects on log-differenced returns. Negative news triggers substantially stronger volatility than positive news, closely reflecting investor behaviour patterns (Su, 2022). During market downturns, heightened risk aversion among investors leads to concentrated selling, amplifying volatility. Conversely, rising markets see increased risk appetite accompanied by underreaction. In addition, this study’s interpretation of the market mechanisms that generate disorderly return fluctuations points to the immaturity of China’s securities market, manifested in: 1) a brief developmental history (the index was officially launched in 2005); 2) an imperfect trading system; 3) a retail-dominated investor structure prone to excessive speculation; and 4) room for improvement in the effectiveness of the risk pricing mechanism.
Based on this, corresponding regulatory policy recommendations are proposed: 1) Refine the information disclosure mechanism: enhance market transparency and establish a real-time data disclosure platform; 2) Optimise the regulatory framework: clarify the powers and responsibilities of regulatory bodies and establish cross-departmental coordination mechanisms (Wang, 2022); 3) Improve the risk pricing mechanism: define the division of responsibilities within the risk pricing mechanism and establish cross-departmental coordination mechanisms; 4) Cultivate institutional investors: guide long-term capital into the market through policies such as tax incentives to optimise the structure of market participants. On this basis, the development of fundamental market systems should be strengthened and investor behaviour guided, which is the key pathway to promoting the healthy development of China’s financial markets. Regulators need to seek a dynamic balance between risk prevention and market vitality, driving the securities market towards maturity.
This study utilises only daily closing data for the CSI 300 Index from 4 January 2010 to 31 December 2020, which inherently carries limitations (lack of intraday information, potential structural discontinuities, etc.). Future research is advised to construct quantitative trading strategies based on high-frequency data, such as intraday momentum strategies or mean reversion strategies. Empirical validation of these strategies’ efficacy could further uncover intraday trading patterns and characteristics within the CSI 300 Index. Incorporating sector-level data (e.g., industry growth rates, sector profit margins) and firm-level data (e.g., financial statement metrics, corporate governance structures) pertaining to CSI 300 constituents would enrich subsequent investigations.