Portfolio Research Based on Mean-Realized Variance-CVaR and Random Matrix Theory under High-Frequency Data

Abstract

In this paper, random matrix theory is employed to perform information selection and denoising, and mean-realized variance-CVaR multi-objective portfolio models before and after denoising are constructed for high-frequency data. The empirical study is based on high-frequency data from stocks in the SSE 180 Index. Compared with the existing literature, the main contribution of this paper is the introduction of both the realized covariance matrix and random matrix theory into the multi-objective portfolio problem. The results show that using the realized covariance matrix can reduce the loss of market information, and that random matrix theory can help improve the quality of the information contained in the correlation matrix among assets. Under the denoised mean-realized variance-CVaR criterion, the new portfolio selection has better out-of-sample performance.

Share and Cite:

Yang, Y. , Zhu, Y. and Zhao, X. (2020) Portfolio Research Based on Mean-Realized Variance-CVaR and Random Matrix Theory under High-Frequency Data. Journal of Financial Risk Management, 9, 480-493. doi: 10.4236/jfrm.2020.94026.

1. Introduction

The mean-variance (MV) model proposed by Markowitz (1952) opened a new chapter in modern portfolio theory, and many scholars have since worked to extend and deepen it. Kolm et al. (2014) summarized its development, challenges and future directions. In Markowitz’s MV model, the mean and the variance measure the average return and the risk of a portfolio, respectively. How the variance is calculated depends on the characteristics of the data. According to the sampling frequency, data can be divided into low-frequency and high-frequency data. High-frequency trading data, sampled over short time spans, reduce the loss of information about the financial market, and they have become easier to obtain with the rapid development of technology. Research on investment strategies based on high-frequency data is therefore both necessary and significant. The realized variance can be calculated from the realized volatility proposed by Andersen & Bollerslev (1998). The literature on realized volatility (variance) is rich, and most of it is devoted to its modification, extension and application to high-frequency financial data. Scholars have also introduced the realized variance (covariance) into asset allocation research, e.g., Pooter et al. (2008), Yao (2010), Song & Hu (2017) and Yin (2016).

The portfolio studies with realized variance cited above consider only one risk factor (variance). Since different risk measures describe different risk characteristics of assets, scholars have combined multiple measures to construct multi-objective portfolio optimization models. Early examples are the mean-absolute deviation-skewness model and the mean-variance-skewness model in Konno et al. (1993, 1995). Owing to the excellent properties of CVaR, Roman et al. (2007) constructed a mean-variance-CVaR model, which can result in a more balanced portfolio. Li et al. (2012) and Yu & Ma (2014) used this model to study China’s foreign exchange reserves and sovereign fund investment, respectively. Gao et al. (2016) extended it to dynamic situations in the financial market, and Shi et al. (2019) considered the optimal investment and reinsurance problem in continuous time. However, these studies all use low-frequency data, and the high-frequency case remains to be explored.

In addition, with the increasing complexity and diversity of financial markets, Laloux et al. (1999) and Plerou et al. (1999) first applied random matrix theory (RMT) to the stock market, demonstrating the existence of “noise” in the asset correlation matrix and its effect on portfolio strategy. Later, RMT was used in financial risk management to improve the quality of financial market information, for example by Han et al. (2014), Xie et al. (2018), Bun et al. (2017) and Shen et al. (2019). Li & Hong (2019) studied the stability of the network before and after “denoising” based on random matrix theory, together with the efficient frontier of the portfolio under the mean-variance model.

In summary, this paper constructs a mean-realized variance-CVaR portfolio model and discusses the influence of the denoising technique and the realized covariance matrix on the optimal multi-objective strategy. The paper is organized as follows. Section 2 describes the related methods. Section 3 presents the dataset, the empirical procedure and the out-of-sample performance of the different portfolio strategies. Finally, Section 4 concludes the paper.

2. Methods

For convenience, we first give some notations.

$\Sigma_1$: Realized covariance matrix; $\Sigma_2$: Covariance matrix

$\tilde{\Sigma}_1$: Realized covariance matrix after denoising of $\Sigma_1$; $\tilde{\Sigma}_2$: Covariance matrix after denoising of $\Sigma_2$

$\Delta_1$: Diagonal matrix of realized standard deviations; $\Delta_2$: Diagonal matrix of standard deviations

$C_1$: Asset correlation matrix based on $\Sigma_1$; $C_2$: Pearson correlation coefficient matrix

$\tilde{C}_1$: Asset correlation matrix after denoising of $C_1$; $\tilde{C}_2$: Asset correlation matrix after denoising of $C_2$

2.1. Realized Covariance Matrix

We consider the price process of $N$ financial assets, $P(t) = (P_1(t), \ldots, P_N(t))^T$, where $P_j(t)$ represents the price of asset $j$ at time $t$. The logarithmic price vector is

$$q(t) = (\log[P_1(t)], \ldots, \log[P_N(t)])^T. \tag{1}$$

The return vector over a period of length $a$ ending at time $t + a$ is

$$R(t + a, a) = q(t + a) - q(t). \tag{2}$$

Assume that the time period from $t$ to $t + 1$ is divided into $m$ segments, so that the asset return on the $k$-th segment is $R(t + k/m, 1/m) = q(t + k/m) - q(t + (k-1)/m)$, $k = 1, 2, \ldots, m$. The return matrix from time $t$ to $t + 1$ is then

$$H_{t,t+1} = (R(t + 1/m, 1/m), \ldots, R(t + m/m, 1/m)). \tag{3}$$

The realized covariance matrix $\Sigma_1$ is therefore defined as

$$\Sigma_1 = H_{t,t+1} H_{t,t+1}^T, \tag{4}$$

where each main diagonal element of the realized covariance matrix is the realized variance of the corresponding asset.
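As an illustration of formulas (1) to (4), the following minimal sketch computes a realized covariance matrix from a short series of intraday prices. The function name `realized_covariance` and the toy prices are hypothetical and are not taken from the paper's dataset.

```python
import math

def realized_covariance(log_prices):
    """Realized covariance from m+1 intraday log-price vectors (each of length N)."""
    n_assets = len(log_prices[0])
    # Intraday returns: differences of consecutive log-price vectors, as in (2).
    returns = [
        [curr[j] - prev[j] for j in range(n_assets)]
        for prev, curr in zip(log_prices, log_prices[1:])
    ]
    # Sigma_1 = H H^T, i.e. the sum of outer products of the return vectors, as in (4).
    sigma = [[0.0] * n_assets for _ in range(n_assets)]
    for r in returns:
        for a in range(n_assets):
            for b in range(n_assets):
                sigma[a][b] += r[a] * r[b]
    return sigma

# Toy example (hypothetical numbers): two assets, three intraday prices each.
prices = [[100.0, 50.0], [101.0, 49.5], [100.5, 50.5]]
log_prices = [[math.log(p) for p in row] for row in prices]
sigma_1 = realized_covariance(log_prices)
```

The diagonal entries of `sigma_1` are the sums of squared intraday returns, that is, the realized variances of the two assets.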

2.2. Random Matrix Theory Denoising

2.2.1. Noise Detection

A random matrix is expressed as:

$$Z = \frac{1}{L} A A^T, \tag{5}$$

where $A$ is an $N \times L$ matrix composed of $N$ uncorrelated random variables with sequence length $L$, each sequence following the $N(0, 1)$ distribution. Based on Wigner (1951), for the window width $Q = L/N$ $(> 1)$, the predicted maximum and minimum eigenvalues of the random matrix can be expressed as

$$\lambda_{\max/\min}^{Z} = \sigma_Z^2 \left( 1 + \frac{1}{Q} \pm 2\sqrt{\frac{1}{Q}} \right), \tag{6}$$

where $\sigma_Z^2$ is the variance of $Z$, and $\sigma_Z^2 = 1$ for a standardized matrix.
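A minimal sketch of formula (6) follows, assuming a standardized matrix by default. The function name `rmt_eigenvalue_bounds` and the sizes in the example are hypothetical.

```python
import math

def rmt_eigenvalue_bounds(n_assets, n_obs, sigma2=1.0):
    """Predicted (min, max) eigenvalues of a random correlation matrix, formula (6)."""
    q = n_obs / n_assets  # window width Q = L / N
    if q <= 1:
        raise ValueError("formula (6) is stated for Q = L / N > 1")
    lam_min = sigma2 * (1 + 1 / q - 2 * math.sqrt(1 / q))
    lam_max = sigma2 * (1 + 1 / q + 2 * math.sqrt(1 / q))
    return lam_min, lam_max

# Example with hypothetical sizes: N = 100 assets, L = 400 observations, so Q = 4.
lam_min, lam_max = rmt_eigenvalue_bounds(n_assets=100, n_obs=400)
# For Q = 4 the bounds are 1 + 1/4 - 1 = 0.25 and 1 + 1/4 + 1 = 2.25.
```

Eigenvalues of an empirical correlation matrix that fall inside this interval are the ones treated as indistinguishable from noise.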

Based on Kenett et al. (2009), the eigenvalue entropy (SE) of a random matrix is an effective tool to evaluate the information contained in the eigenvalues. It is defined as

$$SE = -\frac{1}{\log(N)} \sum_{j=1}^{N} \frac{[\lambda_j - \lambda_{j-1}]^2}{\sum_{j}^{N} [\lambda_j - \lambda_{j-1}]^2} \log \left\{ \frac{[\lambda_j - \lambda_{j-1}]^2}{\sum_{j}^{N} [\lambda_j - \lambda_{j-1}]^2} \right\}, \tag{7}$$

where $\lambda$ ($\lambda_j > \lambda_{j-1}$) represents the eigenvalues of the matrix. A smaller SE means less noise information, which shows that more economic information is contained in the eigenvalues, and vice versa.

2.2.2. Denoising Method

For an $N \times N$ asset correlation matrix $C$ ($C_1$ or $C_2$),

$$C = E D E^T, \tag{8}$$

where $D$ is the diagonal matrix formed by the eigenvalues $\{\gamma_i\}_{i=1}^{N}$, and $E$ is the eigenvector matrix of $C$. Let $A = \{\gamma_i \mid \gamma_i \in [\lambda_{\min}^{Z}, \lambda_{\max}^{Z}],\ i = 1, \ldots, N\}$; then $A$ is the noise set. Here the PG+ denoising method is employed: all elements of the set $A$ are replaced by 0 to construct the new diagonal matrix $\tilde{D}$. The denoised asset correlation matrix $\tilde{C}$ ($\tilde{C}_1$ or $\tilde{C}_2$) can then be expressed as

$$\tilde{C} = E \tilde{D} E^T. \tag{9}$$

We set the diagonal elements of $\tilde{C}$ to 1 to ensure that $Tr(C) = Tr(\tilde{C}) = N$.

The covariance matrix $\Sigma$ ($\Sigma_1$ or $\Sigma_2$) and the asset correlation matrix $C$ ($C_1$ or $C_2$) satisfy the relationship

$$\Sigma = \Delta C \Delta^T, \tag{10}$$

where $\Delta = \Delta_1$ (or $\Delta_2$) represents the diagonal matrix formed by the standard deviation of each asset. The denoised covariance matrix $\tilde{\Sigma}$ can therefore be obtained from $\tilde{C}$.
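The denoising steps in formulas (8) to (10) can be sketched with NumPy as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `denoise_covariance` is hypothetical, the input is simulated data, and the noise band is the one from formula (6) with unit variance.

```python
import numpy as np

def denoise_covariance(sigma, lam_min, lam_max):
    """Zero out eigenvalues of the correlation matrix inside [lam_min, lam_max]."""
    std = np.sqrt(np.diag(sigma))              # diagonal of Delta
    corr = sigma / np.outer(std, std)          # C from Sigma = Delta C Delta^T
    eigvals, eigvecs = np.linalg.eigh(corr)    # C = E D E^T, formula (8)
    noise = (eigvals >= lam_min) & (eigvals <= lam_max)
    cleaned = np.where(noise, 0.0, eigvals)    # eigenvalues in the noise set become 0
    corr_dn = eigvecs @ np.diag(cleaned) @ eigvecs.T   # formula (9)
    np.fill_diagonal(corr_dn, 1.0)             # restore the unit diagonal
    return np.outer(std, std) * corr_dn        # back to a covariance, formula (10)

# Simulated example (hypothetical data): 5 assets driven by one common factor.
rng = np.random.default_rng(0)
n_assets, n_obs = 5, 50
common = rng.standard_normal(n_obs)
returns = 0.01 * (common + 0.5 * rng.standard_normal((n_assets, n_obs)))
sigma = np.cov(returns)

q = n_obs / n_assets
lam_min = 1 + 1 / q - 2 * np.sqrt(1 / q)
lam_max = 1 + 1 / q + 2 * np.sqrt(1 / q)
sigma_dn = denoise_covariance(sigma, lam_min, lam_max)
```

Because the diagonal of the denoised correlation matrix is reset to 1, the variances in `sigma_dn` equal those in `sigma`, and only the off-diagonal structure changes.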

2.3. Mean-Realized Variance-CVaR Optimization Model

Suppose that $R = (R_1, R_2, \ldots, R_N)^T$ is the return vector of $N$ assets, and $x = (x_1, x_2, \ldots, x_N)^T$ is the weight vector. The variance of the cumulative return of the portfolio is $Var(x^T R) = x^T M x$, where $M$ may be the realized covariance matrix $\Sigma_1$, the realized covariance matrix after noise reduction ($\tilde{\Sigma}_1$) or the covariance matrix after noise reduction ($\tilde{\Sigma}_2$).

Based on Roman et al. (2007), we study the following problem:

$$
\begin{aligned}
(P) \quad \min \ & Var(x^T R) = x^T M x \\
\text{s.t.} \ & x^T \mu = d, \\
& CVaR(x^T R) = z, \\
& x^T \mathbf{1} = 1, \\
& x_j \ge 0, \quad j = 1, \ldots, N,
\end{aligned} \tag{11}
$$

where $\mu$ is the vector of expected asset returns, $d$ represents the investor’s target return rate, $z$ represents the control level of CVaR, $x^T \mathbf{1} = 1$ is the full-investment weight constraint, and $x_j \ge 0$ means that short-selling is not permitted. The specific determination of the parameters $d$ and $z$ is shown in Appendix A.

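To show the impact of random matrix theory and realized variance on investment strategies, three optimization models are compared in this paper, which differ only in the matrix $M$ used in model (11): MRVC takes $M = \Sigma_1$, MRVC (denoise) takes $M = \tilde{\Sigma}_1$, and MVC (denoise) takes $M = \tilde{\Sigma}_2$; see Table 1.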

Table 1. Optimization models.

2.4. Model Evaluation

The average return is defined as

$$\text{Average return} = E[R_{out} \times x], \tag{12}$$

where $R_{out}$ represents the out-of-sample data, and $x$ represents the optimal investment weight.

The Omega Ratio (OR), proposed by Keating & Shadwick (2002), is defined as

$$OR = \frac{\int_{\varepsilon}^{\infty} (1 - F(x))\,dx}{\int_{-\infty}^{\varepsilon} F(x)\,dx} = \frac{E[R_{out} \times x - \varepsilon]^{+}}{E[\varepsilon - R_{out} \times x]^{+}}, \tag{13}$$

where $F(x)$ represents the cumulative distribution function of portfolio returns and $\varepsilon$ is a specified threshold. Returns below the threshold are considered losses and returns above it gains. For convenience of calculation, $\varepsilon = 0$ is assumed (Clemente et al., 2019). An investor will prefer the portfolio with the highest ratio.
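The two evaluation measures in formulas (12) and (13) can be computed from a sample of out-of-sample returns as in the sketch below. The helper names and the toy numbers are hypothetical; expectations are replaced by sample averages.

```python
def portfolio_returns(r_out, weights):
    """Per-period portfolio return: each row of r_out holds one period's asset returns."""
    return [sum(w * r for w, r in zip(weights, row)) for row in r_out]

def average_return(returns):
    """Sample version of formula (12)."""
    return sum(returns) / len(returns)

def omega_ratio(returns, threshold=0.0):
    """Sample version of formula (13): expected gains over expected losses."""
    gains = sum(max(r - threshold, 0.0) for r in returns)
    losses = sum(max(threshold - r, 0.0) for r in returns)
    if losses == 0.0:
        return float("inf")  # no return falls below the threshold
    return gains / losses

# Toy out-of-sample data (hypothetical): three periods, two assets, equal weights.
r_out = [[0.02, -0.01], [-0.01, 0.00], [0.03, 0.01]]
weights = [0.5, 0.5]
rp = portfolio_returns(r_out, weights)   # approximately [0.005, -0.005, 0.02]
avg = average_return(rp)
omega = omega_ratio(rp)                  # gains 0.025 over losses 0.005
```

With the threshold set to 0, as assumed in the text, the ratio here is 5: total gains are five times total losses.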

3. Empirical Study

3.1. Dataset Description

The data come from the Shanghai Stock Exchange 180 (SSE 180) Index, which consists of 180 stocks that best represent China’s A-share market. The five-minute return data of 120 stocks are collected from July 1, 2019 to August 10, 2019, and their five-minute logarithmic returns are calculated. The data from July 1, 2019 to July 31, 2019 are used as in-sample data and the rest as out-of-sample data.

3.2. Empirical Procedure

The empirical study will be processed according to the following procedure.

Step 1: calculating the realized covariance matrix $\Sigma_1$ based on formulas (1) to (4).

Step 2: “noise” detection. The noise information in the asset correlation matrix $C$ ($C_1$ or $C_2$) and the random matrix $Z$ is analyzed by the eigenvalue entropy (SE) based on formulas (5), (6) and (7).

Step 3: constructing the denoised covariance matrix $\tilde{\Sigma}$ ($\tilde{\Sigma}_1$ or $\tilde{\Sigma}_2$) according to formulas (8), (9) and (10).

Step 4: calculating the optimal asset weights under MRVC (denoise), MRVC and MVC (denoise) based on model (11).

(1) Assume that $d_p$ is the median of the average returns of all assets, and determine the interval $[d_{\min}, d_{\max}]$ by formulas (14) to (18). $d$ takes the $\frac{1}{6}$, $\frac{2}{6}$, $\frac{3}{6}$ and $\frac{4}{6}$ quantile values of this interval, denoted as $d_1$, $d_2$, $d_3$ and $d_4$.

(2) Find the optimal asset weights $x_j^*$ under the mean-variance model based on the $d$ from the previous step.

(3) Based on the given $d$ and $x_j^*$, the interval $[z_{d,\min}, z_{d,\max}]$ of $z$ is determined by formulas (19) and (20) for $\alpha = 0.01$. $z$ takes the $\frac{1}{4}$, $\frac{2}{4}$ and $\frac{3}{4}$ quantile values of the interval $[z_{d,\min}, z_{d,\max}]$ and the value $z_{d,\max}$, denoted as $z_1$, $z_2$, $z_3$ and $z_4$.

(4) The problems $P_{mrvc}$, $P_{mrvc}^{rmt}$ and $P_{mvc}^{rmt}$ with the given $d$ and $z$ are solved with the CVX toolkit in Matlab, which yields the optimal solution $x$.

Step 5: the in-sample optimal weight of assets with three models are obtained from step 1 to step 4. Further, the average returns and OR values for out-of-sample dataset are calculated by formulas (12) and (13).
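The parameter grid in Step 4 can be sketched as follows, assuming that the "quantile values" of an interval are its evenly spaced fractional points. The interval endpoints below are hypothetical placeholders; in the procedure they come from problems (16) to (20), and the interval for $z$ is recomputed for each value of $d$.

```python
def interval_points(lower, upper, fractions):
    """Points at the given fractions of the way from lower to upper."""
    return [lower + f * (upper - lower) for f in fractions]

# Hypothetical interval for the target return d.
d_min, d_max = 0.0002, 0.0014
d_grid = interval_points(d_min, d_max, [k / 6 for k in range(1, 5)])   # d_1 .. d_4

# Hypothetical interval for the CVaR level z, for one fixed d.
z_min, z_max = 0.010, 0.030
z_grid = interval_points(z_min, z_max, [k / 4 for k in range(1, 4)]) + [z_max]  # z_1 .. z_4
```

Crossing the four values of $d$ with the four values of $z$ gives the sixteen parameter pairs evaluated in the empirical results.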

3.3. Empirical Results

3.3.1. Characteristic Analysis of Asset Correlation Matrix

We calculate the asset correlation matrix $C$ ($C_1$ or $C_2$) and detect its noise following Step 1 and Step 2; the results are shown in Table 2 and Table 3.

Table 2. Characteristic analysis of asset correlation matrix and random matrix.

Table 3. SE of asset correlation matrix and random matrix.

Notes: the symbol “A” means that all eigenvalues are considered while “B” for removing the maximum 7 eigenvalues.

From Table 2, we find that the maximum (minimum) eigenvalue 44.32 (0.01) of matrix $C_1$ and its corresponding random matrix’s maximum (minimum) eigenvalue 2.43 (0.19) are greater (smaller) than $C_2$’s maximum (minimum) eigenvalue 20.84 (0.25) and its corresponding random matrix’s maximum (minimum) eigenvalue 1.77 (0.45). This tells us that matrix $C_1$ has a smaller noise interval. Meanwhile, comparing $C_1$ with $C_2$, the percentage of noise in $C_1$ is smaller, which means that matrix $C_1$ contains more useful economic information.

It can be seen from Table 3 that the SE of the asset correlation matrix $C$ ($C_1$ or $C_2$) is much smaller than that of its corresponding random matrix, which means that $C$ ($C_1$ or $C_2$) contains more economic information than its random matrix. After removing the eigenvalues greater than $\lambda_{\max}^{Z}$ in $C_1$ and $C_2$ respectively, SE rises sharply, which indicates that removing the larger eigenvalues might reduce the information in the asset correlation matrix. Therefore, we only replace eigenvalues less than 5 with 0 when the PG+ method is used.

3.3.2. Out-of-Sample Performance of Optimal Asset Allocation

Based on Step 3 to Step 4 in Section 3.2, we can obtain the optimal investment strategy under each model with different parameters, and the average return and OR values are shown in Table 4.

The following results could be found from Table 4.

1) Under every constraint pair $(d, z)$ on the mean and CVaR, both the average return and the OR of MRVC (denoise) are higher than those of MVC (denoise), which means that introducing the realized covariance matrix for high-frequency data provides more effective market information and leads to more appropriate investment decisions.

2) Compared with the MRVC model, the average return and OR of MVC (denoise) are improved in most cases, which tells us that the use of random matrix theory can indeed improve portfolio performance to some extent. The performance of the MRVC (denoise) model is also sensitive to the choice of the parameters $d$ and $z$.

To further understand out-of-sample performance of each model under different parameter, we plot the cumulative return with the optimal portfolio weights, see Figure 1.

From Figure 1 we can draw the following conclusions.

1) For every pair $(d, z)$, MRVC (denoise) performs best and MRVC performs worst. This shows that combining the realized covariance matrix with random matrix theory in the optimization model improves portfolio performance.

2) There is little difference among three models when the market fluctuates slightly in the early stage. However, MRVC (denoise) begins to highlight its superiority when the market fluctuates sharply.

3) At a fixed return target, the superiority of MRVC (denoise) gradually increases with the relaxation of constraint on risk CVaR.

Table 4. Out-of-sample performance of optimal investment strategy.

Figure 1. Cumulative return graph of investment period after (before) denoising. Panels: one for each parameter pair $(d_i^*, z_k^*)$, $i, k = 1, \ldots, 4$.

4. Conclusion

This paper studies a multi-objective investment strategy based on mean-realized variance-CVaR and random matrix theory for high-frequency data. Compared with Roman et al. (2007), the innovation of this paper is the introduction of the realized covariance matrix and random matrix theory into the optimization problem (Clemente et al., 2019). Compared with Li & Hong (2019), this paper considers CVaR and variance simultaneously as risk control factors. To a certain extent, the new model can better deal with the high-frequency, noisy and thick-tailed character of financial market data. The empirical study finds that the noise percentage in the asset correlation matrix based on the realized covariance matrix is significantly reduced, so that this matrix carries more effective information. The out-of-sample performance of MRVC (denoise) is significantly better than that of the other two models, which tells us that the use of the realized covariance matrix and random matrix theory might help to improve the information quality and effectiveness of high-frequency data in the investment problem. Because of length limitations, this paper only considers five-minute return data of 120 stocks; the relationship between different high-frequency data, the denoising effect and the covariance matrix estimator is a direction for future research.

Acknowledgments

This work was partially supported by National Natural Science Foundation of China under Grant no. 71671104 and 11971301.

Appendix A

Based on the mean-variance-CVaR model in Roman et al. (2007), $CVaR_{1-\alpha}$ can be written as follows:

$$CVaR_{1-\alpha} = \frac{1}{\alpha} \sum_{i=1}^{T} p_i \Big[ -v - \sum_{j=1}^{N} x_j r_{ij} \Big]^{+} + v = \frac{1}{\alpha} \sum_{i=1}^{T} p_i y_i + v = z. \tag{14}$$

Thus formula (11) can be rewritten as $P_1$:

$$
\begin{aligned}
(P_1) \quad \min \ & x^T M x \\
\text{s.t.} \ & \sum_{j=1}^{N} x_j \mu_j = d, \\
& \frac{1}{\alpha} \sum_{i=1}^{T} p_i y_i + v = z, \\
& y_i \ge -v - \sum_{j=1}^{N} x_j r_{ij}, \quad i = 1, \ldots, T, \\
& y_i \ge 0, \quad i = 1, \ldots, T, \\
& \sum_{j=1}^{N} x_j = 1, \\
& x_j \ge 0, \quad j = 1, \ldots, N.
\end{aligned} \tag{15}
$$

Here $v$ is the value of $VaR_{1-\alpha}$, $p_i$ represents the probability of the return rate $R_x^i$ at time $i$, $R_x^i = \sum_{j=1}^{N} x_j r_{ij}$, $r_{ij}$ represents the return rate of asset $j$ at time $i$, and $\mu_j$ is the expected return rate of asset $j$.
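For a fixed portfolio, the CVaR expression above can be evaluated directly from a sample of portfolio returns. The sketch below assumes the Rockafellar-Uryasev form, in which losses are negative returns and CVaR is the minimum over $v$ of $v + \frac{1}{\alpha} \sum_i p_i [-v - R_x^i]^{+}$. The function name `cvar` and the toy returns are hypothetical, and the tail level 0.3 is chosen only to keep the arithmetic readable.

```python
def cvar(returns, alpha, probs=None):
    """Return (CVaR, v) for sampled portfolio returns at tail probability alpha."""
    t = len(returns)
    if probs is None:
        probs = [1.0 / t] * t  # equally likely scenarios

    def objective(v):
        # v + (1/alpha) * sum_i p_i * [-v - R_i]^+
        excess = sum(p * max(-v - r, 0.0) for p, r in zip(probs, returns))
        return excess / alpha + v

    # The objective is convex and piecewise linear in v, so a minimizer
    # lies at one of its kinks v = -R_i.
    candidates = [-r for r in returns]
    best_v = min(candidates, key=objective)
    return objective(best_v), best_v

# Toy example (hypothetical): five equally likely portfolio returns, alpha = 0.3.
sample = [-0.05, -0.02, 0.00, 0.01, 0.03]
z, v = cvar(sample, alpha=0.3)
# The worst 30% of outcomes are 20% at a loss of 0.05 and 10% at a loss of 0.02,
# so CVaR = (0.2 * 0.05 + 0.1 * 0.02) / 0.3 = 0.04, attained at v = 0.02.
```

In problem $P_1$ the same expression is linearized through the auxiliary variables $y_i$, which lets a convex solver handle $x$, $y$ and $v$ jointly.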

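To ensure that $P_1$ has a feasible solution, $d$ and $z$ need to lie within certain ranges, that is, $d \in [d_{\min}, d_{\max}]$ and $z \in [z_{d,\min}, z_{d,\max}]$, where $d_{\min} = \max\{d_{\min}^{var}, d_{\min}^{cvar}\}$. $d_{\min}^{var}$ is determined by $P_2$:

$$
\begin{aligned}
(P_2) \quad \min \ & x^T M x \\
\text{s.t.} \ & \sum_{j=1}^{N} x_j \mu_j = d_p, \\
& \sum_{j=1}^{N} x_j = 1, \\
& x_j \ge 0, \quad j = 1, \ldots, N.
\end{aligned} \tag{16}
$$

Solving $P_2$ gives $x_j^{1}$, and thus $d_{\min}^{var} = \sum_{j=1}^{N} x_j^{1} \mu_j$.

$d_{\min}^{cvar}$ is determined by $P_3$:

$$
\begin{aligned}
(P_3) \quad \min \ & \frac{1}{\alpha} \sum_{i=1}^{T} p_i \Big[ -v - \sum_{j=1}^{N} x_j r_{ij} \Big]^{+} + v \\
\text{s.t.} \ & \sum_{j=1}^{N} x_j \mu_j = d_p, \\
& \sum_{j=1}^{N} x_j = 1, \\
& x_j \ge 0, \quad j = 1, \ldots, N.
\end{aligned} \tag{17}
$$

Solving $P_3$ gives $x_j^{2}$, and thus $d_{\min}^{cvar} = \sum_{j=1}^{N} x_j^{2} \mu_j$.

$d_{\max}$ is determined by $P_4$:

$$
\begin{aligned}
(P_4) \quad \min \ & -\sum_{j=1}^{N} x_j \mu_j \\
\text{s.t.} \ & \sum_{j=1}^{N} x_j = 1, \\
& x_j \ge 0, \quad j = 1, \ldots, N.
\end{aligned} \tag{18}
$$

Solving $P_4$ gives $x_j^{3}$, and thus $d_{\max} = \sum_{j=1}^{N} x_j^{3} \mu_j$.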

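Here $z_{d,\min}$ is determined by the model $P_5$:

$$
\begin{aligned}
(P_5) \quad \min \ & \frac{1}{\alpha} \sum_{i=1}^{T} p_i \Big[ -v - \sum_{j=1}^{N} x_j r_{ij} \Big]^{+} + v \\
\text{s.t.} \ & \sum_{j=1}^{N} x_j \mu_j = d^{*}, \\
& \sum_{j=1}^{N} x_j = 1, \\
& x_j \ge 0, \quad j = 1, \ldots, N.
\end{aligned} \tag{19}
$$

Solving $P_5$ gives $x_j^{4}$ and $v^{1}$, and thus $z_{d,\min} = \frac{1}{\alpha} \sum_{i=1}^{T} p_i \big[ -v^{1} - \sum_{j=1}^{N} x_j^{4} r_{ij} \big]^{+} + v^{1}$.

$z_{d,\max}$ is determined by $P_6$:

$$
\begin{aligned}
(P_6) \quad \min \ & \frac{1}{\alpha} \sum_{i=1}^{T} p_i \Big[ -v - \sum_{j=1}^{N} x_j^{*} r_{ij} \Big]^{+} + v \\
\text{s.t.} \ & \sum_{j=1}^{N} x_j = 1, \\
& x_j \ge 0, \quad j = 1, \ldots, N.
\end{aligned} \tag{20}
$$

Solving $P_6$ gives $v^{2}$, and thus $z_{d,\max} = \frac{1}{\alpha} \sum_{i=1}^{T} p_i \big[ -v^{2} - \sum_{j=1}^{N} x_j^{*} r_{ij} \big]^{+} + v^{2}$. Here $x^{*} = (x_1^{*}, \ldots, x_N^{*})^T$ is the optimal portfolio weight vector of the mean-variance model when the mean constraint is $\sum_{j=1}^{N} x_j \mu_j = d^{*}$.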

Conflicts of Interest

The authors declare no conflicts of interest regarding the publication of this paper.

References

[1] Andersen, T. G., & Bollerslev, T. (1998). Answering the Skeptics: Yes, Standard Volatility Models Do Provide Accurate Forecasts. International Economic Review, 39, 885-905.
https://doi.org/10.2307/2527343
[2] Bun, J., Bouchaud, J. P., & Potters, M. (2017). Cleaning Large Correlation Matrices: Tools from Random Matrix Theory. Physics Reports, 666, 1-109.
https://doi.org/10.1016/j.physrep.2016.10.005
[3] Clemente, G. P., Grassi, R., & Hitaj, A. (2019). Asset Allocation: New Evidence through Network Approaches. Annals of Operations Research, 1-20.
https://doi.org/10.1007/s10479-019-03136-y
[4] Gao, J., Xiong, Y., & Li, D. (2016). Dynamic Mean-Risk Portfolio Selection with Multiple Risk Measures in Continuous-Time. European Journal of Operational Research, 249, 647-656.
https://doi.org/10.1016/j.ejor.2015.09.005
[5] Han, H., Wu, L. Y., & Song, N. N. (2014). Financial Network Model Based on Random Matrix. Acta Physica Sinica, 63, 901-901.
[6] Keating, C., & Shadwick, W. F. (2002). A Universal Performance Measure. Journal of Performance Measurement, 6, 59-84.
[7] Kenett, D. Y., Shapira, Y., & Ben-Jacob, E. (2009). RMT Assessments of the Market Latent Information Embedded in the Stocks’ Raw, Normalized, and Partial Correlations. Journal of Probability and Statistics, 2009, Article ID: 249370.
https://doi.org/10.1155/2009/249370
[8] Kolm, P. N., Tütüncü, R., & Fabozzi, F. J. (2014). 60 Years of Portfolio Optimization: Practical Challenges and Current Trends. European Journal of Operational Research, 234, 356-371.
https://doi.org/10.1016/j.ejor.2013.10.060
[9] Konno, H., & Suzuki, K. I. (1995). A Mean-Variance-Skewness Portfolio Optimization Model. Journal of the Operations Research Society of Japan, 38, 173-187.
https://doi.org/10.15807/jorsj.38.173
[10] Konno, H., Shirakawa, H., & Yamazaki, H. (1993). A Mean-Absolute Deviation-Skewness Portfolio Optimization Model. Annals of Operations Research, 45, 205-220.
https://doi.org/10.1007/BF02282050
[11] Laloux, L., Cizeau, P., Bouchaud, J. P., & Potters, M. (1999). Noise Dressing of Financial Correlation Matrices. Physical Review Letters, 83, 1467.
https://doi.org/10.1103/PhysRevLett.83.1467
[12] Li, J., Huang, H., & Xiao, X. (2012). The Sovereign Property of Foreign Reserve Investment in China: A CVaR Approach. Economic Modelling, 29, 1524-1536.
https://doi.org/10.1016/j.econmod.2012.05.012
[13] Li, Y., & Hong, Z. M. (2019). “Denoising” Research on the Complex Network Model of My Country’s Stock Market—Based on Random Matrix Theory. Journal of Sanming University, 3, 15-20.
[14] Markowitz, H. (1952). Portfolio Selection. Journal of Finance, 7, 77-91.
https://doi.org/10.1111/j.1540-6261.1952.tb01525.x
[15] Plerou, V., Gopikrishnan, P., Rosenow, B., Amaral, L. A. N., & Stanley, H. E. (1999). Universal and Nonuniversal Properties of Cross Correlations in Financial Time Series. Physical Review Letters, 83, 1471.
https://doi.org/10.1103/PhysRevLett.83.1471
[16] Pooter, M. D., Martens, M., & Dijk, D. V. (2008). Predicting the Daily Covariance Matrix for s&p 100 Stocks Using Intraday Data—But Which Frequency to Use? Econometric Reviews, 27, 199-229.
https://doi.org/10.1080/07474930701873333
[17] Roman, D., Darby-Dowman, K., & Mitra, G. (2007). Mean-Risk Models Using Two Risk Measures: A Multi-Objective Approach. Quantitative Finance, 7, 443-458.
https://doi.org/10.1080/14697680701448456
[18] Shen, K., Yao, J. F., & Li, W. K. (2019). Modeling of High-Dimensional Realized Volatility Matrices with Financial Applications. Computational Statistics and Data Analysis, 131, 207-221.
https://doi.org/10.1016/j.csda.2018.06.004
[19] Shi, Y., Zhao, X., & Yan, X. (2019). Optimal Asset Allocation for a Mean-Variance-CVaR Insurer under Regulatory Constraints. American Journal of Industrial and Business Management, 9, 1568.
https://doi.org/10.4236/ajibm.2019.97103
[20] Song, P., & Hu, Y. H. (2017). Application of High-Dimensional Financial Asset Portfolio Based on Realized Covariance Matrix. Statistics and Information Forum, 32, 63-69.
[21] Wigner, E. P. (1951). On the Statistical Distribution of the Widths and Spacings of Nuclear Resonance Levels. Mathematical Proceedings of the Cambridge Philosophical Society, 47, 790-798.
https://doi.org/10.1017/S0305004100027237
[22] Xie, C., Hu, Y., & Wang, G. J. (2018). Research on Topological Properties of Stock Market Network Based on Random Matrix Theory. Operations Research and Management Science, 27, 144-152.
[23] Yao, N. (2010). Research on the Application of Volatility Timing in Dynamic Asset Allocation under High Frequency Data. East China Economic Management, 6, 156-158.
[24] Yin, L. Q. (2016). Research on the Influence of High-Frequency Price Data Information on the Dynamic Risk Measurement of Asset Portfolio-Based on the Analysis of the Covariance Matrix Model. Financial Development Research, 7, 3-8.
[25] Yu, H. Y., & Ma, S. (2014). China’s Sovereign Wealth Fund Investment Strategy: Based on the Mean-Variance-CVaR Model. Investment Research, 4, 27-40.

Copyright © 2024 by authors and Scientific Research Publishing Inc.

Creative Commons License

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.