Analysis of 48 US Industry Portfolios with a New Fama-French 5-Factor Model
1. Introduction
In 2015, Fama and French proposed a 5-factor model (denoted as FF5-Normal)1 to capture the market, size, value, profitability and investment patterns in stock returns, and found that it performs better than their 3-factor model in [3]. Since then, many studies on the 5-factor model have appeared (see Table 1). These studies can be divided into 2 groups. The 1st group empirically tests the FF5-Normal model using different data. For example, [4] [5] find that the FF5-Normal model works well in India.
The 2nd group extends Fama-French’s 5-factor model. For example, [6] [7] choose Betting against Beta (BaB), Gross Profitability (GP) and 9 other factors to create a 14-factor model and find that the market factor is the most important factor for describing expected returns. [1] [8] [9] add the SSAEPD of [10] ( [11]
Table 1. Studies on the 5-factor model for the stock market.
Based on the new model of [1], in this paper we test the following question: if different data, such as the 48 industry portfolios3, are considered, does the new model of [1] still beat the 5-factor model in [2]? To answer this question, simulation is first used to check the validity of the MatLab program of [1]4. Then, the 48 industry portfolios are analyzed. Data are downloaded from French’s Data Library, and the sample period is from Jul. 1963 to Jan. 2017. Parameters are estimated by Maximum Likelihood Estimation (MLE). The Likelihood Ratio test (LR) and the Kolmogorov-Smirnov test (KS) are used for model diagnostics. Model comparison is done with the Akaike Information Criterion (AIC).
Simulation results show that the MatLab program is valid and can be used for empirical analysis. Empirical results show that the 5 factors in [2] are still alive. The GARCH-type volatility and SSAEPD can successfully capture the excess kurtosis. The new model of [1] fits the data well and has a better in-sample fit than the 5-factor model of Fama and French.
The organization of this paper is as follows. Section 2 presents the model and methodology. Section 3 presents the empirical results. Section 4 provides the conclusions and future extensions. The appendices contain additional information that may be helpful for understanding our paper.
2. Model and Methodology
2.1. The FF5-SSAEPD-GARCH Model
[1] extend Fama-French’s 5-factor model with the GARCH-type volatility in [13] and the non-Normal SSAEPD error distribution in [10], and show that their new model performs better for the 25 Fama-French portfolios. This new model in [1] (denoted as FF5-SSAEPD-GARCH) is listed as follows.
(1)
(2)
(3)
(4)
where the parameter vector is to be estimated and T is the sample size. The standardized error term is distributed as the Standardized Standard Asymmetric Exponential Power Distribution (SSAEPD) proposed by Zhu and Zinde-Walsh, and the conditional standard deviation of the error is the volatility. The variables are the return on the stock portfolio, the risk-free return, the value-weighted market return, SMB (the return of small minus big), RMW (the return of robust minus weak), CMA (the return of conservative minus aggressive) and HMLO (the return of high minus low, orthogonalized5), which is the sum of the intercept and the residual from the regression of HML on the other four factors.
In particular, when the SSAEPD reduces to the Normal distribution (α = 0.5 and both tail parameters equal to 2) and the volatility is constant, the FF5-SSAEPD-GARCH model reduces to Fama-French’s 5-factor model. In the following sections we compare these two models.
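For orientation, a model of this type is commonly written as in the sketch below. This is an illustration under stated assumptions, not a transcription of Equations (1)-(4): the coefficient names β0, ..., β5, the GARCH(1,1) parameter names a, b, c and the ordering of the factors are hypothetical choices made here.

```latex
\begin{aligned}
R_t - R_{ft} &= \beta_0 + \beta_1 (R_{mt} - R_{ft}) + \beta_2\, SMB_t + \beta_3\, HMLO_t
              + \beta_4\, RMW_t + \beta_5\, CMA_t + u_t, \qquad t = 1, \dots, T, \\
u_t &= \sigma_t z_t, \qquad z_t \sim SSAEPD(\alpha, p_1, p_2), \\
\sigma_t^2 &= a + b\, u_{t-1}^2 + c\, \sigma_{t-1}^2 .
\end{aligned}
```

In this notation, setting b = c = 0 (constant volatility) together with α = 0.5 and p1 = p2 = 2 (Normal errors) gives back a 5-factor regression with Normal errors.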
2.2. MLE
Maximum Likelihood Estimation (MLE) is used to estimate the model above. The likelihood function is
(5)
(6)
where
(7)
(8)
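As a rough illustration of how such a likelihood can be evaluated, the sketch below computes the log-likelihood of a factor regression with GARCH(1,1) errors. It is a minimal sketch under assumptions, not the MatLab program of [1]: the function names, the parameter names a, b, c, the initialization of the variance at the sample variance of the residuals, and the use of the standard Normal log-density as a stand-in for the SSAEPD are all choices made here.

```python
import math

def std_normal_logpdf(z):
    # Log-density of Normal(0, 1); a stand-in for the SSAEPD log-density (assumption).
    return -0.5 * math.log(2.0 * math.pi) - 0.5 * z * z

def garch_loglik(y, X, beta, a, b, c, log_density=std_normal_logpdf):
    """Log-likelihood of y_t = beta[0] + sum_k beta[k+1] * X[t][k] + u_t,
    with u_t = sigma_t * z_t and sigma_t^2 = a + b * u_{t-1}^2 + c * sigma_{t-1}^2."""
    resid = [yt - beta[0] - sum(bk * xk for bk, xk in zip(beta[1:], xt))
             for yt, xt in zip(y, X)]
    # Start the recursion at the sample variance of the residuals (a convention, not from the paper).
    var = sum(u * u for u in resid) / len(resid)
    loglik = 0.0
    prev_u = None
    for u in resid:
        if prev_u is not None:
            var = a + b * prev_u ** 2 + c * var
        sigma = math.sqrt(var)
        # Change of variables: density of u_t is f(z_t) / sigma_t.
        loglik += log_density(u / sigma) - math.log(sigma)
        prev_u = u
    return loglik

# Hypothetical toy data: one factor, zero coefficients, constant unit variance.
example_loglik = garch_loglik([1.0, -1.0, 1.0, -1.0], [[0.0]] * 4,
                              beta=[0.0, 0.0], a=1.0, b=0.0, c=0.0)
```

Maximizing this function over the parameters with a numerical optimizer would give the ML estimates; swapping `log_density` for an SSAEPD log-density would give the non-Normal version.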
3. Empirical Analysis
3.1. Data
Different from [1], the data we analyze are the monthly returns of 48 industry portfolios for the US stock market downloaded from French’s Data Library, which include Agriculture, Food, Real Estate, Finance, etc. The sample period is from 1963:07 to 2017:01. The descriptive statistics of the sample data are calculated with MatLab and listed in Table 2. For each portfolio, the skewness (except for one portfolio, the “Ships” industry) is not 0 and the kurtosis is more than 3. The p-value of the Jarque-Bera test for each portfolio is 0, which is smaller than the 5% significance level. Hence, we reject the null hypothesis of normality and conclude that the asset returns do not follow a Normal distribution. Thus, the non-Normal error assumption of SSAEPD might be able to fit the data better.
Table 2. Descriptive Statistics (1963:07-2017:01).
Notes: The sample period of Hlth is 1969:7-2017:01 due to the data availability; Mea. = mean, Med. = median, Max. = maximum, Min. = minimum, St Dev. = standard deviation, Ske. = skewness, Kur. = kurtosis, P = P-value of Jarque-Bera Test.
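The normality check behind Table 2 can be sketched as follows. This is a minimal illustration, assuming the textbook Jarque-Bera statistic with population (biased) moments and its asymptotic chi-square distribution with 2 degrees of freedom; the function name and the sample values are hypothetical and are not the portfolio data.

```python
import math

def jarque_bera(x):
    """Return (skewness, kurtosis, JB statistic, asymptotic p-value)."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2          # equals 3 for a Normal distribution
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    # Survival function of chi-square with 2 degrees of freedom is exp(-x / 2).
    p_value = math.exp(-jb / 2.0)
    return skew, kurt, jb, p_value

# Hypothetical symmetric sample, for illustration only.
skew, kurt, jb, p_value = jarque_bera([-2.0, -1.0, 0.0, 1.0, 2.0])
```

A p-value below 0.05 would lead to rejecting normality at the 5% significance level, which is the decision rule used in the text.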
3.2. Estimation Results
The estimates for our new model are displayed in Table 3. We find that the model can successfully capture the skewness, fat tails and excess kurtosis of the data. More specifically, 46 out of 48 estimates of the skewness parameter α are not equal to 0.5, which captures the skewness in the data. 84 out of 96 estimates of the tail parameters p1 and p2 are smaller than 2, which suggests that portfolio returns are fat-tailed. Besides, the tail parameters p1 and p2 (except for one portfolio, the “Other” industry) are not equal to each other, which documents the asymmetric fat-tailedness. And 28 out of 48 portfolios have bigger estimates for the left tail parameter p1, which means that these returns tend to have thinner left tails.
3.3. Model Diagnostics
To test the significance of the coefficients in FF5-SSAEPD-GARCH, the Likelihood Ratio test (LR) is applied6, which is calculated using Equation (9).
(9)
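The mechanics of a likelihood ratio test can be sketched as follows. The sketch assumes the usual form of the statistic, minus two times the difference between the restricted and unrestricted maximized log-likelihoods, compared with a chi-square distribution whose degrees of freedom equal the number of restrictions. The function names and the log-likelihood values are hypothetical.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0.0:
        return 1.0
    y = x / 2.0
    # Closed-form starting points: df = 2 gives exp(-y), df = 1 gives erfc(sqrt(y)).
    if df % 2 == 0:
        q, a = math.exp(-y), 1.0
    else:
        q, a = math.erfc(math.sqrt(y)), 0.5
    # Recurrence Q(a + 1, y) = Q(a, y) + y^a exp(-y) / Gamma(a + 1) up to a = df / 2.
    while a < df / 2.0 - 1e-12:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q

def lr_test(loglik_unrestricted, loglik_restricted, n_restrictions):
    """Return the LR statistic and its asymptotic p-value."""
    stat = -2.0 * (loglik_restricted - loglik_unrestricted)
    return stat, chi2_sf(stat, n_restrictions)

# Hypothetical maximized log-likelihoods with 2 restrictions under the null.
lr_stat, lr_p = lr_test(-1450.0, -1453.2, 2)
```

The restriction is rejected at the 5% significance level when the p-value is below 0.05.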
3.3.1. Tests for Parameter Restrictions
• Tests for Parameters in the Mean Equation
The P-values of the LR test are listed in Table 4. The null hypothesis of the joint significance test is that the factor coefficients are jointly equal to 0. The P-values of the joint significance test for all the 48 portfolios are 0, which means the factor coefficients are jointly statistically significant at the 5% significance level.
The individual significance tests show that, at the 5% significance level, one factor coefficient is statistically significant in all 48 portfolios, and the other four factor coefficients are statistically significant in 40/48, 28/48, 38/48 and 32/48 portfolios, respectively. As for the intercept (i.e., the Alpha return), it is statistically significant at the 5% significance level in 31 out of the 48 portfolios. Thus, most of the 48 portfolios seem to be able to earn Alpha returns.
As a whole, since the 5 factors are significant in most of the 48 portfolios, we can conclude that with non-Normal errors such as SSAEPD and GARCH-type volatilities, the Fama-French 5 factors are still alive.
• Tests for Parameters in the GARCH Equation
In this part, some restrictions on the parameters in the GARCH equation are tested with the Likelihood Ratio test (LR), and the results are listed in Table 5. The results show that the GARCH-type volatility should be included in the Fama-French 5-factor model. For instance, we run a joint significance test on the parameters of the GARCH equation. For 46 out of the 48 portfolios, the p-values of the LR test are smaller than the 5% significance level, which means the GARCH-type volatilities are necessary. As for the individual hypotheses, most P-values of the LR test are smaller than the 5% significance level. To be specific, the ARCH term is significant in 39 out of 48 portfolios and the GARCH term is significant in 27 out of 48 portfolios.
Table 3. Estimates for FF5-SSAEPD-GARCH (Monthly, 1963:07-2017:01).
Notes: The data period of Hlth is 1969:7-2017:01 due to the data availability.
Table 4. P-values of Likelihood Ratio Test (LR).
Notes: The sample of Hlth is from 1969:7 to 2017:01 due to the availability of data. TJ denotes the test of the joint hypothesis; T0 to T5 denote the tests of the individual hypotheses.
Table 5. P-values of Likelihood Ratio Test (LR).
Notes: The data period of Hlth is 1969:7-2017:01 due to the lack of data from 1963:7-1969:6. T8 to T16 denote the tested parameter restrictions.
• Tests for Parameters in SSAEPD
We also run significance tests for the parameters in the SSAEPD, and the results of the parameter restrictions show strong non-Normality. For example, for the joint hypothesis under which the SSAEPD reduces to the Normal distribution, 39 out of 48 p-values are smaller than the 5% significance level, which means that the Normal error assumption is not supported by most of our data. Besides, asymmetry is documented (the null hypothesis of symmetry is rejected by 14 out of 48 portfolios). And non-Normality is found (two further restrictions implied by Normality are rejected by 21 out of 48 and 29 out of 48 portfolios, respectively).
3.3.2. Residual Check
In this subsection, the residuals of the models above are checked with both the Kolmogorov-Smirnov test and graphs. Our results show that 41 out of the 48 portfolios have residuals which follow the SSAEPD. That means the FF5-SSAEPD-GARCH model is adequate for most of the 48 industry portfolios. The FF5-Normal model, however, is not adequate for the data, since all the 48 portfolios have residuals which do not follow the Normal error distribution.
• Kolmogorov-Smirnov Test for Residuals
To check the residuals, the Kolmogorov-Smirnov test (KS)7 is employed. The p-values of the KS test are displayed in Table 6. The p-values of the KS test8 show that the residuals from the new model follow the SSAEPD for most portfolios. For instance, the p-value of the portfolio of the Agriculture industry is 0.79, greater than 5%, which means that at the 5% significance level the null hypothesis is not rejected and the residuals from FF5-SSAEPD-GARCH follow the SSAEPD. Similarly, the null hypothesis cannot be rejected for 40 other portfolios.
Then, we apply the KS test to the residuals from the FF5-Normal model9. The p-values of the KS test are also listed in Table 6. All of the 48 portfolios have p-values smaller than 0.05, which means the null is rejected for all 48 industry portfolios. Hence, the error terms of the portfolios do not follow the Normal distribution, and the FF5-Normal model is not adequate for the data.
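A one-sample Kolmogorov-Smirnov check of residuals against a reference distribution can be sketched as below. This is a minimal sketch, assuming the standard Normal as the reference CDF (the FF5-Normal case; an SSAEPD CDF would be passed in the same way) and a common asymptotic approximation to the p-value with a small-sample correction. The function names and the two synthetic samples are hypothetical.

```python
import math
from statistics import NormalDist

def ks_statistic(sample, cdf):
    """Largest distance between the empirical CDF and the reference CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d

def ks_pvalue(d, n, terms=100):
    """Asymptotic Kolmogorov p-value with a small-sample correction (approximation)."""
    lam = (math.sqrt(n) + 0.12 + 0.11 / math.sqrt(n)) * d
    if lam < 0.2:
        return 1.0  # the series converges slowly here and the p-value is 1 to machine precision
    total = sum((-1) ** (j - 1) * math.exp(-2.0 * j * j * lam * lam)
                for j in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))

# Synthetic illustration: exact Normal quantiles, and the same values shifted by 1.
normal = NormalDist()
well_fitted = [normal.inv_cdf((i + 0.5) / 200) for i in range(200)]
shifted = [v + 1.0 for v in well_fitted]
p_good = ks_pvalue(ks_statistic(well_fitted, normal.cdf), 200)
p_bad = ks_pvalue(ks_statistic(shifted, normal.cdf), 200)
```

In this illustration the first sample is not rejected and the shifted one is, mirroring the decision rule applied to the p-values in Table 6.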
• PDFs of Residuals
By visual inspection (“eyeballing”), the PDF of the residuals is compared with the theoretical PDFs. Taking the portfolio of the Agriculture industry as an example, Figure 1(a) plots the probability density function (PDF) of the estimated residuals from FF5-SSAEPD-GARCH together with that of the SSAEPD. These curves are very close to each other, which suggests that the residuals are distributed as SSAEPD. Hence, the FF5-SSAEPD-GARCH model fits the data well.
Similarly, the probability density function (PDF) of the estimated residuals from FF5-Normal and that of the Normal distribution are shown in Figure 1(b). There are big differences between these two curves, which means the residuals do not follow the Normal distribution.
Table 6. P-values of KS Test for Residuals.
Notes: 1. The data period of Hlth is 1969:7-2017:01 due to the data availability. 2. * means the data do not follow the specified distribution at the 5% significance level. M1 = FF5-SSAEPD-GARCH, M2 = FF5-Normal.
3.4. Model Comparison
In this subsection, we compare the model in [1] with the 5-factor model of Fama and French. The Akaike Information Criterion (AIC) is used as the model selection criterion. Table 7 lists the AIC values. We find that 47 out of 48 AIC values of the FF5-SSAEPD-GARCH model are smaller than those of the FF5-Normal model. Hence, we conclude that the new model we used (FF5-SSAEPD-GARCH) is better than the 5-factor model of Fama and French.
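The comparison rule can be illustrated with the standard definition of the AIC, twice the number of estimated parameters minus twice the maximized log-likelihood, where the smaller value is preferred. The log-likelihood values and the parameter counts below are hypothetical and are not taken from Table 7.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: smaller is better."""
    return 2.0 * n_params - 2.0 * log_likelihood

# Hypothetical values for one portfolio (assumed parameter counts, not from the paper).
aic_m1 = aic(-1450.0, 12)   # M1: a richer model with more parameters
aic_m2 = aic(-1500.0, 7)    # M2: a simpler benchmark model
preferred = "M1" if aic_m1 < aic_m2 else "M2"
```

The penalty term means a richer model is preferred only when its gain in log-likelihood outweighs its additional parameters.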
4. Conclusions
In this paper, we empirically test the new 5-factor model suggested in [1]. Their new model generalizes the 5-factor model in [2] by introducing a non-Normal error term and time-varying volatilities. The non-Normal error distribution they use is the SSAEPD of [10], and the time-varying volatility is the GARCH model of [13]. For comparison, monthly US stock returns of 48 industry portfolios (1963:07-2017:01) are analyzed.
Figure 1. Comparison of PDFs. (a) PDFs of the residuals (FF5-SSAEPD-GARCH) and the SSAEPD; (b) PDFs of the residuals (FF5-Normal) and the Normal distribution.
Table 7. AIC Values (Monthly, 1963:07-2017:01).
Notes: 1. The data period of Hlth is 1969:7-2017:01 due to the data availability. 2. Numbers with * are smaller AIC values. M1 = FF5-SSAEPD-GARCH, M2 = FF5-Normal.
Maximum Likelihood Estimation (MLE) is used for parameter estimation. The Likelihood Ratio test (LR) is used to test the hypotheses of parameter restrictions. The Kolmogorov-Smirnov test (KS) is used to check the residuals. The Akaike Information Criterion (AIC) is used to compare models.
Simulation results show that the MatLab program for the new 5-factor model in [1] is valid. Empirical results show that 1) this new model can capture the skewness, fat tails and asymmetric fat-tailedness in the data; 2) the Fama-French 5 factors are still alive even when non-Normal errors and GARCH-type volatilities are considered, since the 5 factors are statistically significant in most of the 48 portfolios; and 3) the FF5-SSAEPD-GARCH model fits the data better than the 5-factor model in [2], with a smaller AIC value for 47 out of 48 portfolios.
Future extensions include, but are not limited to, the following. First, we can examine our results with different data. Second, we can compare our results with those from other models such as the ARIMA model. Last but not least, other factors can be introduced into this model to explain the returns of industry portfolios.
Appendix 1.
Four-digit SIC codes are used to assign firms to 48 industries. The variables defined in the 1st column in Table 8 are used as the dependent variables in this paper.
Appendix 2. Fama-French 5-Factor Model (FF5-Normal)
Equation (10) is the 5-factor model (denoted as FF5-Normal) suggested by Fama and French (2015). They show that this model empirically outperforms the 3-factor model of Fama and French (1993).
(10)
where the coefficients are the parameters in this model. The variables are the return on the stock portfolio, the risk-free return, the value-weighted market return, SMB (the return of small minus big), RMW (the return of robust minus weak), CMA (the return of conservative minus aggressive) and HMLO (the return of high minus low, orthogonalized), which is the sum of the intercept and the residual from the regression of HML on the other four factors. The reason for using HMLO instead of HML is that Fama and French (2015) show that HML (the high minus low book-to-market ratio) is redundant in the following 5-factor model.
(11)
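The orthogonalization described above (regress HML on the other four factors, then add the intercept to the residual) can be sketched in plain Python. This is an illustrative sketch only: the function names, the small OLS solver based on the normal equations, and the toy factor values are assumptions made here and are not data from French’s Data Library.

```python
def ols(y, X):
    """OLS with an intercept; returns [intercept, slope_1, ..., slope_k]."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    # Augmented normal equations [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yt for r, yt in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [v - factor * w for v, w in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

def orthogonalize(hml, others):
    """Return (HMLO series, regression coefficients), HMLO = intercept + residual."""
    coef = ols(hml, others)
    explained = [sum(c * x for c, x in zip(coef[1:], row)) for row in others]
    # intercept + residual equals HML minus the part explained by the other factors.
    return [h - e for h, e in zip(hml, explained)], coef

# Toy values: each row holds (market excess return, SMB, RMW, CMA).
others = [[0, 0, 0, 0], [1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0],
          [0, 0, 0, 1], [1, 1, 1, 1], [2, -1, 0, 1], [-1, 2, 1, 0]]
hml = [0.5, 0.9, 0.1, 0.7, 1.0, 1.2, 1.6, 0.2]
hmlo, coef = orthogonalize(hml, others)
```

By construction, the resulting series has the regression intercept as its mean and is uncorrelated in sample with the other four factors.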
Appendix 3. SSAEPD
A random variable X may be distributed as AEPD or, in the special case with location 0 and scale 1, as standard AEPD. A random variable Z is distributed as standardized standard AEPD (SSAEPD) when it is a standard AEPD variable rescaled to have mean zero and variance 1. That is, E(Z) = 0 and Var(Z) = 1.
The brief history of SSAEPD is listed in Table 9.
The probability density function (PDF) of the Standardized Standard AEPD (SSAEPD) proposed by Zhu and Zinde-Walsh (2009)10 is
(12)
Table 8. Variable definitions for 48 industries.
where
(13)
Table 9. The History of the SSAEPD distribution.
Notes: EPD = Exponential Power Distribution; SEPD = Skewed Exponential Power Distribution; SSAEPD = Standardized Standard Asymmetric Exponential Power Distribution. This table is a revision of the one in Jin (2011).
(14)
(15)
(16)
(17)
(18)
p1 (or p2) is the parameter controlling the left (or right) tail, and α controls the skewness. The mean of Z is zero and its variance is 1. When α = 0.5 and p1 = p2 = 2, the SSAEPD can be reduced to Normal (0, 1).
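For readers who want to experiment with the distribution, the sketch below evaluates an SSAEPD density. It follows the commonly cited form of the AEPD density of Zhu and Zinde-Walsh (2009), standardized by its mean and standard deviation; it is an assumption-laden sketch that should be checked against Equations (12)-(18) before use, and the function and variable names are hypothetical.

```python
import math

def _k(p):
    # Normalizing constant K(p) = 1 / (2 * p^(1/p) * Gamma(1 + 1/p)).
    return 1.0 / (2.0 * p ** (1.0 / p) * math.gamma(1.0 + 1.0 / p))

def ssaepd_pdf(z, alpha, p1, p2):
    """Density at z of a standard AEPD variable rescaled to mean 0 and variance 1."""
    g = math.gamma
    k1, k2 = _k(p1), _k(p2)
    big_b = alpha * k1 + (1.0 - alpha) * k2
    alpha_star = alpha * k1 / big_b
    # Mean (omega) and standard deviation (delta) of the standard AEPD.
    omega = ((1.0 - alpha) ** 2 * p2 * g(2.0 / p2) / g(1.0 / p2) ** 2
             - alpha ** 2 * p1 * g(2.0 / p1) / g(1.0 / p1) ** 2) / big_b
    second_moment = ((1.0 - alpha) ** 3 * p2 ** 2 * g(3.0 / p2) / g(1.0 / p2) ** 3
                     + alpha ** 3 * p1 ** 2 * g(3.0 / p1) / g(1.0 / p1) ** 3) / big_b ** 2
    delta = math.sqrt(second_moment - omega ** 2)
    y = omega + delta * z  # map back to the standard AEPD scale
    if y <= 0.0:
        return (delta * (alpha / alpha_star) * k1
                * math.exp(-abs(y / (2.0 * alpha_star)) ** p1 / p1))
    return (delta * ((1.0 - alpha) / (1.0 - alpha_star)) * k2
            * math.exp(-abs(y / (2.0 * (1.0 - alpha_star))) ** p2 / p2))

# With alpha = 0.5 and p1 = p2 = 2 the density should coincide with Normal(0, 1).
normal_case = ssaepd_pdf(0.3, 0.5, 2.0, 2.0)
```

Values of p1 or p2 below 2 give a fatter tail than the Normal on the corresponding side, which is how the tail estimates in Table 3 are read.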
Appendix 4. Simulation Results
We check the MatLab program written by Zhou and Li (2016) with the following simulation and find that the program is valid and can be used to analyze our empirical data. The FF5-SSAEPD-GARCH (1,1) is simulated as follows.
(19)
(20)
The data generation process is as follows:
1) Given the values of the SSAEPD parameters, generate SSAEPD random numbers11.
2) Set the values of the parameters in the GARCH equation, and generate the volatility and the error term from the GARCH (1,1) equations.
3) Generate the 5 factors (the Xs) from Uniform (0,1)12.
4) Set the values of the coefficients in the mean equation, and we can get the simulated returns.
12For simplicity, we use Xs to represent the 5 factors in simulation.
After getting the simulated data, we can use them to estimate the parameters in the FF5-SSAEPD-GARCH model. The simulation results are reported in Table 10. Almost all the estimates are close to the true values of the parameters. Hence, we can draw the conclusion that this MatLab program is valid for empirical analysis.
Notes (Table 10): T means the true value of parameters. E means the estimates. P means the error in percentage.
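The data generation steps above can be sketched as follows. This is a simplified illustration and not the MatLab program of Zhou and Li (2016): it assumes a GARCH(1,1) recursion with hypothetical parameter names a, b, c, draws Normal(0, 1) innovations as a stand-in for SSAEPD random numbers (the special case α = 0.5, p1 = p2 = 2), and uses made-up true parameter values.

```python
import math
import random

def simulate_ff5_garch(T, beta, a, b, c, seed=0):
    """Simulate y_t = beta[0] + sum_k beta[k] * x_kt + u_t with GARCH(1,1) errors."""
    rng = random.Random(seed)
    var = a / (1.0 - b - c)  # start at the unconditional variance (requires b + c < 1)
    y, X, sigma = [], [], []
    u_prev = None
    for _ in range(T):
        if u_prev is not None:
            var = a + b * u_prev ** 2 + c * var
        z = rng.gauss(0.0, 1.0)                   # step 1: innovations (Normal stand-in)
        u = math.sqrt(var) * z                    # step 2: volatility and error term
        x = [rng.random() for _ in range(5)]      # step 3: five factors from Uniform(0, 1)
        y.append(beta[0] + sum(bk * xk for bk, xk in zip(beta[1:], x)) + u)  # step 4
        X.append(x)
        sigma.append(math.sqrt(var))
        u_prev = u
    return y, X, sigma

# Hypothetical true values, for illustration only.
true_beta = [0.1, 1.0, 0.5, -0.3, 0.2, 0.4]
y_sim, X_sim, sigma_sim = simulate_ff5_garch(500, true_beta, a=0.1, b=0.1, c=0.8, seed=42)
```

Estimating the model on `y_sim` and `X_sim` and comparing the estimates with `true_beta` and the GARCH parameters reproduces the logic of the validity check, in the same spirit as the true-versus-estimate comparison reported for the MatLab program.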
NOTES
1The FF5-Normal model of [2] is in Appendix 2.
2The history of SSAEPD is displayed in Appendix 3.
3 [1] analyze 25 Fama-French portfolios, which is different from the dataset we use.
4Simulation results are listed in Appendix 4.
5The reason for using HMLO instead of HML can be found in Appendix 2.
6LR formula is from Neyman and Pearson (1993).
7The null hypothesis of the KS test is H0: Data follow a specified distribution. If the P-value of the KS test is bigger than the 5% significance level, the null hypothesis is not rejected. Otherwise, the null hypothesis is rejected.
8The null hypothesis is H0: FF5-SSAEPD-GARCH residuals are distributed as SSAEPD.
9The FF5-Normal model is listed in Appendix 2. The null hypothesis is H0: FF5-Normal residuals are distributed as Normal.
10For more information about SSAEPD, one can refer to Appendix 3.