1. Introduction
The fundamental building block of time series analysis is stationarity. The idea behind stationarity is that the probability laws governing the behaviour of the process do not change over time. This ensures that the process is in a state of statistical equilibrium and, in turn, provides a statistical setting for describing and making inferences about the structure of data that fluctuate in a random manner [1] [2] [3]. According to [3], a process is said to be strictly stationary if its entire probability structure depends only on time differences. A less restrictive requirement, called weak stationarity of order k, is that the moments up to order k depend only on time lags; second-order stationarity plus an assumption of normality is sufficient to produce strict stationarity (see also, [4] [5]). For simplicity, a time series is said to be stationary if it has a mean, variance and autocovariance function that are constant over time (see [6]). Moreover, one of the most important and fundamental examples of a stationary process is the white noise process, which is defined as a sequence of independent (uncorrelated) and identically distributed random variables with zero mean and constant variance [2] [3] [5]. Thus, the white noise process is particularly important and constitutes an essential bedrock of time series model building.
In this study, our aim is to apply the white noise process in measuring model adequacy, targeted at confirming the independence assumption, which ensures that no autocorrelation remains in the time series considered and that the ARMA model entertained is able to capture the linear structure in the dataset.
The motivation stems from the fact that the goal of statistical modeling is to achieve parsimony (i.e., selecting the model with the smallest number of parameters that completely expresses the linear dependence structure, thereby providing better prediction and generalization to new observations), conditional on the restriction of model adequacy.
Testing for model adequacy, or diagnostic checking as defined by [7], verifies that the model incorporates all relevant information and that, when calibrated to the data, no important departures from the statistical assumptions made can be found. In practice, model adequacy checking involves residual analysis and overfitting. In time series modeling, a good model must have parameter estimates that are reasonably close to the true values, must adequately capture the dependence structure of the data, and should produce residuals that are approximately uncorrelated [2] [6] [8]. These residuals are obtained by taking the difference between an observed value of a time series and a predicted value from fitting a candidate model to the data. They are useful in checking whether a model has adequately captured the information in the data. According to [6], model adequacy is related primarily to the assumption that residuals are independent. Moreover, if the residuals of a given model are correlated, the model must be refined because it does not completely capture the statistical relationship within the time series [2]. Furthermore, a model is said to be adequate if the residuals are statistically independent, implying that the residual series is uncorrelated. Therefore, in testing for model adequacy, which is mainly to check for independence of the residual series, the autocorrelation function (ACF), the partial autocorrelation function (PACF) and the Ljung-Box test on the residuals are considered.
Another adequacy checking tool is overfitting, which involves adding another coefficient to a fitted model to see whether the resulting model is better. The following guidelines for fitting and overfitting are identified:
1) Specify the original model carefully. If a simple model seems promising, check it out before trying a more complicated model.
2) When overfitting, do not increase the orders of both the autoregressive (AR) and moving average (MA) parts of the model simultaneously.
3) Extend the model in directions suggested by the analysis of the residuals. However, one setback of overfitting is its tendency to violate the principle of parsimony [2] [6].
Model adequacy has also been explored by the following studies: [7] - [17].
The remaining part of this work is organized as follows: Section 2 covers the methodology, followed by the results and discussion in Section 3, while the conclusion of the overall results is given in Section 4.
2. Methodology
2.1. Return Series
The return series, $r_t$, can be obtained given that $P_t$ is the price of a unit share at time $t$, while $P_{t-1}$ is the share price at time $t-1$:

$r_t = (1 - B)\ln P_t = \ln P_t - \ln P_{t-1}$ (1)

Here, $r_t$ is regarded as a transformed series of the share price, $P_t$, meant to attain stationarity, while $B$ is the backshift operator. Thus, both the mean and the variance of the series are stable [18] [19].
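As a simple illustration of the transformation in (1), the return series can be computed directly from a price series; the short Python sketch below (using pandas and numpy, with a small made-up price series standing in for the actual data) is offered only as an example of the calculation.

import numpy as np
import pandas as pd

# Hypothetical daily closing share prices; the study itself uses the banks' price series.
prices = pd.Series([10.50, 10.65, 10.40, 10.72, 10.80], name="Close")

# Equation (1): r_t = (1 - B) ln(P_t) = ln(P_t) - ln(P_{t-1})
returns = np.log(prices).diff().dropna()
print(returns)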
2.2. Autoregressive Integrated Moving Average (ARIMA) Model
In [3], the ARMA model is extended to deal with homogeneous non-stationary time series in which $z_t$ itself is non-stationary but its $d$-th difference, $w_t = (1-B)^d z_t$, is a stationary ARMA process. Denoting the $d$-th difference of $z_t$ by $w_t$, the general ARIMA model may be written as

$\varphi(B) z_t = \phi(B)(1-B)^d z_t = \theta(B) a_t$, (2)

where $\varphi(B) = \phi(B)(1-B)^d$ is the nonstationary autoregressive operator such that $d$ of the roots of $\varphi(B) = 0$ are unity and the remainder lie outside the unit circle, and $\phi(B)$ is a stationary autoregressive operator (see also, [20] [21]).
2.3. Stationarity
The foundation of time series analysis is stationarity. Consider a finite set of return variables $z_{t_1}, z_{t_2}, \ldots, z_{t_k}$ from a time series process, $\{z_t : t = 0, \pm 1, \pm 2, \ldots\}$. The k-dimensional distribution function is defined as

$F_{z_{t_1}, \ldots, z_{t_k}}(x_1, \ldots, x_k) = P(z_{t_1} \le x_1, \ldots, z_{t_k} \le x_k)$, (3)

where $x_1, \ldots, x_k$ are any real numbers.
A process is said to be:
1) first-order stationary in distribution if its one-dimensional distribution function is time invariant, that is, if

$F_{z_{t_1}}(x_1) = F_{z_{t_1+m}}(x_1)$, (4)

for any integers $t_1$ and $m$;

2) second-order stationary in distribution if

$F_{z_{t_1}, z_{t_2}}(x_1, x_2) = F_{z_{t_1+m}, z_{t_2+m}}(x_1, x_2)$, (5)

for any integers $t_1$, $t_2$ and $m$;

3) $k$-th-order stationary in distribution if

$F_{z_{t_1}, \ldots, z_{t_k}}(x_1, \ldots, x_k) = F_{z_{t_1+m}, \ldots, z_{t_k+m}}(x_1, \ldots, x_k)$, (6)

for any integers $t_1, \ldots, t_k$ and $m$.
A process is said to be strictly stationary if (6) is true for any $k = 1, 2, \ldots$, that is, if it is $k$-th-order stationary for every $k$. According to [3], a process $\{z_t\}$ is weakly stationary if the mean $E(z_t) = \mu$ is a fixed constant for all $t$ and the autocovariance $\gamma_k = \mathrm{Cov}(z_t, z_{t+k})$ depends only on the time difference or time lag $k$ for all $t$.
Stationarity in the wide sense, or covariance stationarity, is also referred to as second-order stationarity.
2.4. White Noise Process
A process $\{a_t\}$ is called a white noise process if it is a sequence of uncorrelated random variables from a fixed distribution with constant mean, $E(a_t) = \mu_a$, usually assumed to be zero, constant variance, $\mathrm{Var}(a_t) = \sigma_a^2$, and $\gamma_k = \mathrm{Cov}(a_t, a_{t+k}) = 0$ for all $k \neq 0$. It is denoted by $a_t \sim \mathrm{WN}(0, \sigma_a^2)$, where WN stands for white noise [5]. By definition, a white noise process $\{a_t\}$ is stationary with autocovariance function

$\gamma_k = \begin{cases} \sigma_a^2, & k = 0 \\ 0, & k \neq 0 \end{cases}$ (7)

The autocorrelation function is given as

$\rho_k = \begin{cases} 1, & k = 0 \\ 0, & k \neq 0 \end{cases}$ (8)

while the partial autocorrelation function is

$\phi_{kk} = \begin{cases} 1, & k = 0 \\ 0, & k \neq 0 \end{cases}$ (9)
Thus, the implication of a white noise specification is that the ACF and PACF are identically equal to zero at all nonzero lags.
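To illustrate (7)-(9), one can simulate a Gaussian white noise series and verify that its sample ACF and PACF at nonzero lags fall almost entirely within the approximate ±2/√T significance bands. The following is a minimal sketch assuming the numpy and statsmodels libraries are available; it is illustrative only and not part of the study's analysis.

import numpy as np
from statsmodels.tsa.stattools import acf, pacf

rng = np.random.default_rng(42)
T = 1000
a = rng.normal(loc=0.0, scale=1.0, size=T)   # a_t ~ WN(0, 1)

sample_acf = acf(a, nlags=20)[1:]    # drop lag 0, which is identically 1
sample_pacf = pacf(a, nlags=20)[1:]

band = 2 / np.sqrt(T)                # approximate 95% significance band
# Roughly 95% of the lags should lie within the band for a white noise series.
print("Share of ACF lags within band: ", np.mean(np.abs(sample_acf) < band))
print("Share of PACF lags within band:", np.mean(np.abs(sample_pacf) < band))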
2.5. Autocovariance and Autocorrelation Functions
According to [5], the covariance between $z_t$ and $z_{t+k}$, denoted by $\gamma_k$, which is a function of the time difference $k$, is called the autocovariance function of the stochastic process. As a function of $k$, $\gamma_k$ is called the autocovariance function in time series analysis since it represents the covariance between $z_t$ and $z_{t+k}$ from the same process. It is defined as

$\gamma_k = \mathrm{Cov}(z_t, z_{t+k}) = E[(z_t - \mu)(z_{t+k} - \mu)]$ (10)

The sample estimate of $\gamma_k$ is $\hat{\gamma}_k$, given by

$\hat{\gamma}_k = \frac{1}{n}\sum_{t=1}^{n-k}(z_t - \bar{z})(z_{t+k} - \bar{z})$ (11)

Similarly, the correlation between $z_t$ and $z_{t+k}$, denoted by $\rho_k$, which is a function of the time difference $k$, is called the autocorrelation function of the stochastic process. As a function of $k$, $\rho_k$ is called the autocorrelation function in time series analysis since it represents the correlation between $z_t$ and $z_{t+k}$ from the same process. It is defined as

$\rho_k = \dfrac{\mathrm{Cov}(z_t, z_{t+k})}{\sqrt{\mathrm{Var}(z_t)}\sqrt{\mathrm{Var}(z_{t+k})}} = \dfrac{\gamma_k}{\gamma_0}$ (12)

The corresponding sample estimate is given by

$\hat{\rho}_k = \dfrac{\hat{\gamma}_k}{\hat{\gamma}_0}$ (13)
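The sample estimates (11) and (13) can be implemented directly; the numpy sketch below is an illustration of the formulas rather than the computation actually used in the study.

import numpy as np

def sample_autocovariance(z, k):
    # Equation (11): gamma_hat_k = (1/n) * sum_{t=1}^{n-k} (z_t - z_bar)(z_{t+k} - z_bar)
    z = np.asarray(z, dtype=float)
    n = len(z)
    z_bar = z.mean()
    return np.sum((z[: n - k] - z_bar) * (z[k:] - z_bar)) / n

def sample_autocorrelation(z, k):
    # Equation (13): rho_hat_k = gamma_hat_k / gamma_hat_0
    return sample_autocovariance(z, k) / sample_autocovariance(z, 0)

# Example usage on an arbitrary short series
z = [1.2, 0.8, 1.5, 0.9, 1.1, 1.4, 0.7, 1.3]
print([round(sample_autocorrelation(z, k), 3) for k in range(4)])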
2.6. Partial Autocorrelation Function (PACF)
The conditional correlation between $z_t$ and $z_{t+k}$ after their mutual linear dependency on the intervening variables $z_{t+1}, z_{t+2}, \ldots, z_{t+k-1}$ has been removed, given by $\mathrm{Corr}(z_t, z_{t+k} \mid z_{t+1}, \ldots, z_{t+k-1})$, is usually referred to as the partial autocorrelation in time series analysis ([5]).
The partial autocorrelation can be derived from the regression model in which the dependent variable, $z_{t+k}$, from a zero-mean stationary process is regressed on the $k$ lagged variables $z_{t+k-1}, z_{t+k-2}, \ldots,$ and $z_t$, that is,

$z_{t+k} = \phi_{k1} z_{t+k-1} + \phi_{k2} z_{t+k-2} + \cdots + \phi_{kk} z_t + e_{t+k}$, (14)

where $\phi_{ki}$ denotes the $i$-th regression parameter and $e_{t+k}$ is an error term with mean zero and uncorrelated with $z_{t+k-j}$, for $j = 1, 2, \ldots, k$. Multiplying both sides of the above regression equation by $z_{t+k-j}$ and taking the expectation, we get

$\gamma_j = \phi_{k1}\gamma_{j-1} + \phi_{k2}\gamma_{j-2} + \cdots + \phi_{kk}\gamma_{j-k}$. (15)

Hence,

$\rho_j = \phi_{k1}\rho_{j-1} + \phi_{k2}\rho_{j-2} + \cdots + \phi_{kk}\rho_{j-k}$. (16)
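Solving the system of equations obtained from (16) for $j = 1, 2, \ldots, k$ yields the coefficients $\phi_{k1}, \ldots, \phi_{kk}$, and the last of these, $\phi_{kk}$, is the partial autocorrelation at lag $k$. The numpy sketch below illustrates this calculation using sample autocorrelations in place of the theoretical ones; it is a simplified illustration, not the study's code.

import numpy as np

def sample_acf(z, nlags):
    # Sample autocorrelations rho_hat_0, ..., rho_hat_nlags as in (11) and (13).
    z = np.asarray(z, dtype=float)
    n, z_bar = len(z), z.mean()
    gamma0 = np.sum((z - z_bar) ** 2) / n
    return np.array([np.sum((z[: n - k] - z_bar) * (z[k:] - z_bar)) / n / gamma0
                     for k in range(nlags + 1)])

def pacf_at_lag(z, k):
    # phi_kk from (16): solve R * phi = r, where R[i, j] = rho_{|i-j|} and r = (rho_1, ..., rho_k).
    rho = sample_acf(z, k)
    R = np.array([[rho[abs(i - j)] for j in range(k)] for i in range(k)])
    r = rho[1 : k + 1]
    phi = np.linalg.solve(R, r)
    return phi[-1]

z = [1.2, 0.8, 1.5, 0.9, 1.1, 1.4, 0.7, 1.3, 1.0, 0.6]
print([round(pacf_at_lag(z, k), 3) for k in range(1, 4)])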
2.7. Diagnostic Checking of Linear Time Series Models
Diagnostic checking is applied with the objective of uncovering a possible lack of fit of the tentative model and, where possible, identifying its cause. If no lack of fit is indicated, the model is ready for use. Otherwise, if any inadequacy is found, the iterative cycle of identification, estimation and diagnostic checking is repeated until a suitable and appropriate representation is obtained.
Once the parameters of the tentative models have been estimated, we check whether or not the residuals obtained from the estimated equation are approximately white noise. This is done by examining the ACF and PACF of the residuals to see whether they are statistically insignificant, that is, whether they lie within approximately two standard errors of zero, corresponding to a 5% level of significance. If the residuals are approximately white noise, the model may be entertained, provided the parameters are significantly different from zero.
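As an illustration of this residual check, the ACF and PACF of the residual series can be plotted together with the approximate two-standard-error bands. The sketch below assumes matplotlib and statsmodels; the simulated residuals are only a stand-in for the residuals of an actual fitted model.

import numpy as np
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

# Stand-in residual series; in practice, use the residuals of the fitted ARIMA model.
resid = np.random.default_rng(7).normal(size=500)

fig, axes = plt.subplots(2, 1, figsize=(8, 6))
plot_acf(resid, lags=24, alpha=0.05, ax=axes[0])     # bands are approximately +/- 2/sqrt(T)
plot_pacf(resid, lags=24, alpha=0.05, ax=axes[1])
plt.tight_layout()
plt.show()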
The Portmanteau lack-of-fit test uses the residual sample ACFs as a unit to check the joint null hypothesis that several autocorrelations of $a_t$ are simultaneously zero. [17] proposed the Portmanteau statistic

$Q^{*}(m) = T\sum_{k=1}^{m}\hat{\rho}_k^{2}$, (17)

where $T$ is the number of observations. It is a test statistic for the null hypothesis $H_0: \rho_1 = \cdots = \rho_m = 0$ against the alternative $H_a: \rho_k \neq 0$ for some $k \in \{1, \ldots, m\}$. Under the assumption that $\{a_t\}$ is an i.i.d. sequence with certain moment conditions, $Q^{*}(m)$ is asymptotically a Chi-square random variable with $m$ degrees of freedom.
[22] modified the $Q^{*}(m)$ statistic to increase the power of the test in finite samples as follows:

$Q(m) = T(T+2)\sum_{k=1}^{m}\frac{\hat{\rho}_k^{2}}{T-k}$, (18)

where $T$ is the number of observations.
The decision rule is to reject $H_0$ if $Q(m) > \chi_{1-\alpha}^{2}$, where $\chi_{1-\alpha}^{2}$ denotes the $100(1-\alpha)$-th percentile of a Chi-squared distribution with $m - (p + q)$ degrees of freedom. $H_0$ can also be rejected if the p-value is less than or equal to $\alpha$, the significance level.
In practice, the selection of $m$ may affect the performance of the $Q(m)$ statistic. The choice $m \approx \ln(T)$ provides better power performance [4].
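Equation (18) can also be computed directly from the residual autocorrelations. The sketch below (assuming numpy and scipy) implements Q(m) and its p-value; it is a simplified illustration under those assumptions, not the exact procedure used in the study.

import numpy as np
from scipy.stats import chi2

def ljung_box(residuals, m, fitted_df=0):
    # Equation (18): Q(m) = T(T+2) * sum_{k=1}^{m} rho_hat_k^2 / (T - k)
    e = np.asarray(residuals, dtype=float)
    T = len(e)
    e_bar = e.mean()
    gamma0 = np.sum((e - e_bar) ** 2) / T
    rho = np.array([np.sum((e[: T - k] - e_bar) * (e[k:] - e_bar)) / T / gamma0
                    for k in range(1, m + 1)])
    q = T * (T + 2) * np.sum(rho ** 2 / (T - np.arange(1, m + 1)))
    df = m - fitted_df            # use m - (p + q) when testing ARMA(p, q) residuals
    return q, chi2.sf(q, df)

rng = np.random.default_rng(0)
resid = rng.normal(size=500)      # stand-in for the residuals of a fitted model
for m in (1, 4, 8, 24):
    q, p = ljung_box(resid, m)
    print(f"m = {m:>2}: Q = {q:6.2f}, p-value = {p:.3f}")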
3. Results and Discussion
3.1. Dataset
Data collection was based on a secondary source, the records of the Nigerian Stock Exchange. The data on the daily closing share prices of the sampled banks (Union Bank, Unity Bank and Wema Bank) from January 3, 2006 to November 24, 2016 were obtained from the Nigerian Stock Exchange [23] and delivered through contactcentre@nigerianstockexchange.com.
3.2. Interpretation of Time Plot
Figures 1-3 present the share price series for the three banks. The share prices of all the banks do not fluctuate around a common mean, which clearly indicates the presence of a stochastic trend in the share prices and is an indication of non-stationarity. Since the share price series are found to be non-stationary, the first difference of the natural logarithm of each share price series is taken to obtain a stationary (returns) series. The log transformation is included to stabilize the variance. Figures 4-6 show that the returns series appear to be stationary.
Figure 1. Share price series of Union Bank of Nigeria.
Figure 2. Share price series of Unity Bank.
Figure 3. Share price series of Wema Bank.
Figure 4. Return series of Union Bank of Nigeria.
3.3. Building Autoregressive Integrated Moving Average (ARIMA) Model
3.3.1. Building Autoregressive Integrated Moving Average (ARIMA) Model of Union Bank of Nigeria
1) Model identification
From Figure 7 and Figure 8, both the ACF and the PACF indicate that a mixed model could be entertained. The following models were considered tentatively: ARIMA(1,1,0), ARIMA(0,1,1) and ARIMA(1,1,1).
Figure 7. ACF of return series of Union Bank of Nigeria.
Figure 8. PACF of return series of Union Bank of Nigeria.
2) Estimation of parameters
From Table 1, the ARIMA(1,1,0) model is selected on the grounds of parameter significance and minimum AIC.
3) Diagnostic checking of the model
From Figure 9 and Figure 10, all the ACF and PACF coefficients at the various lags lie within the significance bands, that is, they are not significantly different from zero, implying that the residual series of the ARIMA(1,1,0) model appears to be a white noise series; that is, the series is independent and identically distributed with mean zero and constant variance.
Evidence from the Ljung-Box Q-statistics in Table 2 shows that the ARIMA(1,1,0) model is adequate at the 5% level of significance given the Q-statistics at lags 1, 4, 8 and 24. That is, the hypothesis of no autocorrelation is not rejected, thus confirming the independence of the residual series.
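The model building and diagnostic checking reported in this subsection can be reproduced along the following lines. The sketch assumes the statsmodels library; share_prices is a hypothetical pandas Series standing in for the Union Bank closing prices, and the lags mirror those reported in Table 2.

import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.stats.diagnostic import acorr_ljungbox

# Hypothetical stand-in for the daily closing share price series.
rng = np.random.default_rng(1)
share_prices = pd.Series(10 * np.exp(np.cumsum(rng.normal(0, 0.02, 500))))
log_prices = np.log(share_prices)

# Candidate models considered tentatively: ARIMA(1,1,0), ARIMA(0,1,1) and ARIMA(1,1,1).
candidates = [(1, 1, 0), (0, 1, 1), (1, 1, 1)]
fits = {order: ARIMA(log_prices, order=order).fit() for order in candidates}
for order, fit in fits.items():
    print(order, "AIC =", round(fit.aic, 2))

# Diagnostic checking of the selected model, here ARIMA(1,1,0):
resid = fits[(1, 1, 0)].resid
# p-values above 0.05 suggest adequacy; degrees of freedom may be adjusted
# for the number of fitted ARMA parameters where appropriate.
print(acorr_ljungbox(resid, lags=[1, 4, 8, 24]))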
3.3.2. Building Autoregressive Integrated Moving Average (ARIMA) Model of Unity Bank
1) Model identification
From Figure 11 and Figure 12, both the ACF and the PACF indicate that a mixed model could be entertained. The following models were considered tentatively: ARIMA(1,1,0), ARIMA(0,1,1) and ARIMA(1,1,1).
2) Estimation of parameters
From Table 3, the ARIMA(1,1,0) model is selected on the grounds of parameter significance and minimum AIC.
3) Diagnostic checking of the model
From Figure 13 and Figure 14, all the ACF and PACF coefficients at the various lags, except at lag 9, lie within the significance bands, that is, they are not significantly different from zero, implying that the residual series of the ARIMA(1,1,0) model appears to be a white noise series; that is, the series is independent and identically distributed with mean zero and constant variance.
Evidence from the Ljung-Box Q-statistics in Table 4 shows that the ARIMA(1,1,0) model is adequate at the 5% level of significance given the Q-statistics at lags 1, 4, 8 and 24. That is, the hypothesis of no autocorrelation is not rejected, hence confirming the independence of the residual series.
3.3.3. Building Autoregressive Integrated Moving Average (ARIMA) Model of Wema Bank
1) Model identification
From Figure 15 and Figure 16, both the ACF and the PACF indicate that a mixed model could be entertained. The following models were considered tentatively: ARIMA(1,1,0), ARIMA(2,1,0), ARIMA(0,1,2) and ARIMA(2,1,1).
2) Estimation of parameters
From Table 5, the ARIMA(2,1,0) model is selected on the grounds of parameter significance and minimum AIC.
3) Diagnostic checking of the model
From Figure 17 and Figure 18, all the ACF and PACF coefficients at the various lags lie within the significance bands, that is, they are not significantly different from zero, implying that the residual series of the ARIMA(2,1,0) model appears to be a white noise series; that is, the series is independent and identically distributed with mean zero and constant variance.
Evidence from the Ljung-Box Q-statistics in Table 6 shows that the ARIMA(2,1,0) model is adequate at the 5% level of significance given the Q-statistics at lags 1, 4, 8 and 24. That is, the hypothesis of no autocorrelation is not rejected, thus confirming the independence of the residual series.
Table 1. ARIMA models for return series of Union Bank of Nigeria.
Source: output of data analysis.
Table 2. Ljung-Box test on ARIMA(1,1,0) model for return series of Union Bank of Nigeria.
Source: output of data analysis.
Table 3. ARIMA models for return series of Unity Bank.
Source: output of data analysis.
Table 4. Ljung-Box test on ARIMA(1,1,0) model for return series of Unity Bank.
Source: output of data analysis.
Table 5. ARIMA models for return series of Wema Bank.
Source: output of data analysis.
Table 6. Ljung-Box test on ARIMA(2,1,0) model for return series of Wema Bank.
Source: output of data analysis.
Source: output of data analysis.
Figure 9. ACF of residuals of ARIMA(1,1,0) model fitted to the return series of Union Bank of Nigeria.
Figure 10. PACF of residuals of ARIMA(1,1,0) model fitted to the return series of Union Bank of Nigeria.
Figure 11. ACF of return series of Unity Bank.
Figure 12. PACF of return series of Unity Bank.
Figure 13. ACF of residuals of ARIMA(1,1,0) model fitted to the return series of Unity Bank.
Figure 14. PACF of residuals of ARIMA(1,1,0) model fitted to the return series of Unity Bank.
Figure 15. ACF of return series of Wema Bank.
Figure 16. PACF of return series of Wema Bank.
Figure 17. ACF of residual series of ARIMA(2,1,0) model fitted to the return series of Wema Bank.
Figure 18. PACF of residual series of ARIMA(2,1,0) model fitted to the return series of Wema Bank.
So far, the residual series of the selected models for the three banks considered have been analyzed and found to follow a white noise process, which satisfies the aim of our study. The study further agrees with the works of [7] - [17] that model adequacy could be measured by white noise processes through the ACF, PACF and Ljung-Box test, but differs in that it considers the return series of Nigerian banks.
4. Conclusion
In summary, our study showed that model adequacy could be measured by the white noise process through the ACF, PACF and Ljung-Box test. The role of the white noise process in checking model adequacy was properly appraised, and it was confirmed that modeling a white noise process satisfies all the conditions for stationarity (independence). However, the failure to apply the overfitting approach to model adequacy is one weakness of this study, and it is recommended that further studies be extended to cover overfitting.