Modelling Stochastic Volatility in the Kenyan Securities Market Using Hidden Markov Models

Matilda B. Bosire; Samuel Chege Maina

doi:10.4236/jfrm.2021.103021

Journal of Financial Risk Management > Vol.10 No.3, September 2021

Strathmore University, Institute of Mathematical Sciences, Nairobi, Kenya.

Abstract

This paper models stochastic volatility in the Kenyan securities market using Hidden Markov Models. The univariate stochastic volatility model is calibrated to daily data on the Nairobi Securities Exchange 20 share index from January 2012 to February 2021. The Hidden Markov Model (HMM) is employed to establish volatility regimes, while the Expectation-Maximization (EM) algorithm is applied in parameter estimation. Markov Chain Monte Carlo (MCMC) and Sequential Monte Carlo (SMC) techniques are employed to filter out noisy observations in parameter estimation. The 4-state model, which divides the economy into periods of very high, high, low, and very low volatility, is established to be optimal.

Keywords

Hidden Markov Models, Stochastic Volatility, Nairobi Securities Exchange 20 (NSE 20) Share Index, Volatility Regimes

Share and Cite:

Bosire, M. and Maina, S. (2021) Modelling Stochastic Volatility in the Kenyan Securities Market Using Hidden Markov Models. Journal of Financial Risk Management, 10, 367-395. doi: 10.4236/jfrm.2021.103021.

1. Introduction

The probability that stock prices will rise or decline is an increasing function of volatility, which in turn leads to an increase in the value of options. The use of volatility as a proxy for risk has increased the need to model and forecast volatility accurately, which is vital for a range of applications including financial asset pricing, hedging strategies, portfolio selection and asset management. The Black-Scholes model, as a forerunner of the option pricing framework, is still widely used in the financial market. The model assumes that continuously compounded log spot asset prices are normally distributed with a constant mean and variance.

However, empirical studies have shown that this is not always the case: market prices exhibit peakedness and fat tails, and the constant-variance assumption does not fit market-realistic models, particularly for volatility-backed financial assets. Due to this major shortcoming, researchers have developed asset pricing models that allow the volatility to be heteroskedastic, in some cases conditioning it on prespecified criteria. Allowing for a time-varying variance has proven to be a significant improvement to the modeling dynamics of various financial assets and macroeconomic factors, with applications present in the modeling of inflation (Chan, 2013), foreign exchange (Ahlip, 2008), volatility (Maqsood, Safdar, Shafi, & Lelit, 2017) and macroeconomic forecasting (Clark & Ravazzolo, 2015). Stochastic volatility models specify separate stochastic differential equations for the stock price and for the volatility, each driven by its own risk process. While local volatility models are a valuable simplification of time-varying volatility, they tend to produce a flattened forward implied volatility and have generally shown an inability to price complex structured products close to their market prices.

Various studies have suggested the use of Hidden Markov Models (HMMs) in the estimation of standard and non-standard stochastic volatility models. Hidden Markov models describe the underlying state of the economy by encoding all information in the financial markets into a single (state) process (Krishnamurthy, Leoff, & Sass, 2018). They have their applications in the determination of financial time series data regimes and amplification of signals for momentum indicators aimed at improving investment decision-making.

A Hidden Markov Model (HMM) is a statistical model in which the system being modelled is assumed to be a Markov process with unobservable (hidden) states. The state process, which may take finitely or infinitely many values, is assumed to be Markovian and is only partially observable through the observation process, whose behaviour depends only on the current state. HMMs have further applications in econometrics, signal processing, pattern recognition, computational finance and bioinformatics. Financial signal processing techniques have mainly been employed in technical analysis by quantitative analysts, for robust asset selection and for high-frequency trading that seeks to gain from short-lived random fluctuations in asset market prices. This entails analysis of key behaviours and patterns within financial markets over time for prediction purposes.

In Kenya, Hidden Markov models have been employed in modelling daily rainfall occurrence (Nyongesa, Zeng, & Ongoma, 2020), estimation of malaria symptom data set (Mbete, Nyongesa, & Rotich, 2019), credit scoring for the poor and unbanked financial consumers (Bundi et al., 2016) and calibration and state estimation for the Vasicek term-structure model to the evolution of interest rates (Chelimo, 2017).

The purpose of this paper is to model stochastic volatility using a hidden Markov model in the Kenyan economy. The question arises as to how to model volatility in a way that is consistent with the market-observed variation of non-constant volatility together with identifying hidden states that drive the volatility process. We consider the univariate stochastic volatility model when modelling volatility, as it is simple but flexible enough to incorporate volatility stylized facts together with allowing for non-linear modelling of the latent state space.

The HMM is employed in identifying hidden factors and possible volatility regimes in the volatility process. The Markov property of the HMM allows for computationally viable algorithms as well as a general enough structure that can model complex behavior. In the long run, volatility is a necessary prediction tool for investors in asset pricing and portfolio management. This study seeks to create a theoretical foundation for the application of HMMs in stochastic volatility modelling to the growing financial sector in Kenya and other frontier markets.

The rest of this paper is structured as follows: Section 2 provides a mathematical definition of the univariate stochastic volatility model, the Hidden Markov Model (HMM), the Expectation-Maximization (EM) algorithm, and the Markov Chain Monte Carlo (MCMC) and Sequential Monte Carlo (SMC) techniques employed in filtering out noisy observations in parameter estimation. Section 3 gives the calibration results of the univariate Stochastic Volatility (SV) model under the different stated filtering techniques, while Section 4 provides a discussion on the effects of macroeconomic variables on volatility regime switches, together with an analysis of the effects of changes in states/regimes on parameter estimates under the stated filtering techniques. Finally, Section 5 provides a conclusion and limitations of the study, as well as suggestions for further research.

2. Model Description

2.1. The Stochastic Volatility Model

Volatility can be modelled probabilistically through state-space models where the logarithm of squared volatilities (latent states) follows an AR(1) process (Andersen, Davis, Kreiß, & Mikosch, 2009; Kastner, 2016). The log return of the asset price, $P_{t}$ , is defined by $r_{t} = \log \frac{P_{t}}{P_{t - 1}}$ and admits the univariate SV model:

$r_{t} = σ_{t} ε_{t}$ (1)

where $ε_{t}$ is a Gaussian white noise process with zero mean and unit variance, and $σ_{t} > 0$ is a stochastic process representing the volatility of $r_{t}$ . Based on empirical results, $σ_{t}^{2}$ follows a log-normal distribution. As such, there exists a random variable $z_{t} = \log σ_{t}^{2}$ , which is normally distributed, and which reduces Equation (1) to:

$r_{t} = \exp (\frac{z_{t}}{2}) ε_{t}$ (2)

Traditionally, $z_{t}$ is assumed to follow an AR(1) process with Gaussian white-noise innovations:

$z_{t} = ϕ z_{t - 1} + c + ω_{t}$ (3)

where $ϕ$ and c are constants, while $ω_{t} ~ N (0, Q)$ and $ε_{t}$ are mutually independent. Additionally, if $| ϕ | < 1$ , the above process is wide-sense stationary.

When a scaling factor, $β$ , is introduced into Equation (2) to remove the constant term c in Equation (3), we obtain the canonical SV model for the log return:

$z_{t} = ϕ z_{t - 1} + ω_{t},$ (4)

$r_{t} = β \exp (\frac{z_{t}}{2}) ε_{t},$ (5)

with initial state $z_{0}$ .
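As an illustration of Equations (4) and (5), the canonical SV model can be simulated directly. The sketch below is a minimal example and not the calibration code used in this study; the function name `simulate_sv`, the parameter values and the choice $z_{0} = 0$ are assumptions made for illustration only.

```python
import numpy as np

def simulate_sv(n, phi, Q, beta, z0=0.0, seed=0):
    """Simulate z_t = phi*z_{t-1} + omega_t and r_t = beta*exp(z_t/2)*eps_t."""
    rng = np.random.default_rng(seed)
    omega = rng.normal(0.0, np.sqrt(Q), size=n)  # process noise with variance Q
    eps = rng.standard_normal(n)                 # measurement noise, unit variance
    z = np.empty(n)
    prev = z0
    for t in range(n):
        prev = phi * prev + omega[t]             # state equation (4)
        z[t] = prev
    r = beta * np.exp(z / 2.0) * eps             # observation equation (5)
    return z, r

# Hypothetical parameter values, chosen only so that |phi| < 1.
z, r = simulate_sv(n=2000, phi=0.95, Q=0.05, beta=0.01)
```

Simulated paths of this kind are useful for checking an estimation routine against known parameter values before applying it to market data.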

Equations (4) and (5) are the state model and the measurement/observation model, respectively. These are a particular type of stochastic non-linear state space models, where $z_{t}$ is the unobserved state variable commonly interpreted as the latent time-varying volatility process, $r_{t}$ is the output of the model which is the return process in this case, $ω_{t}$ and $ε_{t}$ are considered as the process noise and the measurement noise respectively, and $θ = {ϕ, Q, β}$ are the model parameters. Squaring Equation (5) and introducing a logarithm reduces the equation to:

$x_{t} = α + z_{t} + v_{t},$ (6)

where:

$x_{t} = \log (r_{t}^{2}),$

$α = \log (β^{2}) + E [\log (ε_{t}^{2})],$

$v_{t} = \log (ε_{t}^{2}) - E [\log (ε_{t}^{2})] \sim \log (χ_{1}^{2}) - E [\log (χ_{1}^{2})] .$

This yields a univariate stochastic volatility model:

$z_{t} = ϕ z_{t - 1} + ω_{t},$ (7)

$x_{t} = α + z_{t} + v_{t},$ (8)

where $ω_{t} \sim N (0, Q)$ and $v_{t} \sim \log (χ_{1}^{2}) - E [\log (χ_{1}^{2})]$ .

The vector of parameters is $θ = {α, ϕ, Q}$ , representing the level of log-variance, persistence of log-variance and volatility of log-variance respectively. We then apply the method of moments (Kim, 2005) to generate consistent estimates for a linear system:

$α^{(0)} = \frac{\sum_{t = 0}^{n} x_{t}}{n},$

$ϕ^{(0)} = \frac{\sum_{t = 3}^{n} (x_{t} - {\bar{x}}_{t}) (x_{t - 2} - {\bar{x}}_{t - 2})}{\sum_{t = 3}^{n} (x_{t - 1} - {\bar{x}}_{t - 1}) (x_{t - 2} - {\bar{x}}_{t - 2})},$

$Q^{(0)} = \frac{\sum_{t = 2}^{n} \left[ (x_{t} - {\bar{x}}_{t}) - ϕ (x_{t - 1} - {\bar{x}}_{t - 1}) \right]^{2}}{n} - \widehat{\mathrm{var}} (v_{t}) - ϕ^{2} \widehat{\mathrm{var}} (v_{t - 1}) .$
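The moment estimates can be sketched in a few lines. This is a hedged illustration, not the authors' implementation. It assumes that the sample mean of $x_{t}$ is used for every ${\bar{x}}_{t}$, that the residual in $Q^{(0)}$ enters squared, and that $\mathrm{var} (v_{t})$ is replaced by its theoretical value $π^{2} / 2$ for a $\log (χ_{1}^{2})$ variable. The name `sv_moment_estimates` and the small floor `eps_floor`, which guards against zero returns, are hypothetical.

```python
import numpy as np

VAR_LOG_CHI2 = np.pi ** 2 / 2.0  # theoretical variance of log(chi^2_1)

def sv_moment_estimates(r, eps_floor=1e-12):
    """Method-of-moments starting values (alpha, phi, Q) from returns r."""
    x = np.log(r ** 2 + eps_floor)       # x_t = log(r_t^2); floor is an assumption
    d = x - x.mean()                     # deviations from the sample mean
    alpha0 = x.mean()
    # Ratio of lag-2 to lag-1 autocovariances, both paired with x_{t-2}
    phi0 = np.sum(d[2:] * d[:-2]) / np.sum(d[1:-1] * d[:-2])
    resid = d[1:] - phi0 * d[:-1]
    # Remove the contribution of the measurement noise v_t and v_{t-1}
    Q0 = np.mean(resid ** 2) - VAR_LOG_CHI2 * (1.0 + phi0 ** 2)
    return alpha0, phi0, Q0
```

In short samples `Q0` can come out negative, in which case a small positive starting value would have to be substituted before any iterative estimation.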

2.2. Hidden Markov Model

We consider a bivariate probabilistic HMM that consists of a state process $z_{k}$ , which is discrete and takes its values from some finite set $Z$ equipped with a countably generated σ-algebra $Ω$ , and the observation process $x_{k}$ from another finite space $X$ with corresponding σ-algebra $Λ$ , such that ( $Z$ , $Ω$ ) and ( $X$ , $Λ$ ) are measurable spaces. $z_{k}$ are considered to be latent (hidden) variables governed by $P r (x_{k} | z_{k})$ and $P r (z_{k} | z_{k - 1})$ .

The joint distribution of $x_{n}$ and $z_{n}$ is such that:

$P r (x_{1}, \dots, x_{n}, z_{1}, \dots, z_{n}) = P r (z_{1}) P r (x_{1} | z_{1}) \prod_{k = 2}^{n} P r (z_{k} | z_{k - 1}) P r (x_{k} | z_{k}),$

with stationary transitional probabilities:

$T_{(i, j)} = P r (z_{k + 1} = j | z_{k} = i) \forall i, j \in (1, \dots, n),$

emission probability:

$ξ_{i} (x) = P r (x_{k} | z_{k} = i) \forall i \in (1, \dots, n),$

and an initial distribution:

$π (i) = P r (z_{1} = i) \forall i \in (1, \dots, n) .$

These reduce the joint distribution to:

$P r (x_{1}, \dots, x_{n}, z_{1}, \dots, z_{n}) = π (z_{1}) ξ_{z_{1}} (x_{1}) \prod_{k = 2}^{n} T_{(z_{k - 1}, z_{k})} ξ_{z_{k}} (x_{k}) .$
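The factorization can be checked numerically for a small discrete HMM. The following sketch evaluates the log of the joint probability for a given state path and observation sequence. The function name `hmm_log_joint`, the emission matrix `B` (with `B[i, x]` standing for $ξ_{i} (x)$) and all numerical values are hypothetical and serve only as an example.

```python
import numpy as np

def hmm_log_joint(states, obs, pi, T, B):
    """log Pr(x_1..x_n, z_1..z_n) for a discrete HMM.

    pi[i]   : initial probability of state i
    T[i, j] : transition probability from state i to state j
    B[i, x] : emission probability of symbol x in state i
    """
    logp = np.log(pi[states[0]]) + np.log(B[states[0], obs[0]])
    for k in range(1, len(states)):
        logp += np.log(T[states[k - 1], states[k]])  # transition term
        logp += np.log(B[states[k], obs[k]])         # emission term
    return logp

# Hypothetical two-state, two-symbol example.
pi = np.array([0.5, 0.5])
T = np.array([[0.9, 0.1], [0.2, 0.8]])
B = np.array([[0.7, 0.3], [0.4, 0.6]])
logp = hmm_log_joint([0, 0, 1], [0, 1, 1], pi, T, B)
```

For this path the product is 0.5 × 0.7 × 0.9 × 0.3 × 0.1 × 0.6 = 0.00567, which is the value recovered by `np.exp(logp)`.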

Following the formal definition of Bickel, Ritov, & Ryden (1998), the stochastic process $(x_{n}, n \geq 1)$ with state space $(X, Λ)$ is a hidden Markov model if the following hold:

1) We are given, but do not observe, a strictly stationary Markov Chain $z_{1}, \dots, z_{n}$ with state space $(Z, Ω)$ .

2) For all n given $(z_{1}, \dots, z_{n})$ , the $(x_{i}, i = 1, \dots, n)$ are conditionally independent and the conditional distribution of $x_{i}$ depends only on $z_{i}$ .

3) The conditional distribution of $x_{i}$ given $z_{i}$ does not depend on i.

We assume that the process $(x_{i}, i \geq 1)$ is strictly stationary, and if the hidden Markov chain $(z_{i}, i \geq 1)$ is ergodic, then $(x_{i}, i \geq 1)$ is also ergodic.

2.2.1. Expectation-Maximization Algorithm

The EM algorithm is employed to find (local) maximum likelihood estimates of parameters where the model depends on unobserved latent variables and the likelihood equations cannot be solved directly. The EM iteration alternates between two steps. The expectation (E) step constructs the expectation of the complete-data log-likelihood over the smoothing distribution, evaluated using the current parameter estimates. The maximization (M) step re-estimates the parameters by maximizing the expected log-likelihood found in the E-step. These parameter estimates are then used to determine the distribution of the latent variables in the next E-step.

Given the complete data-set, $\{x, z\}$ , of observed data and unobserved data respectively, and a vector of unknown parameters, $θ$ , along with a complete-data log-likelihood function, $L (θ; x, z) = \log p (x, z | θ)$ , then:

E-Step:

The E-step of the EM algorithm computes the expected value of $L (θ; x, z)$ given the observed data, x, and the current parameter estimate, $θ_{t}$ . In particular, we define:

$Q (θ; θ_{t}) : = E [L (θ; x, z) | x, θ_{t}] = \int L (θ; x, z) p (z | x, θ_{t}) d z,$

where $p (z | x, θ_{t})$ is the conditional density of z given the observed data, x, and assuming $θ = θ_{t}$ .

M-Step:

The M-step consists of maximizing over $θ$ the expectation step computed above. That is, we set:

$θ_{t + 1} = \arg \max_{θ} Q (θ; θ_{t}) .$

We then set $θ_{t} = θ_{t + 1}$ and repeat until $θ_{t}$ converges. No iteration decreases the observed-data likelihood, so under standard regularity conditions the sequence converges to a local maximum or stationary point of the likelihood. The idea is to first initialize the parameters $θ$ to some (random) values, compute the conditional distribution $P r (z | x, θ)$ , then use it to compute better estimates of the parameters $θ$ . This is performed iteratively until convergence, which is achieved when the parameter estimates become stable and no further improvements can be made to the likelihood value. The resulting value of $θ$ is then taken as the ML estimate, bearing in mind that it corresponds to a local maximum.
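For an HMM with Gaussian emissions, the E-step and M-step take the familiar Baum-Welch form. The sketch below is a minimal illustration under that assumption and is not the routine used to produce the results in Section 3. The function names, the initial values and the synthetic data are hypothetical.

```python
import numpy as np

def gaussian_pdf(x, mu, sd):
    return np.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

def forward_backward(x, pi, T, mu, sd):
    """Scaled forward-backward pass: smoothing probabilities and log-likelihood."""
    n, m = len(x), len(pi)
    B = gaussian_pdf(x[:, None], mu[None, :], sd[None, :])  # (n, m) emission densities
    a = np.zeros((n, m))
    b = np.ones((n, m))
    c = np.zeros(n)                                         # scaling constants
    a[0] = pi * B[0]
    c[0] = a[0].sum()
    a[0] /= c[0]
    for k in range(1, n):
        a[k] = (a[k - 1] @ T) * B[k]
        c[k] = a[k].sum()
        a[k] /= c[k]
    for k in range(n - 2, -1, -1):
        b[k] = (T @ (B[k + 1] * b[k + 1])) / c[k + 1]
    gamma = a * b                                           # Pr(z_k = i | x)
    # xi[k, i, j] = Pr(z_k = i, z_{k+1} = j | x)
    xi = (a[:-1, :, None] * T[None] * (B[1:] * b[1:])[:, None, :]
          / c[1:, None, None])
    return gamma, xi, np.log(c).sum()

def em_gaussian_hmm(x, m, n_iter=50):
    """EM (Baum-Welch) for an m-state HMM with Gaussian emissions."""
    pi = np.full(m, 1.0 / m)
    T = np.full((m, m), 0.1 / (m - 1))
    np.fill_diagonal(T, 0.9)                                # persistent initial guess
    mu = np.quantile(x, (np.arange(m) + 0.5) / m)
    sd = np.full(m, x.std())
    logliks = []
    for _ in range(n_iter):
        gamma, xi, ll = forward_backward(x, pi, T, mu, sd)  # E-step
        logliks.append(ll)
        pi = gamma[0]                                       # M-step updates
        T = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
        w = gamma.sum(axis=0)
        mu = (gamma * x[:, None]).sum(axis=0) / w
        sd = np.sqrt((gamma * (x[:, None] - mu) ** 2).sum(axis=0) / w)
        sd = np.maximum(sd, 1e-6)                           # guard against collapse
    return pi, T, mu, sd, np.array(logliks)

# Synthetic two-regime data (hypothetical): blocks of 100 points per regime.
rng = np.random.default_rng(0)
true_states = np.repeat([0, 1, 0, 1, 0, 1], 100)
x = rng.normal(np.array([0.0, 5.0])[true_states], 1.0)
pi, T, mu, sd, logliks = em_gaussian_hmm(x, m=2)
```

The sequence `logliks` should be non-decreasing from one iteration to the next, which offers a simple check that the two steps are implemented consistently.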

2.2.2. Filtering Techniques

Markov Chain Monte Carlo

Given a collection of observations $x_{1: t} : = (x_{1}, \dots, x_{t})$ , inference is made with regard to the parameter set, $θ$ , and the states, $z_{1: t} : = (z_{1}, \dots, z_{t})$ . In the Bayesian framework, inference relies on the posterior density:

$p (θ, z_{1: t} | x_{1: t}) = p (θ | x_{1: t}) p_{θ} (z_{1: t} | x_{1: t}),$

where:

$p (θ | x_{1: t}) = \frac{p_{θ} (x_{1: t}) p (θ)}{p (x_{1: t})}, \quad p_{θ} (z_{1: t} | x_{1: t}) = \frac{p_{θ} (z_{1: t}, x_{1: t})}{p_{θ} (x_{1: t})},$

with:

$p_{θ} (z_{1 : t}, x_{1 : t}) = φ_{θ} (z_{1}) \prod_{n = 2}^{t} f_{θ} (z_{n} | z_{n - 1}) \prod_{n = 1}^{t} g_{θ} (x_{n} | z_{n}) .$

It is feasible to design efficient MCMC algorithms for linear Gaussian and finite state space models where it is possible to sample from $p_{θ} (z_{1: t} | x_{1: t})$ and compute $p_{θ} (x_{1: t})$ . Common practice is to build MCMC kernel updating subblocks as:

$p_{θ} (z_{n : n + K - 1} | x_{1 : t}, z_{1 : n - 1}, z_{n + K : t}) \propto \prod_{m = n}^{n + K} f_{θ} (z_{m} | z_{m - 1}) \prod_{m = n}^{n + K - 1} g_{θ} (x_{m} | z_{m}) .$

The prior distribution of the parameter vector is specified by choosing independent priors for each parameter, that is, $p (θ) = p (α) p (ϕ) p (Q)$ . The level $α \in ℝ$ is equipped with a normal and uninformative prior $α \sim N (b_{α}, B_{α})$ , where it is common practice to set $b_{α} = 0$ and $B_{α} \geq 100$ for daily log-returns. For the persistence $ϕ \in (- 1,1)$ , $(ϕ + 1) / 2 \sim B (a_{0}, b_{0})$ , where $a_{0}$ and $b_{0}$ are hyperparameters and $B (a_{0}, b_{0})$ denotes the beta distribution. As for the volatility of the variance process, $Q \in ℝ^{+}$ with $Q \sim B_{Q} \cdot χ_{1}^{2} = G (1 / 2, 1 / (2 B_{Q}))$ , or equivalently a centered normal distribution $\pm \sqrt{Q} \sim N (0, B_{Q})$ (Frühwirth-Schnatter & Wagner, 2010).
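To make the prior specification concrete, the log prior density $\log p (θ)$ can be evaluated as the sum of three independent terms. The sketch below is an illustration only. The default hyperparameter values ($b_{α} = 0$, $B_{α} = 100$, $a_{0} = 20$, $b_{0} = 1.5$, $B_{Q} = 1$) are assumptions and are not taken from this study, and $B_{α}$ is treated as a variance.

```python
import math

def log_prior(alpha, phi, Q, b_alpha=0.0, B_alpha=100.0,
              a0=20.0, b0=1.5, B_Q=1.0):
    """log p(alpha) + log p(phi) + log p(Q) under independent priors."""
    if not (-1.0 < phi < 1.0) or Q <= 0.0:
        return -math.inf                       # outside the prior support
    # alpha ~ N(b_alpha, B_alpha), with B_alpha taken as a variance (assumption)
    lp_alpha = (-0.5 * math.log(2.0 * math.pi * B_alpha)
                - 0.5 * (alpha - b_alpha) ** 2 / B_alpha)
    # (phi + 1)/2 ~ Beta(a0, b0); the extra -log(2) is the Jacobian of the map
    u = (phi + 1.0) / 2.0
    log_beta_fn = math.lgamma(a0) + math.lgamma(b0) - math.lgamma(a0 + b0)
    lp_phi = ((a0 - 1.0) * math.log(u) + (b0 - 1.0) * math.log(1.0 - u)
              - log_beta_fn - math.log(2.0))
    # Q ~ B_Q * chi^2_1, i.e. Gamma(shape 1/2, rate 1/(2*B_Q))
    rate = 1.0 / (2.0 * B_Q)
    lp_Q = (0.5 * math.log(rate) - math.lgamma(0.5)
            - 0.5 * math.log(Q) - rate * Q)
    return lp_alpha + lp_phi + lp_Q
```

A function of this kind would be added to the log-likelihood inside an MCMC acceptance step; values of $ϕ$ outside $(- 1, 1)$ or non-positive $Q$ receive zero prior mass.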

Joint (batched) sampling of all instantaneous volatilities through the “All Without a Loop (AWOL)” approach is a key feature that enables a significant reduction in the correlation of the draws. For complex models such as volatility models, it is often possible only to sample from the prior, which raises the question of how to efficiently evaluate an MCMC sampler point-wise on these models. SMC methods provide approximations of $p_{θ} (z_{1: t} | x_{1: t})$ and $p_{θ} (x_{1: t})$ sequentially, and are applicable to general state-space HMMs.

Sequential Monte Carlo

SMC filters aim to estimate, recursively in time, the posterior distributions of hidden states of a Markov Process given some noisy and partial observations. They employ a set of particles (samples) to represent the posterior distribution and often assume the states, $z_{k}$ , and the observations, $x_{k}$ , can be modelled in the form:

• The state process, ${z_{0}, z_{1}, \dots}$ , is modelled as a Markov process on $ℝ^{d_{z}}$ (for some $d_{z} \geq 1$ ), with initial distribution $P r (z_{0})$ that evolves according to the transition probability density, $P r (z_{k} | z_{k - 1})$ .

• The observations, ${x_{0}, x_{1}, \dots}$ , take values in some state space on $ℝ^{d_{x}}$ (for some $d_{x} \geq 1$ ) and are conditionally independent provided that $z_{0}, z_{1}, \dots$ are known.

Under Bayes’ rule for conditional probability, we have the non-linear filtering equation:

$P r (z_{0}, \dots, z_{k} | x_{0}, \dots, x_{k}) = \frac{P r (x_{0}, \dots, x_{k} | z_{0}, \dots, z_{k}) P r (z_{0}, \dots, z_{k})}{P r (x_{0}, \dots, x_{k})},$

where:

$P r (x_{0}, \dots, x_{k}) = \int P r (x_{0}, \dots, x_{k} | z_{0}, \dots, z_{k}) P r (z_{0}, \dots, z_{k}) d z_{0} \dots d z_{k},$

$P r (x_{0}, \dots, x_{k} | z_{0}, \dots, z_{k}) = \prod_{i = 0}^{k} P r (x_{i} | z_{i}),$

$P r (z_{0}, \dots, z_{k}) = P r_{0} (z_{0}) \prod_{i = 1}^{k} P r (z_{i} | z_{i - 1}) .$

The non-linear filtering problem involves computing the conditional distribution, $P r (z_{k} | x_{0}, \dots, x_{k})$ , sequentially, and the filtering equation is given by the recursion:

$\overset{updating}{\to} P r (z_{k} | x_{0}, \dots, x_{k}) = \frac{P r (x_{k} | z_{k}) P r (z_{k} | x_{0}, \dots, x_{k - 1})}{\int P r (x_{k} | {z^{'}}_{k}) P r ({z^{'}}_{k} | x_{0}, \dots, x_{k - 1}) d {z^{'}}_{k}},$

$\overset{prediction}{\to} P r (z_{k + 1} | x_{0}, \dots, x_{k}) = \int P r (z_{k + 1} | z_{k}) P r (z_{k} | x_{0}, \dots, x_{k}) d z_{k}$

We assume $P r (x_{k} | z_{k})$ is continuous, with the convention $P r (z_{0} | x_{0}, \dots, x_{k - 1}) = P r (z_{0})$ for $k = 0$ .
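The prediction and updating steps can be approximated with a bootstrap particle filter applied to the linearized SV model of Equations (7) and (8). The sketch below is a minimal illustration under stated assumptions and is not the SMC implementation used in this study. It assumes known parameters, multinomial resampling at every step, and initialization from the stationary distribution of the AR(1) state. The function names and the default number of particles are hypothetical.

```python
import numpy as np

LOG_CHI2_MEAN = -1.2704  # E[log chi^2_1] = digamma(1/2) + log(2), rounded

def log_obs_density(x, z, alpha):
    """log Pr(x_k | z_k) for x = alpha + z + v, v = log(chi^2_1) - E[log chi^2_1]."""
    u = x - alpha - z + LOG_CHI2_MEAN            # u plays the role of log(eps^2)
    return 0.5 * u - 0.5 * np.exp(u) - 0.5 * np.log(2.0 * np.pi)

def bootstrap_filter(x, alpha, phi, Q, n_particles=1000, seed=0):
    """Filtered means E[z_k | x_0..x_k] and a log-likelihood estimate."""
    rng = np.random.default_rng(seed)
    # Initial particles from the stationary AR(1) law (requires |phi| < 1)
    z = rng.normal(0.0, np.sqrt(Q / (1.0 - phi ** 2)), size=n_particles)
    filt_mean = np.empty(len(x))
    loglik = 0.0
    for k, xk in enumerate(x):
        if k > 0:
            # Prediction: propagate particles through the state equation
            z = phi * z + rng.normal(0.0, np.sqrt(Q), size=n_particles)
        # Updating: weight particles by the observation density
        logw = log_obs_density(xk, z, alpha)
        shift = logw.max()
        w = np.exp(logw - shift)
        loglik += shift + np.log(w.mean())
        w /= w.sum()
        filt_mean[k] = np.sum(w * z)
        # Multinomial resampling to counter weight degeneracy
        z = z[rng.choice(n_particles, size=n_particles, p=w)]
    return filt_mean, loglik
```

Resampling at every step keeps the sketch short; in practice resampling is often triggered only when the effective number of particles falls below a threshold.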

3. Data Analysis

3.1. Data Description

This study uses secondary data, recorded daily, for the period January 2012 to February 2021¹ from the Nairobi Securities Exchange 20 (NSE20) Share index, which comprises twenty listed firms selected on quantitative financial merit. These firms represent 80% of the market capitalization and are therefore a good proxy for measuring the study variables. The companies are selected based on measures of trading activity such as market capitalization, number of shares quoted, profitability and dividend record. The constituent firms of the NSE20 share index are listed in Table 1 by sector.

Table 1. Constituent companies of the NSE20 share index.

The demeaned log returns vary around a mean of approximately 0.00, with periods of high and low variance, as shown in Figure 1. The greatest variance is observed in early 2014 in the plot of log returns, a pattern that is also visible in the rolling volatility plot in Figure 2 and the latent volatility quantile plots in Figure 3. The volatility evolves continuously.
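The quantities plotted in Figure 1 and Figure 2 can be computed from a price series along the following lines. This is a hedged sketch: the rolling window length is not stated in the text, so the default `window=30` is purely an assumption, and the function names are hypothetical.

```python
import numpy as np

def demeaned_log_returns(prices):
    """Log returns r_t = log(P_t / P_{t-1}) with the sample mean removed."""
    r = np.diff(np.log(np.asarray(prices, dtype=float)))
    return r - r.mean()

def rolling_volatility(returns, window=30):
    """Sample standard deviation over a trailing window (NaN until it fills)."""
    returns = np.asarray(returns, dtype=float)
    out = np.full(len(returns), np.nan)
    for t in range(window - 1, len(returns)):
        out[t] = returns[t - window + 1 : t + 1].std(ddof=1)
    return out
```

The first `window - 1` entries are left as NaN because a full trailing window is not yet available at those dates.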

Figure 1. Log returns for NSE20 share index daily price data for the period January 2012 to February 2021.

Figure 2. Rolling volatility for daily price data for the period January 2012 to February 2021.

Figure 3. Quantile plot of time against percentage latent volatilities for daily price data for the period January 2012 to February 2021.

In Figure 3, the volatility estimates are approximated using initial parameter estimates $α = - 0.0002$ , $ϕ = 0.75900$ and $Q / σ = 0.1$ . The volatility estimates are based on 0.05, 0.5 and 0.95 quantiles of latent volatility.

3.2. Determination of Regimes

3.2.1. 2-Regime Model

The two-state HMM allows for high and low volatility without any additional differentiation. Based on the findings, the 2-state process assumes that the log-volatility process starts at state 1 where state 2 represents high volatilities while state 1 represents low volatilities. The 2-state HMM filter² converges at 25 iterations with 7 degrees of freedom, a log-likelihood of −4742.109, AIC value 9498.217 and BIC value 9538.23.

The transitional matrix for a 2-state fitted HMM model, as seen in Table 2, depicts stable states, that is, a higher probability of remaining in the same state given by 0.989 and 0.973 for state 1 and state 2 respectively. There is a higher probability of staying in state 1 relative to state 2. There is a higher transitional probability from state 2 to state 1, 0.027, compared to the probability of moving from state 1 to state 2, 0.011.

Table 2. Transition matrix for the 2-state HMM.
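The persistence implied by these transition probabilities can be made concrete by computing the long-run share of time spent in each state and the expected length of a stay. The sketch below uses only the four probabilities quoted in the text, with rows indexing the current state; the variable names are hypothetical.

```python
import numpy as np

# Transition probabilities quoted in the text (rows: from state 1, from state 2)
T = np.array([[0.989, 0.011],
              [0.027, 0.973]])

# Stationary distribution: left eigenvector of T for the eigenvalue 1
vals, vecs = np.linalg.eig(T.T)
stat = np.real(vecs[:, np.argmax(np.real(vals))])
stat = stat / stat.sum()

# Expected number of consecutive days in each state: 1 / (1 - p_ii)
durations = 1.0 / (1.0 - np.diag(T))
```

Under these values the chain spends roughly 71% of the time in state 1 and 29% in state 2, with expected stays of about 91 and 37 trading days respectively.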

The descriptive statistics for parameter estimates of the Gaussian response variables (Resp) are presented in Table 3. State 1 has a lower mean (intercept) value of 6.926 and a lower standard deviation of 1.261, whereas state 2 has a higher mean value of 13.591 and a higher standard deviation of 5.323. This shows that the resulting model has two well-separated states, where state 2 has fast responses while state 1 has slow responses. Figure 4 is a graphical representation of state changes for the 2-state HMM model.

Figure 4. State changes for a 2-state HMM applied to rolling volatility data for the period January 2012 to February 2021.

Table 3. Descriptive statistics for the 2-state HMM.

3.2.2. 3-Regime Model

The three-state HMM provides for a high volatility regime, a low volatility regime and a transitional regime bridge between the high and low volatility regimes. The 3-state process assumes that the volatility process begins at state 1, where state 1 represents low volatilities, state 2 represents high volatilities and state 3 represents the transitional regime bridge between state 1 and state 2. The 3-state HMM filter converges at 52 iterations with 14 degrees of freedom, a log-likelihood of −3876.486, AIC value 7780.972 and BIC value 7860.997.

Given the transition matrix in Table 4, there is a high probability of remaining in each state, implying stable states. The probability of staying in state 1, 0.978, is the highest. There is no likelihood of moving from state 1 to state 2, and a similarly low probability, 0.003, of moving from state 2 to state 1.

Table 4. Transition matrix for the 3-state HMM.

Table 5 gives the descriptive statistics for the Gaussian response variables (Resp). State 2 has the highest response rate given that it has the highest response parameter estimates, while state 1 has the lowest response rate. The state changes for the 3-state HMM model are shown graphically in Figure 5.

Figure 5. State changes for a 3-state HMM applied to rolling volatility data for the period January 2012 to January 2021.

Table 5. Descriptive statistics for the 3-state HMM.

3.2.3. 4-Regime Model

The four-state HMM provides for very high and very low volatility regimes, and two intermediate states representing high and low volatility regimes. Given the findings, the 4-state process assumes that the log-volatility process starts at state 1. State 1 represents very low volatilities while state 4 represents low volatilities. State 2 represents high volatilities while state 3 represents very high volatilities. The 4-state HMM filter converges at 67 iterations with 23 degrees of freedom, a log-likelihood of −3223.031, AIC value of 6492.063 and BIC value of 6623.531.

The transition matrix given in Table 6 indicates a higher likelihood of staying in the same state. The highest probability is in staying in state 1 which represents very low volatility while there is no likelihood of moving from state 1 to state 2 or state 3. The parameter estimates of the Gaussian response variables (Resp) have their descriptive statistics as shown in Table 7. State 3 has the highest response rate given its high intercept and standard deviation values, while state has the lowest response rate.

Table 6. Transition probability matrix for the 4-state HMM.

Table 7. Descriptive statistics for the 4-state HMM.

The 4-state HMM model’s state changes are represented graphically by Figure 6.

Figure 6. State changes for a 4-state HMM applied to rolling volatility data for the period January 2012 to February 2021.

It has been empirically observed that the log-likelihood increases with the number of states, which is a result of the increased number of parameters to be estimated. The same is evidenced in this study, given the resultant log-likelihood values −4742.109, −3876.486 and −3223.031 for the 2-state, 3-state and 4-state models respectively. Due to the shortcomings of the log-likelihood in state estimation, we rely on the information criteria to determine the optimal number of states. Information criteria reward goodness of fit but impose a penalty that is an increasing function of the number of estimated parameters, which discourages model over-fitting. Based on the BIC, which imposes a greater penalty for additional parameters, the 4-state economy is optimal with the lowest BIC value of 6623.531, relative to the 9538.23 and 7860.997 values for the 2-state and 3-state models respectively.
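The information criteria can be recomputed from the reported log-likelihoods and degrees of freedom using AIC = −2 log L + 2k and BIC = −2 log L + k log n. In the sketch below, the sample size `n_obs = 2244` is not reported in the text. It is an assumption, back-solved so that the formula reproduces the reported BIC values, and should be read as such.

```python
import math

# (log-likelihood, degrees of freedom) as reported for each fitted HMM
fits = {2: (-4742.109, 7), 3: (-3876.486, 14), 4: (-3223.031, 23)}

n_obs = 2244  # ASSUMPTION: inferred from the reported BIC values, not stated in the text

def aic(loglik, k):
    return -2.0 * loglik + 2.0 * k

def bic(loglik, k, n):
    return -2.0 * loglik + k * math.log(n)

aic_values = {m: aic(ll, k) for m, (ll, k) in fits.items()}
bic_values = {m: bic(ll, k, n_obs) for m, (ll, k) in fits.items()}
best_by_bic = min(bic_values, key=bic_values.get)
```

Both criteria decrease as the number of states rises from two to four, so the penalty for the extra parameters is more than offset by the gain in log-likelihood over this range.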

3.3. Parameter Estimation

3.3.1. Markov Chain Monte Carlo (MCMC)

The single-regime model assumes that there are no regimes in the economy. As such, the model’s input is the entire data set used in the study, that is, a numeric vector of squared log returns without any missing values. Given the definitions of Kim (2005), the initial parameter estimates are: $θ = \{α \sim N (\mathrm{mean} (x_{t}), \mathrm{var} (x_{t})), (ϕ + 1) / 2 \sim B (a_{0}, b_{0}), Q = 0.1\}$ ³. The parameter estimates obtained for the single-state model are given in Table 8.

Table 8. Parameter estimates for the single state SV model under MCMC.

The standard deviations show that the parameter estimates deviate little from their mean values. The effective sample size (ESS) is a metric for how well a converged Markov chain has traversed the posterior space. Intuitively, it is the number of independent and identically distributed draws that would carry the same information as the correlated chain, with higher values indicating better mixing. Figure 7 summarizes the resultant simulation density plots of the Markov chains of the parameters, where the grey-dashed lines and black-solid lines represent prior and posterior densities respectively.
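One common way to estimate the ESS of a chain of length n is n / (1 + 2 Σ ρ_k), where ρ_k is the lag-k autocorrelation of the draws. The sketch below truncates the sum at the first non-positive autocorrelation, which is a simple rule chosen for illustration and may differ from the estimator behind Table 8. The function name is hypothetical.

```python
import numpy as np

def effective_sample_size(chain):
    """ESS = n / (1 + 2 * sum of initial positive autocorrelations)."""
    x = np.asarray(chain, dtype=float)
    n = len(x)
    d = x - x.mean()
    # Autocovariances at lags 0..n-1 (quadratic cost, adequate for short chains)
    acov = np.correlate(d, d, mode="full")[n - 1:] / n
    rho = acov / acov[0]
    tau = 1.0
    for lag in range(1, n):
        if rho[lag] <= 0.0:   # truncate at the first non-positive autocorrelation
            break
        tau += 2.0 * rho[lag]
    return n / tau
```

For a chain of nearly independent draws the estimate is close to n, while a strongly autocorrelated chain yields a much smaller value.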

Figure 7. MCMC density plots of parameter posteriors against their frequencies for the single state univariate stochastic volatility model.

The multi-regime SV model assumes 2-state, 3-state and 4-state volatility regimes with parameter estimates together with the simulation density plots of the Markov chains of each parameter under each regime presented and discussed below.

2-State Model

The 2-State model assumes that there are two regimes in the economy, where state 1 represents low volatility and state 2 represents high volatility as discussed under the determination of regimes Section (3.2.1). Parameter estimates for the 2-state model under the MCMC filtering technique are presented in Table 9.

Table 9. Parameter estimates for the 2-state SV model under MCMC.

State 1 has a lower mean value, $μ$ , and a higher variance of the volatility process, $σ$ , relative to state 2 in the 2-state HMM. The $ϕ$ values for the two states are close to each other and satisfy $| ϕ | < 1$ , which shows that both processes are stationary. Figure 8 presents the simulation density plots of the Markov chains of each univariate SV model parameter for the 2-state model.

Figure 8. Density plots of MCMC draws from parameter posteriors against their frequencies for state 1 and state 2 respectively.

3-State Model

The 3-state process assumes that state 1 models low volatilities, state 2 represents high volatilities and state 3 represents the transitional regime bridge between state 1 and state 2. Parameter estimates for each state process in the 3-state model are presented in Table 10.

State 1 has the highest mean value while state 3 has the lowest mean value. State 1 also has the highest variance of the volatility process, while state 2 has the lowest. The $ϕ$ values for the different states are close to each other and satisfy $| ϕ | < 1$ , indicating that all three processes are stationary. The simulation density plots of each parameter’s Markov chains for the 3-state model are illustrated by Figure 9.

Table 10. Parameter estimates for the 3-state SV model under MCMC.

Figure 9. Density plots of MCMC draws from parameter posteriors against their frequencies for state 1, state 2 and state 3 respectively.

4-State Model

Given the findings in Section (3.2.3), the 4-state process assumes that state 1 represents very low volatilities, state 4 represents low volatilities, state 2 represents high volatilities while state 3 represents very high volatilities. Parameter estimates for each of the 4 states are presented in Table 11.

Table 11. Parameter estimates for the 4-state SV model under MCMC.

State 2 has the highest mean value while state 4 has the lowest mean value. State 3 has the lowest variance of the volatility process while state 2 has the highest. The $ϕ$ values for the 4 states are close to each other and satisfy $| ϕ | < 1$ , which implies that all four processes are stationary. The simulation density plots of the Markov chains of each parameter under the 4-state model are given by Figure 10.

3.3.2. Sequential Monte Carlo (SMC)

Initial parameters for the single-regime model are $α \sim N (\mathrm{mean} (x_{t}), \mathrm{var} (x_{t}))$ , $(ϕ + 1) / 2 \sim B (a_{0}, b_{0})$ and $Q = 0.1$ . The algorithm takes time to converge for larger data sets. In theory, it converges when the number of particles and the number of iterations are sufficiently large. In practice, however, it is not feasible to use infinitely many particles or iterations. It is therefore important to select an appropriate number of particles and iterations when implementing the particle filter algorithm. The results for a maximum number of iterations M = length(observations) and a number of particles equal to length(observations) are shown in Table 12.

The single-state process takes on all the observations in the study. The $ϕ$ value, 0.9966, is less than one, which shows that the process is stationary.

Figure 11 shows sequential summaries of the parameter posteriors. Each panel plots the (0.05, 0.50, 0.95) posterior quantiles for the given parameter, with the true parameter values represented by the mid line. The density plots of the parameter estimates derived from the posterior distribution are shown in Figure 12.

The multi-regime SV model assumes 2-state, 3-state and 4-state volatility regimes. The parameter estimates and sequential summaries of the parameter posteriors under each regime are presented and discussed below. Each panel plots the (0.05, 0.50, 0.95) posterior quantiles and density plots for each parameter.

Figure 10. Density plots of MCMC draws from parameter posteriors against their frequencies for state 1, state 2, state 3 and state 4 respectively.

Table 12. Parameter estimates for the single state SV model under SMC.

Figure 11. SMC convergence of parameter estimates for the single-state univariate stochastic volatility model.

Figure 12. SMC density plots of posterior parameter estimates for the single state univariate stochastic volatility model.

2-State Model

Parameter estimates for the 2-state model under SMC are presented in Table 13. State 1 has a lower mean and variance of the volatility process value. Both processes are stationary as their $ϕ$ values are less than 1.

Figure 13 represents sequential summaries of the 0.05, 0.50 and 0.95 posterior quantiles for each parameter under the 2-state model, together with the density plots of the parameter estimates derived from the posterior distribution.

Figure 13. SMC convergence of parameter estimates and resultant parameter density plots for state 1 and state 2 respectively.

Table 13. Parameter estimates for the 2-state SV model under SMC.

3-State Model

The parameter estimates for the 3-state model under SMC, presented in Table 14, indicate that the highest mean value is in state 3, followed by state 1 and then state 2. State 1 has the highest variance of the volatility process. The values of $ϕ$ indicate that all three processes are stationary. Sequential summaries which depict convergence of the parameter estimates are presented in Figure 14, which additionally presents the density plots of the parameter estimates derived from the posterior distribution.

Figure 14. SMC convergence of parameter estimates and resultant parameter density plots for state 1, state 2 and state 3 respectively.

Table 14. Parameter estimates for the 3-state SV model under SMC.

4-State Model

The four-state HMM provides for very high and very low volatility regimes, and two intermediate states representing high and low volatility regimes. Parameter estimates for each of these state processes are presented in Table 15.

Table 15. Parameter estimates for the 4-state SV model under SMC.

State 4 has the highest mean value while state 2 has the highest variance of the volatility process. The values of $ϕ$ satisfy $| ϕ | < 1$, showing that all four processes are stationary. The 4-state model’s sequential summaries and convergence of parameter estimates, together with the density plots of the parameter estimates derived from the posterior distribution, are shown in Figure 15.

Figure 15. SMC convergence of parameter estimates and resultant parameter density plots for state 1, state 2, state 3 and state 4 respectively.

3.4. Forecasting Volatility

The HMM filter on rolling volatility indicates well separated volatility states for the 2-state, 3-state and 4-state regimes, and characterizes the transition process underlying the data. Given that volatility is not directly observable in the market, we rely on the transition matrices, given by Table 2, Table 4 and Table 6, for predictive purposes when it comes to forecasting volatility. The transition matrices for all three models indicate stable states given higher probabilities of staying within a state relative to moving to other states. This supports the volatility clustering stylized fact, as periods of high volatility are usually followed by periods of high volatility while periods of low volatility are followed by periods of low volatility.

The 3-state model assumes that state 1 represents low volatility, state 2 represents high volatility and state 3 is the bridge between low and high volatility. According to the empirical results, the probability of moving from state 1 to state 2 is zero, and the probability of moving from state 2 to state 1 is low. This indicates that the volatility process is unlikely to jump directly from low to high or from high to low levels. The same holds for the 4-state model, where the transition probabilities from state 1 (very low volatility) to state 2 (high volatility) and to state 3 (very high volatility) are 0. Transition probabilities from state 2 to state 1, from state 3 to state 1 and to state 4 (low volatility), and from state 4 to state 3, are also relatively low.
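The way a transition matrix yields a forecast can be sketched in Python as follows. The matrix `P` is hypothetical and chosen only to mimic the qualitative pattern described above (zero probability of moving from state 1 to state 2, low probability from state 2 to state 1); it does not reproduce the estimates in Table 4.

```python
import numpy as np

# Hypothetical 3-state transition matrix (rows sum to 1); NOT the values in Table 4.
# State 1 = low, state 2 = high, state 3 = bridge, following the labelling in the text.
P = np.array([
    [0.95, 0.00, 0.05],
    [0.01, 0.94, 0.05],
    [0.04, 0.04, 0.92],
])

def forecast_state_probs(P, current_state, horizon):
    """State distribution `horizon` steps ahead, starting in `current_state` (0-based)."""
    start = np.zeros(P.shape[0])
    start[current_state] = 1.0
    # The k-step transition probabilities are given by the k-th matrix power of P
    return start @ np.linalg.matrix_power(P, horizon)

one_day = forecast_state_probs(P, current_state=0, horizon=1)
two_day = forecast_state_probs(P, current_state=0, horizon=2)
```

Under these illustrative numbers, the high-volatility state cannot be reached from the low-volatility state in one step, and it acquires a small positive probability after two steps only through the bridge state.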

4. Discussion

Latent Markov models have been employed in modelling regime switches in economics. The HMM filter provides a clear separation of states where the input data is rolling volatility over a 30-day window. The 4-state economy is optimal given that it has a relatively low AIC and BIC value. Under the 4-state regime, the HMM filter classifies the economy into four volatility regimes: very high, high, low and very low volatility. Experimental results show that all three state models begin at state 1 which represents low volatility for the 2-state and 3-state models, and very low volatility for the 4-state model.
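The rolling-volatility input to the HMM filter can be sketched as follows. This is an illustration only: it assumes that rolling volatility is the sample standard deviation of daily log returns over a 30-day window, a common definition that is not restated in this section, and the price series is synthetic.

```python
import numpy as np

def rolling_volatility(prices, window=30):
    """Rolling sample standard deviation of daily log returns over `window` days."""
    prices = np.asarray(prices, dtype=float)
    log_returns = np.diff(np.log(prices))
    # Each row of `windows` holds `window` consecutive log returns
    windows = np.lib.stride_tricks.sliding_window_view(log_returns, window)
    return windows.std(axis=1, ddof=1)

# Synthetic index levels used purely as a stand-in for NSE 20 data
rng = np.random.default_rng(0)
prices = 3000.0 * np.exp(np.cumsum(rng.normal(0.0, 0.01, size=250)))
vol = rolling_volatility(prices, window=30)
```

With 250 prices there are 249 log returns and therefore 220 overlapping 30-day windows, each contributing one rolling-volatility observation.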

Given the 4-state optimal model, very low volatility characterizes the beginning of the modelling period, which could be attributed to economic growth that was moderate but significantly higher than in 2011. On average, low volatility can be seen in early 2016, 2018 and 2019, which are characterized by fairly stable exchange and inflation rates. In early 2016, a relatively stable currency and lower inflation rates were witnessed due to the reduction of the benchmark interest rate from 11.5% to 10.5%, resulting in lower credit costs. Political risk dominated the years 2012-2014 as a result of the International Criminal Court (ICC) hearings, resulting in some price volatility during the same period.

Periods of high volatility during the modelling period could be attributed to the general elections in March 2013 and August 2017, with the 2017 presidential result nullified in September 2017 and a repeat election held in the last quarter of that year. The Central Bank’s decision to raise its base rate in July 2015 resulted in volatile 91-day Treasury Bill (T-bill) prices in late 2015. In addition, in mid-2016, a Banking Amendment Act that capped interest rates charged by lending institutions caused some market volatility. The International Criminal Court’s decision in January 2014 to postpone the start of the prosecutor’s trial until February 2014 (ICC, 2014), as well as many other critical information releases on the same case, may have contributed to the very high volatility observed in early 2014. The election business cycle may have amplified this, resulting in uncertainty about monetary policy and fiscal spending transmission, or policy risk, which often leads to delayed recovery from political events (Born & Pfeifer, 2014). Charges of crimes against humanity against Kenya’s top political figures may have discouraged foreign investment in Kenyan companies. High inflation and a low exchange rate of the US dollar against the Kenyan shilling resulted in a depreciated currency in early 2014. The COVID-19 pandemic has had an impact on both the domestic and global fronts, resulting in a contraction of quarterly gross domestic product, contrary to the growth witnessed in the last decade.

The univariate stochastic volatility model takes the volatility of the variance process into account and is a general state-space model. The AR(1) process, in which a one-time shock affects values of the evolving variable infinitely far into the future, albeit with geometrically decaying weight when $| ϕ | < 1$, captures the persistent nature of volatility. This may be the case particularly where there is no market resolution following a macroeconomic or political event. The EM algorithm is employed in parameter estimation as it yields maximum likelihood parameter estimates without direct evaluation of the likelihood, which has no closed form for the stochastic volatility model, and it accommodates data that are not fully observable.
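The persistence implied by an AR(1) coefficient close to one can be made concrete with a short calculation. The sketch below uses the single-state SMC estimate $ϕ = 0.9966$ reported above and relies only on the standard AR(1) result that the effect of a one-time shock after $k$ periods is proportional to $ϕ^{k}$; the function names are illustrative.

```python
import math

def shock_half_life(phi):
    """Number of periods for an AR(1) shock to decay to half its initial size."""
    if not 0.0 < phi < 1.0:
        raise ValueError("half-life is defined here only for 0 < phi < 1")
    return math.log(0.5) / math.log(phi)

def impulse_response(phi, k):
    """Share of a one-time shock still present after k periods: phi ** k."""
    return phi ** k

half_life = shock_half_life(0.9966)        # about 203.5 periods
remaining = impulse_response(0.9966, 250)  # share left after 250 periods, about 0.43
```

With daily data, a shock to log-variance would therefore take roughly 200 trading days to lose half of its effect, which is consistent with the persistent behaviour of volatility described above.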

MCMC and SMC rely on Bayesian estimation and draw parameter values that are approximately distributed according to the posterior. In Bayesian inference, an assumption about the prior distributions of the parameter set is made and parameter estimates are then simulated using the assumed priors. The posterior distribution, $\Pr(θ | x_{t})$, is obtained by updating the prior, $\Pr(θ)$, with the likelihood of the observed information. Filtering techniques are employed to extract latent state variables from noisy data to be used in parameter estimation. MCMC utilizes batched, offline samplers while SMC employs sequential, online techniques for parameter estimation.
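To illustrate how a sequential filter extracts the latent log-variance from noisy returns, the following is a minimal bootstrap particle filter in Python. It is a sketch under stated assumptions rather than the procedure used in this study: it assumes the standard parameterization $y_{t} = \exp(h_{t}/2) ε_{t}$, $h_{t} = μ + ϕ (h_{t-1} - μ) + σ η_{t}$ with standard normal $ε_{t}$ and $η_{t}$, it holds the parameters fixed instead of estimating them, and all numerical values are illustrative.

```python
import numpy as np

def bootstrap_filter(y, mu, phi, sigma, n_particles=1000, seed=0):
    """Bootstrap particle filter for the assumed SV model; returns filtered means of h_t."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    # Initialise particles from the stationary distribution of the AR(1) state
    h = rng.normal(mu, sigma / np.sqrt(1.0 - phi**2), size=n_particles)
    filtered_mean = np.empty(len(y))
    for t, y_t in enumerate(y):
        # Propagate: move each particle through the state equation
        h = mu + phi * (h - mu) + sigma * rng.normal(size=n_particles)
        # Weight: Gaussian log-likelihood of y_t with variance exp(h), constants dropped
        log_w = -0.5 * (h + y_t**2 * np.exp(-h))
        w = np.exp(log_w - log_w.max())
        w /= w.sum()
        filtered_mean[t] = np.sum(w * h)
        # Resample: multinomial resampling to limit weight degeneracy
        h = h[rng.choice(n_particles, size=n_particles, p=w)]
    return filtered_mean

# Simulate synthetic data from the same model, then filter it
rng = np.random.default_rng(42)
mu, phi, sigma = -9.0, 0.97, 0.2   # illustrative values, not estimates from this paper
h_true = np.empty(300)
h_prev = mu
for t in range(300):
    h_prev = mu + phi * (h_prev - mu) + sigma * rng.normal()
    h_true[t] = h_prev
y = np.exp(h_true / 2.0) * rng.normal(size=300)
h_hat = bootstrap_filter(y, mu, phi, sigma, n_particles=500)
```

Each observation is processed once, in order, which is what makes the approach sequential and online; a batch MCMC sampler would instead revisit the whole sample at every iteration.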

Parameter estimates for the single state models under MCMC (Table 8) and SMC (Table 12) differ but have a low deviation from each other. For both filters, the $σ$ values exhibit the most deviation in the various state models. Under SMC, $σ$ values have the most deviation under the 4-state model returning the highest values for states 1, 2 and 4. The $μ$ and $ϕ$ values remain relatively within the same range, depending on whether the particular state models low, moderate or high volatility. For MCMC, parameter estimates for the level of log-variance are close to each other in the 2-state model but their range widens in the 3-state and 4-state models. Resultant posterior density plots under MCMC and SMC depict a relatively even distribution for the $μ$ parameter, a negative (left) skew for the $ϕ$ parameter and positive (right) skew for the $σ$ parameter.

A Markov chain is a stochastic model that describes a sequence of possible events in which the probability of each event depends only on the state attained in the previous event. A first-order Markov chain is characterized by a transition probability matrix, which specifies the likelihood of transitioning from one state to another in successive time periods. As volatility is not directly observable in the market, we rely on transition probabilities to forecast volatility. The transition matrices for all three models imply stable states, which supports the stylized fact of volatility clustering.

5. Conclusions and Further Work

Hidden Markov Models (HMMs) are applied in state inference based on observations. Secondary price data for the Nairobi Securities Exchange 20 (NSE 20) share index is used to estimate rolling volatility, with the goal of estimating volatility regimes by employing an HMM filter. The univariate stochastic volatility model established by Taylor (1982), and acknowledged by Andersen et al. (2009), provides a straightforward yet versatile framework for modeling time-varying volatility, as evidenced by empirical research.

The use of the Schwarz Bayesian Information Criterion (BIC) is motivated by a well-known drawback of the log-likelihood in state estimation: the log-likelihood increases as the number of states increases. The BIC allows us to select an optimal model, that is, the model with the lowest BIC value. Estimation results under determination of regimes indicate that the 4-state model, which divides the economy into periods of very high, high, low and very low volatility, is optimal.
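The selection rule can be sketched in Python using the standard definition $\mathrm{BIC} = k \ln n - 2 \ln L$, where $k$ is the number of free parameters, $n$ the number of observations and $L$ the maximized likelihood. The parameter count below assumes an HMM with univariate Gaussian emissions, and the log-likelihood values and sample size are hypothetical; they are not the results of this study.

```python
import math

def n_params_gaussian_hmm(k):
    """Free parameters of a k-state HMM with univariate Gaussian emissions (assumed form).

    k * (k - 1) transition probabilities, (k - 1) initial probabilities,
    and one mean and one variance per state.
    """
    return k * (k - 1) + (k - 1) + 2 * k

def bic(log_likelihood, n_params, n_obs):
    """Schwarz Bayesian Information Criterion; lower values are preferred."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood

# Hypothetical log-likelihoods for illustration only; NOT the paper's results.
log_liks = {2: 9000.0, 3: 9600.0, 4: 9900.0}
n_obs = 2000
scores = {k: bic(ll, n_params_gaussian_hmm(k), n_obs) for k, ll in log_liks.items()}
best_k = min(scores, key=scores.get)
```

The penalty term $k \ln n$ grows with the number of states, so a richer model is preferred only when its gain in log-likelihood outweighs the additional parameters.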

We employ Markov Chain Monte Carlo (MCMC) and Sequential Monte Carlo (SMC) filtering techniques, which generate approximations to filtering distributions and are commonly used in non-linear and/or non-Gaussian state-space models. Given the different parameter estimates for the state processes under each modelling regime and under each filtering technique, a more versatile framework for modeling the volatility process is implemented. As a result, when analyzing volatility dynamics in the pricing of various volatility-backed financial assets, regime switching should be factored in.

A useful extension to this research, particularly in frontier markets, can employ a multivariate stochastic volatility model. A comparative analysis of the said models with the univariate SV model can offer insights on the ability of the models to capture stylized facts of the volatility process under regime changes. Given the nature of financial time series data, it would be important to consider non-linear, non-Gaussian filtering techniques in state-space estimation. Finally, while this study focused on daily data, tick-by-tick data could be used in modeling market returns and subsequent volatility for better risk management and control, particularly in highly liquid markets. Tick-by-tick data exhibits better convergence properties particularly in maximum likelihood estimation of parameters.

NOTES

¹Source: https://www.investing.com/.

²This is achieved by fitting an HMM filter to rolling volatility data using the R library depmixS4.

³ $μ$ and $α$ are used interchangeably and represent the level of log-variance while both Q and $σ$ represent the volatility of the log variance process.

Conflicts of Interest

The authors declare no conflicts of interest regarding the publication of this paper.

References

[1]	Ahlip, R. (2008). Foreign Exchange Options under Stochastic Volatility and Stochastic Interest Rates. International Journal of Theoretical and Applied Finance, 11, 277-294. https://doi.org/10.1142/S0219024908004804
[2]	Andersen, T. G., Davis, R. A., Kreiß, J.-P., & Mikosch, T. V. (Eds.) (2009). Handbook of Financial Time Series. Springer Science & Business Media.
[3]	Bickel, P. J., Ritov, Y., & Ryden, T. (1998). Asymptotic Normality of the Maximum-Likelihood Estimator for General Hidden Markov Models. The Annals of Statistics, 26, 1614-1635. https://doi.org/10.1214/aos/1024691255
[4]	Born, B., & Pfeifer, J. (2014). Policy Risk and the Business Cycle. Journal of Monetary Economics, 68, 68-85. https://doi.org/10.1016/j.jmoneco.2014.07.012
[5]	Chan, J. C. (2013). Moving Average Stochastic Volatility Models with Application to Inflation Forecast. Journal of Econometrics, 176, 162-172. https://doi.org/10.1016/j.jeconom.2013.05.003
[6]	Chelimo, J. K. (2017). Calibration of Vasicek Model in a Hidden Markov Context: The Case of Kenya. Unpublished Doctoral Dissertation, Strathmore University.
[7]	Clark, T. E., & Ravazzolo, F. (2015). Macroeconomic Forecasting Performance under Alternative Specifications of Time-Varying Volatility. Journal of Applied Econometrics, 30, 551-575. https://doi.org/10.1002/jae.2379
[8]	Bundi, N. D., Patrick, W. et al. (2016). Credit Scoring for M-Shwari Using Hidden Markov Model. European Scientific Journal, 12, 176-188. https://doi.org/10.19044/esj.2016.v12n15p176
[9]	Frühwirth-Schnatter, S., & Wagner, H. (2010). Stochastic Model Specification Search for Gaussian and Partial Non-Gaussian State Space Models. Journal of Econometrics, 154, 85-100. https://doi.org/10.1016/j.jeconom.2009.07.003
[10]	ICC (International Criminal Court) (2014). Situation in the Republic of Kenya in the Case of the Prosecutor v. Uhuru Muigai Kenyatta. https://www.icc-cpi.int/CourtRecords/CR2014_02248.PDF
[11]	Kastner, G. (2016). Efficient Bayesian Inference for Stochastic Volatility.
[12]	Krishnamurthy, V., Leoff, E., & Sass, J. (2018). Filterbased Stochastic Volatility in Continuous-Time Hidden Markov Models. Econometrics and Statistics, 6, 1-21. https://doi.org/10.1016/j.ecosta.2016.10.007
[13]	Kim, J. (2005). Parameter Estimation in Stochastic Volatility Models with Missing Data Using Particle Methods and the EM Algorithm. Doctoral Dissertation, University of Pittsburgh. http://d-scholarship.pitt.edu/8265/
[14]	Maqsood, A., Safdar, S., Shafi, R., & Lelit, N. J. (2017). Modeling Stock Market Volatility Using Garch Models: A Case Study of Nairobi Securities Exchange (NSE). Open Journal of Statistics, 7, 369-381. https://doi.org/10.4236/ojs.2017.72026
[15]	Mbete, D., Nyongesa, K., & Rotich, J. (2019). Estimation of Malaria Symptom Data Set Using Hidden Markov Model. Asian Journal of Probability and Statistics, 4, 1-29. https://doi.org/10.9734/ajpas/2019/v4i230110
[16]	Nyongesa, A. M., Zeng, G., & Ongoma, V. (2020). Non-Homogeneous Hidden Markov Model for Downscaling of Short Rains Occurrence in Kenya. Theoretical and Applied Climatology, 139, 1333-1347. https://doi.org/10.1007/s00704-019-03016-2
[17]	Taylor, S. J. (1982). Financial Returns Modelled by the Product of Two Stochastic Processes— A Study of the Daily Sugar Prices 1961-75. In O. D. Anderson (Ed.), Time Series Analysis: Theory and Practice 1 (pp. 203-226). North-Holland.
