Can Choice of Reference Density Improve Power of M-Estimation Based Unit Root Tests?

Tapan Kar^{1}, Malay Bhattacharyya^{2}

^{1}Indian Institute of Management Nagpur, Maharashtra, India.

^{2}Indian Institute of Management Kashipur, Uttarakhand, India.

**DOI:** 10.4236/jmf.2022.122019

In this paper, we investigate if the choice of the reference density could improve the power of M-estimation-based unit root tests. For this investigation, we consider models where the AR-coefficient is very close to one (local-to-unity) in the true data generating process. Motivated by the stylized facts that empirical return distributions have large skewness and high leptokurtosis, we explore if Johnson SU and Pearson Type IV distributions can be used as the reference densities to improve the power of the M-estimation based unit root tests. Through extensive simulations, we find that the proposed procedure, in finite samples, is as powerful as the Dickey-Fuller test for normal errors and is significantly more powerful than several existing tests for non-normal errors. We apply the proposed test to the Nelson and Plosser data set and to the nominal monthly interest rate of India.

Share and Cite:

Kar, T. and Bhattacharyya, M. (2022) Can Choice of Reference Density Improve Power of M-Estimation Based Unit Root Tests?. *Journal of Mathematical Finance*, **12**, 340-355. doi: 10.4236/jmf.2022.122019.

1. Introduction

Following the seminal work of Dickey and Fuller [1], there has been immense interest in statistical tests of the unit root hypothesis in time series data. Stock [2] and Phillips and Xiao [3] have surveyed many such tests. Elliot, Rothenberg, and Stock [4] proposed a family of tests and also a modified Dickey-Fuller test for Gaussian time series with an unknown mean or a trend. Most of these tests are based on the OLS method assuming Gaussian errors. Monte Carlo studies indicate that these tests do not have high statistical power when the errors show heavy-tailed behaviour.

Many scholars have explored tests with higher power for non-Gaussian errors based on M-estimation. Cox and Llatas [5], Lucas [6], Rothenberg and Stock [7], Knight [8], Herce [9], Xiao [10], Thompson [11] [12], and others have made significant contributions in the M-estimation domain.

They all assumed a known density function on the error process for the model to estimate the model parameters using the M-estimation procedure. Henceforth we shall call this assumed density function on the error process the “reference density”.

For example, Lucas [6] investigated the Student *t* distribution. Herce [9] used the absolute value function (double exponential density). Hansen [13] proposed a unit root test using a covariate approach. However, identifying the covariates may be difficult in practice. Hasan and Koenker [14] proposed rank-based tests. Thompson [11] [12] showed that for each rank-based test, there exists an M-estimation test with the same asymptotic power. Shin and So [15] used a nonparametric method to estimate the unknown error process. The method developed by them is difficult to implement. Moreover, for Cauchy errors, their test has very poor power (Thompson [11]). Koenker and Xiao [16] applied quantile regression to analyse the unit root process. In his monograph, Choi [17] explained (page 97) that the LAD estimation-based test of Herce [9] has lower power than the quantile regression test. Using the approach proposed by Potscher and Prucha [18], Lima and Xiao [19] used a partially adaptive estimation method (PADF) to estimate the unknown error density. Hallin, Van Den Akker, and Werker [20] proposed a class of tests using the ranks of the samples. The rank-based test, however, requires an independently and identically distributed (IID) error process, which may not be feasible in many practical applications.

Johnson SU (JHSU) and Pearson Type IV (PIV) distributions have been used to model financial time series data and in the risk management literature. Nagahara [21] used PIV density to model the stock return distribution. Bhattacharyya and Madhav [22] used the Johnson SU distribution and other methods to estimate the VaR for leptokurtic equity index returns. Bhattacharyya, Chaudhary, and Yadav [23] used PIV distribution to obtain the conditional VaR. Bhattacharyya, Misra, and Kodase [24] used PIV distribution to obtain the conditional MaxVaR.

Johnson SU distribution covers a wide range of shapes depending on its parameter values. It may also be a good approximation of Pearson type IV distribution [25]. The main advantage of this distribution is its ability to capture high kurtosis and skewness which is commonly observed in financial and economic time series data.

The main objective of this paper is to explore if the use of the Johnson SU distribution (see Section 2) as a reference density (hereafter we call this test the JHSU test) could improve power. This is, perhaps, the first use of the Johnson SU distribution in the M-estimation literature. Using some results in Lucas [6], Xiao [10], and Thompson [11] [12], we obtain our test statistic and its asymptotic properties.

In our simulation studies, we generate data assuming various time series models such as AR (1), MA (1), etc., with error processes drawn from distributions such as the standard normal and the chi-square. Please note that, in practice, the data generating process, and therefore the error process, will be unknown. We use the M-estimation method to estimate the parameters of a selected model from the data thus generated. In the M-estimation method, unlike OLS estimation, we need to assume a probability density for the error process (the reference density). We explore Johnson SU and Pearson Type IV as the reference densities. These choices are explored specifically to investigate if there is any improvement in the local power. We compare our choices with three other well-known tests: ADF, PADF, and the test that uses the Student *t* distribution with 3 degrees of freedom as the reference density.

From the Monte Carlo simulations, we observe that the JHSU test, in finite samples, is as efficient as the augmented Dickey-Fuller (ADF) test for normal errors, and more powerful than many existing traditional tests for non-normal errors. The JHSU test is surprisingly easy to implement.

In Section 2, we sketch a brief outline of the Johnson SU distribution. Section 3 presents the time series model that we study in this paper and obtains the test statistic and its asymptotic distribution. The method of estimation of parameters and the calculation of the test statistic and the critical value are explained in Section 4. Section 5 presents Monte Carlo simulation results. Section 6 describes the empirical studies, and we conclude in Section 7.

2. Johnson SU Distribution

Johnson [25] proposed three transformations, *f*, of the following form:

$Z=\gamma +\delta f\left(\frac{X-\xi}{\lambda}\right),$

where *Z* is a standard normal variable and *X* is a continuous random variable whose distribution is unknown, with shape parameters $\gamma $ and $\delta $ ($\delta >0$), scale parameter $\lambda $ ($\lambda >0$) and location parameter $\xi $.

$f\left(y\right)$ may be chosen as $\mathrm{log}\left(y\right)$, ${\text{sinh}}^{-1}\left(y\right)$ or $\mathrm{log}\left(\frac{y}{1-y}\right)$.

For the Johnson SU distribution, *f* is chosen as ${\mathrm{sinh}}^{-1}$ so that

$Z=\gamma +\delta {\mathrm{sinh}}^{-1}\left(\frac{X-\xi}{\lambda}\right),\text{\hspace{0.17em}}\text{\hspace{0.17em}}-\infty <X<\infty $

$X=\xi +\lambda {f}^{-1}\left(\frac{Z-\gamma}{\delta}\right),\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{where}\text{\hspace{0.17em}}{f}^{-1}\left(x\right)=\left(\frac{{\text{e}}^{x}-{\text{e}}^{-x}}{2}\right).$

The parameters $\gamma $ and $\delta $ control skewness and kurtosis. The distribution is positively (negatively) skewed if $\gamma $ is negative (positive). Increasing $\delta $, holding $\gamma $ constant, reduces the kurtosis. Johnson SU distribution can capture a wide range of shapes depending on its parameter values.

The probability density function of Johnson SU distribution is given by

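$g\left(x\right)=\frac{\delta}{\lambda \sqrt{2\pi}}R\left(\frac{x-\xi}{\lambda}\right)\mathrm{exp}\left\{-\frac{1}{2}{\left[\gamma +\delta V\left(\frac{x-\xi}{\lambda}\right)\right]}^{2}\right\},\quad -\infty <x<\infty $ (1)

where $R\left(y\right)={\left(1+{y}^{2}\right)}^{-1/2}$ and $V\left(y\right)={\mathrm{sinh}}^{-1}\left(y\right)$.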

The mean is $\mu =\xi -\lambda {\theta}^{1/2}\mathrm{sinh}\left(\Phi \right)$, where $\theta ={\text{e}}^{{\delta}^{-2}}$ and $\Phi =\frac{\gamma}{\delta}$. When the mean is zero, $\xi =\lambda {\theta}^{1/2}\mathrm{sinh}\left(\Phi \right)$.
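The definitions in this section can be checked numerically. The following is a minimal Python sketch, not the authors' code: the function names are hypothetical, and the density is written with $R(y)=(1+y^2)^{-1/2}$ and $V(y)=\sinh^{-1}(y)$, which is the standard Johnson SU form implied by the transformation above.

```python
import math
import random


def jsu_pdf(x, gamma, delta, xi, lam):
    """Johnson SU density with shapes (gamma, delta), location xi, scale lam."""
    y = (x - xi) / lam
    z = gamma + delta * math.asinh(y)  # the normalising transformation Z
    scale = delta / (lam * math.sqrt(2.0 * math.pi) * math.sqrt(1.0 + y * y))
    return scale * math.exp(-0.5 * z * z)


def jsu_mean(gamma, delta, xi, lam):
    """Mean xi - lam * theta^(1/2) * sinh(gamma / delta), theta = exp(delta^-2)."""
    theta = math.exp(delta ** -2)
    return xi - lam * math.sqrt(theta) * math.sinh(gamma / delta)


def zero_mean_xi(gamma, delta, lam):
    """Location parameter that makes the mean equal to zero."""
    theta = math.exp(delta ** -2)
    return lam * math.sqrt(theta) * math.sinh(gamma / delta)


def jsu_sample(rng, gamma, delta, xi, lam):
    """Draw X = xi + lam * sinh((Z - gamma) / delta) with Z standard normal."""
    z = rng.gauss(0.0, 1.0)
    return xi + lam * math.sinh((z - gamma) / delta)
```

Sampling goes through the inverse transformation rather than through the density, so it needs nothing beyond a standard normal generator.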

3. The Model and Asymptotic Distribution of the Test Statistic

We assume the following data generating process (DGP) for our analysis.

${y}_{t}={a}_{0}+{a}_{1}t+{u}_{t}$,

${u}_{t}-{u}_{t-1}=\pi {u}_{t-1}+{v}_{t},\text{\hspace{0.17em}}\text{\hspace{0.17em}}F\left(L\right){v}_{t}={\epsilon}_{t}$

The errors ${\epsilon}_{t}$ are independently and identically distributed with expectation zero and finite variance ${\sigma}_{\epsilon}^{2}$ with ${u}_{0}=0$.

The term $F\left(L\right)$ is the lag polynomial $\left[1-{F}_{1}L-{F}_{2}{L}^{2}-\cdots -{F}_{p}{L}^{p}\right]$. All the roots of $F\left(L\right)=0$ lie outside the unit circle. First, we rewrite this model in the augmented Dickey-Fuller format, which is defined below

$\Delta {y}_{t}={y}_{t-1}\beta +{\mu}_{0}+{\mu}_{1}t+{\displaystyle {\sum}_{i=1}^{p}{\mu}_{i+1}\Delta {y}_{t-i}}+{\epsilon}_{t}$

where $\Delta {y}_{t}={y}_{t}-{y}_{t-1}$, $\beta =\pi F\left(1\right)$. For the DGP, the null and alternative hypotheses are

${H}_{0}:\pi =0$ (*i.e.*, $\beta =0$) against ${H}_{1}:\pi <0$.

Let $\stackrel{^}{\beta}$ be the M-estimator of $\beta $ using Johnson SU as a reference density. Then $\stackrel{^}{\beta}$ will minimize the following objective function

${\sum}_{t=p+2}^{N}h\left(\Delta {y}_{t}-{y}_{t-1}\beta -{\mu}_{0}-{\mu}_{1}t-{\sum}_{i=1}^{p}{\mu}_{i+1}\Delta {y}_{t-i}\right)$,

where $h=-\mathrm{log}\left(g\right)$ (*g* = Johnson SU density function).

For Johnson SU density, $h\left(x\right)$ is given below

$h\left(x\right)=\left(\frac{1}{2}\right)\mathrm{log}\left(G\left(x\right)\right)+\left(\frac{1}{2}\right){\left[\gamma +\delta \mathrm{log}\left[\frac{K\left(x\right)}{\lambda}\right]\right]}^{2}-\mathrm{log}\left(\delta \right)+\text{constant}$ (2)

where $K\left(x\right)$ and $G\left(x\right)$ are given by

$K\left(x\right)=\left[H\left(x\right)+\sqrt{G\left(x\right)}\right]$ (3)

$G\left(x\right)={\lambda}^{2}+{\left(x-\xi \right)}^{2}$ (4)

and

$H\left(x\right)=\left(x-\xi \right)$ (5)

Assumption 1. The function $h\left(x\right)$ is continuously differentiable and its second and higher order derivatives are bounded.

We denote the first and the second derivatives of $h\left(x\right)$ by $\psi \left(x\right)$ and $\phi \left(x\right)$ respectively. Further, we define the following

$\omega =E\left[\phi \left({\epsilon}_{t}\right)\right],\text{\hspace{0.17em}}\rho =\text{Corr}\left({\epsilon}_{t},\psi \left({\epsilon}_{t}\right)\right),\text{\hspace{0.17em}}{\sigma}_{\psi}^{2}=\text{Var}\left[\psi \left({\epsilon}_{t}\right)\right]\text{\hspace{0.17em}}\text{\hspace{0.05em}}\text{and}\text{\hspace{0.05em}}\text{\hspace{0.17em}}\vartheta =\frac{{\sigma}_{\epsilon}\omega}{{\sigma}_{\psi}}$

Following Thompson (2004) an approximate estimate of $\stackrel{^}{\beta}$ is

$\stackrel{^}{\beta}\approx \frac{\left[{X}^{t}P\right]\psi \left(\Delta Y-{Z}_{1}\tilde{M}\right)}{\omega \text{\hspace{0.05em}}{\stackrel{^}{\sigma}}_{\psi}{\left[{X}^{t}PX\right]}^{1/2}}$ (6)

where $X={\left({y}_{p+1},{y}_{p+2},\cdots ,{y}_{N-1}\right)}^{t}$.

${Z}_{1}$ is the matrix whose rows are $\left(1,t,\Delta {y}_{t-1},\Delta {y}_{t-2},\cdots ,\Delta {y}_{t-p}\right)$ for $t=p+2,p+3,\cdots ,N$. When the time trend is not present (*i.e.*, ${a}_{1}=0$), the rows of ${Z}_{1}$ are $\left(1,\Delta {y}_{t-1},\Delta {y}_{t-2},\cdots ,\Delta {y}_{t-p}\right)$.

$\tilde{M}={\left({\tilde{\mu}}_{0},{\tilde{\mu}}_{1},{\tilde{\mu}}_{2},\cdots ,{\tilde{\mu}}_{p+1}\right)}^{t}$ is a $\left(p+2\right)$ dimensional parameter vector that minimizes the objective function ${\sum}_{t=p+2}^{N}h\left(\Delta {y}_{t}-{\mu}_{0}-{\mu}_{1}t-{\sum}_{i=1}^{p}{\mu}_{i+1}\Delta {y}_{t-i}\right)$. ${\stackrel{^}{\sigma}}_{\psi}$ is an estimate of ${\sigma}_{\psi}$.

$\psi \left(x\right)$ is an $\left(N-p-1\right)$ dimensional column vector with components $\psi \left({x}_{i}\right)$. *P* is the projection matrix defined by $\left[I-{Z}_{1}{\left({Z}_{1}^{t}{Z}_{1}\right)}^{-1}{Z}_{1}^{t}\right]$, where *I* is the identity matrix. For Johnson SU as a reference density, $\psi \left(x\right)$ is defined below

$\psi \left(x\right)=\frac{H\left(x\right)}{G\left(x\right)}+\left[\gamma +\delta \mathrm{log}\left(\frac{K\left(x\right)}{\lambda}\right)\right]\left(\frac{\delta}{\sqrt{G\left(x\right)}}\right)$ (7)

$K\left(x\right)$, $G\left(x\right)$ and $H\left(x\right)$ are given in Equations (3)-(5), respectively.
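Equations (2)-(5) and (7) translate directly into code. The sketch below is illustrative Python with hypothetical function names; it drops the additive constant in Equation (2), which does not affect the minimizer. Comparing `psi` with a finite-difference derivative of `h` is a quick way to confirm that Equation (7) is the derivative of Equation (2).

```python
import math


def G(x, xi, lam):
    """Equation (4): lam^2 + (x - xi)^2."""
    return lam ** 2 + (x - xi) ** 2


def H(x, xi):
    """Equation (5): x - xi."""
    return x - xi


def K(x, xi, lam):
    """Equation (3): H(x) + sqrt(G(x)); always positive."""
    return H(x, xi) + math.sqrt(G(x, xi, lam))


def h(x, gamma, delta, xi, lam):
    """Equation (2), i.e. -log of the Johnson SU density, without the constant."""
    g = G(x, xi, lam)
    z = gamma + delta * math.log(K(x, xi, lam) / lam)
    return 0.5 * math.log(g) + 0.5 * z * z - math.log(delta)


def psi(x, gamma, delta, xi, lam):
    """Equation (7): first derivative of h with respect to x."""
    g = G(x, xi, lam)
    z = gamma + delta * math.log(K(x, xi, lam) / lam)
    return H(x, xi) / g + z * delta / math.sqrt(g)
```

Note that $\log\left(K(x)/\lambda\right)$ equals $\sinh^{-1}\left((x-\xi)/\lambda\right)$, so for very negative arguments `math.asinh` would be a numerically safer way to evaluate the same quantity.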

Following Thompson [26], we can remove $\omega $ from Equation (6), since it does not affect the asymptotic power but, in small samples, can affect the size of the test because of poor estimation of $\omega $. After removing $\omega $, the test statistic in the *t*-ratio format is $\stackrel{^}{\beta}=\frac{\left[{X}^{t}P\right]\psi \left(\Delta Y-{Z}_{1}\tilde{M}\right)}{{\stackrel{^}{\sigma}}_{\psi}{\left[{X}^{t}PX\right]}^{1/2}}$. We reject the null for small values of $\stackrel{^}{\beta}$. Our next task is to find the asymptotic distribution of $\stackrel{^}{\beta}$.

In this paper, we restrict our alternative hypothesis to the AR coefficient $1+\pi $ being very close to 1, *i.e.*, $\pi $ being very close to zero. We set $\pi =C/N$, where *C* is a constant and *N* is the sample size, when making a limiting argument. So, the parameter space is a shrinking neighbourhood of zero (see Chan and Wei [27] and Phillips [28]). In the presence of a unit root, *C* is equal to zero.
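As a purely illustrative calculation (the values of *C* and *N* below are assumptions chosen for the example, not values taken from the simulation design), the DGP can be written as ${u}_{t}=\left(1+\pi \right){u}_{t-1}+{v}_{t}$, so that

$C=-10,\quad N=100\quad \Rightarrow \quad \pi =\frac{C}{N}=-0.1,\quad {u}_{t}=0.9\,{u}_{t-1}+{v}_{t}.$

With the same *C* and $N=1000$, the autoregressive coefficient becomes 0.99, which shows how the alternative moves towards the unit root as the sample size grows.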

Assumption 2. $E\left[\psi \left({\epsilon}_{t}\right)\right]=0$.

Asymptotics of the Test Statistic

The asymptotic distribution of the test statistic is represented in terms of functionals of Brownian motion. Let ${W}_{0}$ be a standard Brownian motion defined on $\left[0,1\right]$ and ${W}_{C}\left(\cdot \right)$ be a related diffusion process

${W}_{C}\left(t\right)={\displaystyle {\int}_{0}^{t}\mathrm{exp}\left(C\left(t-s\right)\right)\text{d}{W}_{0}\left(s\right)}$

that satisfies the stochastic differential equation

$\text{d}{W}_{C}\left(t\right)=C{W}_{C}\left(t\right)\text{d}t+\text{d}{W}_{0}\left(t\right)$.

Let ${D}_{C}$ be another process defined, for the intercept only model, by

${D}_{C}\left(r\right)={W}_{C}\left(r\right)-{\displaystyle {\int}_{0}^{1}{W}_{C}\left(s\right)\text{d}s}$

and for the model with time trend by

${D}_{C}\left(r\right)={W}_{C}\left(r\right)-2{\displaystyle {\int}_{0}^{1}\left(2-3s-r\left(3-6s\right)\right){W}_{C}\left(s\right)\text{d}s}$
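The diffusion ${W}_{C}$ and its demeaned version can be approximated on a grid. The Python sketch below is an illustration under stated assumptions, not part of the paper: the function names are hypothetical, the path is generated with the exact one-step transition of the process (an AR(1) recursion with Gaussian shocks), and the integral over $\left[0,1\right]$ in the intercept-only ${D}_{C}$ is replaced by a simple grid average.

```python
import math
import random


def ou_path(rng, c, n):
    """Return W_c on the grid k/n, k = 0..n, starting from W_c(0) = 0."""
    dt = 1.0 / n
    decay = math.exp(c * dt)
    if c == 0.0:
        step_sd = math.sqrt(dt)  # plain Brownian increments
    else:
        # Variance of the stochastic integral over one step of length dt.
        step_sd = math.sqrt((math.exp(2.0 * c * dt) - 1.0) / (2.0 * c))
    path = [0.0]
    for _ in range(n):
        path.append(decay * path[-1] + step_sd * rng.gauss(0.0, 1.0))
    return path


def demean(path):
    """Intercept-only D_c: subtract a grid-average approximation of the integral."""
    avg = sum(path) / len(path)
    return [w - avg for w in path]
```

For example, `demean(ou_path(random.Random(0), -5.0, 200))` gives one approximate draw of the intercept-only process for $C=-5$; the trend case would replace the grid average by the weighted integral given above.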

Thompson [12] proved that, under Assumptions 1 and 2, $\stackrel{^}{\beta}$ converges weakly to the random variable ${F}_{C}$, which is defined below

${F}_{C}\equiv \rho \frac{{T}_{C}}{\sqrt{{R}_{C}}}+\sqrt{1-{\rho}^{2}}\left(\frac{{\displaystyle {\int}_{0}^{1}{D}_{C}\left(r\right)\text{d}{W}_{1}\left(r\right)}}{\sqrt{{R}_{C}}}\right)+\vartheta C\sqrt{{R}_{C}}$

From the above, it is clear that, while $\rho $ controls the null distribution (under the null, $C=0$), the power is determined by both $\rho $ and $\vartheta $: the power of the test is $\text{Prob}\left[{F}_{C}\le {q}_{t}\left(\rho \right)\right]$, where ${q}_{t}\left(\rho \right)$ is obtained from $\text{Pr}\left[{F}_{0}\le {q}_{t}\left(\rho \right)\right]=\alpha $ and $\alpha $ is the size of the test. Thompson [12] argues that $\vartheta $ dominates the power function relative to $\rho $.

As the value of $\vartheta $ increases, the asymptotic distribution shifts to the left, because *C* is negative when the alternative hypothesis is true. Since the alternative hypothesis is one-sided (left), the rejection region is in the left tail of the distribution of the test statistic. Therefore, this left shift of the asymptotic distribution is the source of the power improvement.

Table 1 lists the functional forms of $h\left(x\right)$ and $\psi \left(x\right)$ for different reference densities.

4. Calculation of the Test Statistic and Critical Value

To compute the test statistic, we adopt the following steps.

1) Select lag length $={k}_{\text{maic}}$, using MAIC criterion developed by Ng and Perron [29] setting the maximum lag at $\left[12{\left(\frac{N}{100}\right)}^{1/4}\right]$.

Table 1. Expressions of $h\left(x\right)$ and $\psi \left(x\right)$ for different reference density functions.

2) Run the following regression:

$\Delta {y}_{t}^{\mathrm{det}}={\pi}^{1}{y}_{t-1}^{\mathrm{det}}+{\displaystyle {\sum}_{p=1}^{{k}_{\text{maic}}}\Delta {y}_{t-p}^{\mathrm{det}}}+{\epsilon}_{t}$

${y}_{t}^{\mathrm{det}}$ is the de-trended series according to Elliot, Rothenberg and Stock [4].

3) Estimate the parameters of the Johnson SU density from the estimated residuals of the above regression equation by the maximum likelihood method, subject to the condition $\xi =\lambda {\theta}^{\frac{1}{2}}\mathrm{sinh}\left(\Phi \right)$ (since the mean of the true error process is zero), where $\Phi $ and $\theta $ are as in Section 2.

4) Find $\stackrel{^}{M}$ by minimizing the sum of $h$ over the components of $\Delta Y-{Z}_{1}M$, and define $\stackrel{^}{\epsilon}=\Delta Y-{Z}_{1}\stackrel{^}{M}$, where $\stackrel{^}{\epsilon}$ is an ($N-{k}_{\text{maic}}-1$) dimensional column vector. The regressor ${y}_{t-1}$ is excluded because the errors are estimated under the null hypothesis (see Hasan and Koenker [14]).

5) The estimates of the nuisance parameters $\rho $ and ${\sigma}_{\psi}$ are calculated as:

${\stackrel{^}{\sigma}}_{\epsilon}^{2}=\frac{{\left(\stackrel{^}{\epsilon}-\bar{\stackrel{^}{\epsilon}}\right)}^{t}\left(\stackrel{^}{\epsilon}-\bar{\stackrel{^}{\epsilon}}\right)}{N},\quad {\stackrel{^}{\sigma}}_{\psi}^{2}=\frac{{\left(\psi \left(\stackrel{^}{\epsilon}\right)-\bar{\psi}\right)}^{t}\left(\psi \left(\stackrel{^}{\epsilon}\right)-\bar{\psi}\right)}{N}$,

$\stackrel{^}{\rho}={\left(\stackrel{^}{\epsilon}-\bar{\stackrel{^}{\epsilon}}\right)}^{t}\left(\psi \left(\stackrel{^}{\epsilon}\right)-\bar{\psi}\right)/\left(N{\stackrel{^}{\sigma}}_{\epsilon}{\stackrel{^}{\sigma}}_{\psi}\right)$

where $\bar{\stackrel{^}{\epsilon}}$ and $\bar{\psi}$ are the sample means of $\stackrel{^}{\epsilon}$ and $\psi \left(\stackrel{^}{\epsilon}\right)$ respectively.

6) Calculate the test statistic $\stackrel{^}{\beta}=\frac{\left[{X}^{t}P\right]\psi \left(\Delta Y-{Z}_{1}\stackrel{^}{M}\right)}{{\left[{X}^{t}PX\right]}^{1/2}}{\stackrel{^}{\sigma}}_{\psi}^{-1}$.

7) Compute the approximate $\alpha \%$ critical value for model *M* by the polynomial in $\left(1-\rho \right)$ given below.

${Q}_{\alpha ,M}\left(\rho \right)={A}_{0,\alpha ,M}+{A}_{1,\alpha ,M}\left(1-\rho \right)+{A}_{2,\alpha ,M}{\left(1-\rho \right)}^{2}+{A}_{3,\alpha ,M}{\left(1-\rho \right)}^{3}$

The coefficients of the polynomial ${Q}_{\alpha ,M}\left(\rho \right)$ are adapted from Thompson [12] (p. 368) and are reported in Table 2 for two different models, for ready reference.

8) Reject the null hypothesis ${H}_{0}:\pi =0$ for model *M*, if $\stackrel{^}{\beta}<{Q}_{\alpha ,M}\left(\stackrel{^}{\rho}\right)$ at the $\alpha \%$ level of significance.
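Several of the steps above reduce to short computations. The following Python sketch is a hedged illustration with hypothetical function names, not the authors' R code. It assumes that the square brackets in the maximum-lag rule denote the integer part, it takes the divisor in the variance estimates as the vector length unless told otherwise, and it computes the statistic only for the special case with an intercept, no trend, and no lagged differences, where the projection *P* simply demeans. The polynomial coefficients must be supplied by the user from Thompson [12]; none are assumed here.

```python
import math


def max_lag(n):
    """Maximum lag: integer part of 12 * (n / 100)^(1/4) (assumed reading)."""
    return math.floor(12.0 * (n / 100.0) ** 0.25)


def nuisance_estimates(eps_hat, psi_vals, divisor=None):
    """Step 5: return (sigma_eps, sigma_psi, rho) from residuals and psi values."""
    n = len(eps_hat) if divisor is None else divisor
    e_bar = sum(eps_hat) / len(eps_hat)
    p_bar = sum(psi_vals) / len(psi_vals)
    ss_e = sum((e - e_bar) ** 2 for e in eps_hat)
    ss_p = sum((p - p_bar) ** 2 for p in psi_vals)
    cross = sum((e - e_bar) * (p - p_bar) for e, p in zip(eps_hat, psi_vals))
    sigma_eps = math.sqrt(ss_e / n)
    sigma_psi = math.sqrt(ss_p / n)
    return sigma_eps, sigma_psi, cross / (n * sigma_eps * sigma_psi)


def stat_intercept_only(x_lag, psi_vals, sigma_psi):
    """Step 6 when Z_1 is a column of ones, so that P x = x - mean(x)."""
    x_bar = sum(x_lag) / len(x_lag)
    num = sum((x - x_bar) * p for x, p in zip(x_lag, psi_vals))
    den = math.sqrt(sum((x - x_bar) ** 2 for x in x_lag))
    return num / (den * sigma_psi)


def critical_value(rho, coeffs):
    """Step 7: cubic polynomial in (1 - rho); coeffs = (A0, A1, A2, A3)."""
    a0, a1, a2, a3 = coeffs
    u = 1.0 - rho
    return a0 + a1 * u + a2 * u ** 2 + a3 * u ** 3


def reject_unit_root(stat, rho_hat, coeffs):
    """Step 8: reject when the statistic falls below the critical value."""
    return stat < critical_value(rho_hat, coeffs)
```

The estimate of $\rho $ does not depend on the divisor, because it cancels between the numerator and the two standard deviations; the estimate of ${\sigma}_{\psi}$ does.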

5. Monte Carlo Evidence

In this section, using Monte Carlo simulation, we evaluate the small sample performance of the tests for sample size 100. For our simulations, we have considered two sets of values for $\left({a}_{0},{a}_{1}\right)$. For each such set, we have assumed three different error processes for ${v}_{t}$. Finally, we have assumed four different distributions for the errors ${\epsilon}_{t}$. Thus, a total of 24 different models have been used in the simulations. On all these 24 models, five different tests, based on different reference densities, have been investigated. All the calculations have been performed in RStudio.

Table 2. Estimated coefficients*.

*The suffix *M* is removed for notational simplicity. Source: Thompson [12] (p. 368).

Data have been generated according to the model defined below.

${y}_{t}={a}_{0}+{a}_{1}t+{u}_{t},\text{\hspace{0.17em}}\text{\hspace{0.17em}}{u}_{t}-{u}_{t-1}=\pi {u}_{t-1}+{v}_{t}$

Two sets of values considered for $\left({a}_{0},{a}_{1}\right)$ are $\left(1,0\right)$ and $\left(1,1\right)$. Three selected error processes for ${v}_{t}$ are:

1) IID: ${v}_{t}={\epsilon}_{t}$,

2) AR (1): ${v}_{t}=0.5{v}_{t-1}+{\epsilon}_{t}$, and

3) MA (1): ${v}_{t}={\epsilon}_{t}+0.5{\epsilon}_{t-1}$.

We set the initial condition ${\epsilon}_{1}=0$.

The error process ${\epsilon}_{t}$ has been generated from the following four distributions.

1) Standard normal distribution.

2) Student *t* distribution with $df=3$.

3) Lognormal distribution with mean centred at zero.

4) Chi-square distribution with $df=1$ with mean centred at zero.
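The simulation design described above can be sketched as follows. This is an illustrative Python version with hypothetical names, not the authors' R code. Two details are assumptions: the lognormal errors use log-mean 0 and log-standard deviation 1 before centring (the paper does not state the parameters), and ${u}_{0}=0$ is taken from Section 3.

```python
import math
import random


def draw_error(rng, dist):
    """One draw of epsilon_t from one of the four error distributions."""
    if dist == "normal":
        return rng.gauss(0.0, 1.0)
    if dist == "t3":
        z = rng.gauss(0.0, 1.0)
        chi2 = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(3))
        return z / math.sqrt(chi2 / 3.0)
    if dist == "lognormal":
        # Assumed parameters (0, 1); exp(0.5) is the mean removed by centring.
        return rng.lognormvariate(0.0, 1.0) - math.exp(0.5)
    if dist == "chisq1":
        return rng.gauss(0.0, 1.0) ** 2 - 1.0  # chi-square(1) has mean 1
    raise ValueError("unknown distribution: " + dist)


def simulate_series(rng, n, c, a0, a1, process="iid", dist="normal"):
    """Generate y_1..y_n with pi = c / n and the chosen process for v_t."""
    pi = c / n
    eps_prev = v_prev = u_prev = 0.0
    y = []
    for t in range(1, n + 1):
        # Initial condition from the text: epsilon_1 = 0.
        eps = 0.0 if t == 1 else draw_error(rng, dist)
        if process == "iid":
            v = eps
        elif process == "ar1":
            v = 0.5 * v_prev + eps
        elif process == "ma1":
            v = eps + 0.5 * eps_prev
        else:
            raise ValueError("unknown process: " + process)
        u = (1.0 + pi) * u_prev + v  # u_t - u_{t-1} = pi * u_{t-1} + v_t
        y.append(a0 + a1 * t + u)
        eps_prev, v_prev, u_prev = eps, v, u
    return y
```

For instance, `simulate_series(random.Random(42), 100, 0.0, 1.0, 0.0, "ma1", "chisq1")` produces one unit-root series for the drift-only model with MA (1) errors.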

Five tests, with the following notations, have been used for all the 24 models described above.

1) ADF: Augmented Dickey-Fuller test.

2) T3: Student *t* distribution with $df=3$ as the reference density.

3) PADF: Partially adaptive estimation method proposed by Lima and Xiao [19].

4) PIV: Pearson Type IV distribution as the reference density.

5) JHSU: Johnson SU distribution as the reference density.

The ADF test has been performed with the function *ur.df* from the R package *urca*, setting the lag length at ${k}_{\text{maic}}$. The JHSU test has been performed according to the steps from 1 - 4 described in Section 4. To estimate the parameters of the Johnson SU density by the maximum likelihood method, we use the *constrOptim* function.

We have performed 1000 replications of each test for a sample size of 100. All the results have been reported at a 5% significance level. The numbers in the tables below represent the rejection ratios of the null hypothesis by different tests among 1000 replications. We have also investigated the ERS (Elliot *et al*. [4]) test and compared it with ours. We have found that the power of the JHSU test is better than that of the ERS test for the asymmetric error process. Hence, we have not reported the results of the ERS test.
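When reading rejection ratios based on 1000 replications, it can help to keep the Monte Carlo sampling error in mind. The sketch below is an added illustration, not something reported in the paper: it treats the replications as independent Bernoulli trials and uses the usual binomial standard error. The function names are hypothetical.

```python
import math


def rejection_ratio(rejections):
    """Share of replications in which the null hypothesis was rejected."""
    rejections = list(rejections)
    return sum(1 for r in rejections if r) / len(rejections)


def binomial_se(p, replications):
    """Standard error of a rejection ratio p over independent replications."""
    return math.sqrt(p * (1.0 - p) / replications)
```

Under these assumptions, a test with true size 5% has a standard error of about 0.007 over 1000 replications, so differences of one percentage point between rejection ratios are within simulation noise.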

Tables 3-5 report the results for the *intercept-only* cases with error process ${v}_{t}$ as IID, AR (1), and MA (1) respectively. Tables 6-8 report the results when the *time trend* is included in the model, again with the error process ${v}_{t}$ as IID, AR (1), and MA (1) respectively. The first column in each table represents the assumed distribution of ${\epsilon}_{t}$, the second gives the values of *C*, the third column shows the sample size, and the fourth to eighth columns report the rejection ratios when tests 1 - 5 (described above) are respectively applied to the DGP. The boldface numbers in the tables indicate the highest rejection ratios. The R code of the simulation studies is available upon request.

Table 3. Rejection ratios of the null hypothesis of different tests among 1000 replications with drift only model, *i.e.*, $\left({a}_{0},{a}_{1}\right)=\left(1,0\right)$, taking i.i.d. error process, 5% significance level.

Table 4. Rejection ratios of the null hypothesis of different tests among 1000 replications with drift only model, *i.e.*, $\left({a}_{0},{a}_{1}\right)=\left(1,0\right)$, taking AR (1) error process, 5% significance level.

Table 5. Rejection ratios of the null hypothesis of different tests among 1000 replications with drift only model, *i.e.*, $\left({a}_{0},{a}_{1}\right)=\left(1,0\right)$, taking MA (1) error process, 5% significance level.

Table 6. Rejection ratios of the null hypothesis of different tests among 1000 replications with time trend model, *i.e.*, $\left({a}_{0},{a}_{1}\right)=\left(1,1\right)$, taking i.i.d. error process, 5% significance level.

Table 7. Rejection ratios of the null hypothesis of different tests among 1000 replications with time trend model, *i.e.*, $\left({a}_{0},{a}_{1}\right)=\left(1,1\right)$, taking AR (1) error process, 5% significance level.

Table 8. Rejection ratios of the null hypothesis of different tests among 1000 replications with time trend model, *i.e.*, $\left({a}_{0},{a}_{1}\right)=\left(1,1\right)$, taking MA (1) error process, 5% significance level.

The above results suggest that the JHSU test has very good small sample performance in terms of power. The JHSU test is as powerful as the ADF test when the innovation process is Gaussian and has a substantial gain in power when the errors are non-Gaussian. The JHSU test also has higher power than the PIV, T3, and PADF tests for lognormal and chi-square error distributions.

6. Empirical Evidence

We have considered two data sets for application. The first is the extended Nelson and Plosser (1982) data set and the second is the nominal monthly interest rate of India from January 2005 to March 2017. The extended Nelson and Plosser data set is openly available as NPEXT in the R package urca (Unit Root and Cointegration Tests for Time Series Data; rdrr.io). The second data set is collected from the International Monetary Fund International Financial Statistics (IMF-IFS) database (https://data.imf.org/?sk=4c514d48-b6ba-49ed-8ab9-52b0c1a0179b&sId=1409151240976).

First, we consider the case of the Nelson and Plosser data set. Many researchers have used the data of Nelson and Plosser [30] to investigate whether macroeconomic time series are random walks or stationary processes around a level or a trend. As their data set is considered a testing ground for new procedures, we also implement our proposed method on the Nelson and Plosser data. Lag length is obtained by the MAIC criterion, setting the maximum lag at $\left[12{\left(\frac{N}{100}\right)}^{1/4}\right]$, where *N* is the sample size.

Table 9 reports the Jarque-Bera test statistic for all the series of Nelson and Plosser data sets. Table 10 contains the unit root analysis. We considered three tests, viz., ADF, ERS, and JHSU. We have taken the time trend model for our analysis.

We observe from Table 10 that the JHSU test rejects the hypothesis of a unit root for “GNP per Capita” at the 1% level and for “Unemployment” at the 5% level. For “Real GNP” and “Unemployment”, the ERS test rejects only at the 5% level, and it rejects for no series at the 1% level. For “GNP per Capita”, the ADF test rejects at the 5% level.

The normality assumption is rejected at the 5% significance level for the “GNP per Capita” series. The JHSU test rejects the unit root hypothesis at the 1% level and the ADF test rejects it at the 5% significance level. The ERS test is not able to reject the null hypothesis. This illustrates the power improvement of the JHSU test.

From Table 10, we note that for the “Real GNP” series, ERS rejects the null but ADF cannot reject the null at 5% significance level. Also, for the “Real GNP” series, the Normality assumption is rejected (See Table 9) at 5% significance level. JHSU test cannot reject the null for the “Real GNP” series. This supports the finding of the JHSU test for the Real GNP series.

JHSU and ERS tests give the same result (reject the null at 5% significance level) when the normality assumption has not been rejected, e.g., for the “Unemployment” series.

Table 9. Jarque-Bera statistic of Nelson Plosser data set.

Table 10. Unit root analysis of Nelson Plosser data set.

Note: * 5%; ** 1%.

Table 11. Descriptive statistics of interest rate of India.

Table 12. Unit root analysis of interest rate of India.

In our second study, we consider the case of the nominal monthly interest rate of India from January 2005 to March 2017. Table 11 gives the descriptive statistics and Table 12 reports the unit root analysis.

Here, we use the drift-only model. In this study, the series has excess kurtosis (Table 11) and also rejects the normality assumption. JHSU test rejects the unit root hypothesis at 5% significance level (Table 12) but other tests do not. Again, this shows that the JHSU test has higher power than that of others.

7. Conclusions

In this paper, we have explored the unit root test based on the Johnson SU distribution as a reference density. A step-by-step method for computing the test statistic is detailed. Monte Carlo evidence shows a significant power improvement over the ADF test when the innovations are non-Gaussian. The choice of JHSU is much better than other reference densities that are used in the literature. The JHSU test is very powerful for asymmetric error processes. The JHSU test dominates the partially adaptive estimation method proposed by Lima and Xiao (2010). It also dominates Pearson Type IV and Student ${t}_{3}$ density-based tests when the error follows asymmetric distributions. For symmetric errors, the JHSU test performs as well as most other traditional procedures.

We have also obtained very satisfactory results when the proposed test procedure has been applied to real data sets. Therefore, the JHSU test can be a viable and much better option for the practitioners and researchers.

Apart from the application of this test procedure, future studies can be carried out on whether one should use a deterministic time trend model or drift while testing stationarity using the JHSU test.

Acknowledgements

The authors wish to thank the editor and the referee for their valuable comments that helped improve the presentation of the paper.

Conflicts of Interest

The authors declare no conflicts of interest.

References

[1] Dickey, D.A. and Fuller, W.A. (1979) Distribution of the Estimators for Autoregressive Time Series with a Unit Root. Journal of the American Statistical Association, 74, 427-431. https://doi.org/10.1080/01621459.1979.10482531

[2] Stock, J.H. (1994) Unit Roots, Structural Breaks and Trends. Handbook of Econometrics, 4, 2739-2841. https://doi.org/10.1016/S1573-4412(05)80015-7

[3] Phillips, P.C.B. and Xiao, Z. (1998) A Primer on Unit Root Testing. Journal of Economic Surveys, 12, 423-467. https://doi.org/10.1111/1467-6419.00064

[4] Elliot, G., Rothenberg, T.J. and Stock, J.H. (1996) Efficient Tests for an Autoregressive Unit Root. Econometrica, 64, 813-836. https://doi.org/10.2307/2171846

[5] Cox, D. and Llatas, I. (1991) Maximum Likelihood Type Estimation for Nearly Nonstationary Autoregressive Time Series. Annals of Statistics, 19, 1109-1128. https://doi.org/10.1214/aos/1176348240

[6] Lucas, A. (1995) Unit Root Test Based on M Estimators. Econometric Theory, 11, 331-346. https://doi.org/10.1017/S0266466600009191

[7] Rothenberg, T.J. and Stock, J.H. (1997) Inference in a Nearly Integrated Autoregressive Model with Nonnormal Innovations. Journal of Econometrics, 80, 269-286. https://doi.org/10.1016/S0304-4076(97)00040-7

[8] Knight, K. (1991) Limit Theory for M Estimators in an Integrated Infinite Variance Process. Econometric Theory, 7, 200-212. https://doi.org/10.1017/S0266466600004400

[9] Herce, M.A. (1996) Asymptotic Theory of LAD Estimation in a Unit Root Process with Finite Variance Errors. Econometric Theory, 12, 129-153. https://doi.org/10.1017/S0266466600006472

[10] Xiao, Z. (2001) Likelihood Based Inference in Trending Time Series with a Root near Unity. Econometric Theory, 17, 1082-1112. https://doi.org/10.1017/S0266466601176036

[11] Thompson, S.B. (2004) Optimal versus Robust Inference in Nearly Integrated Non-Gaussian Models. Econometric Theory, 20, 23-55. https://doi.org/10.1017/S0266466604201025

[12] Thompson, S.B. (2004) Robust Tests of the Unit Root Hypothesis Should Not Be “Modified”. Econometric Theory, 20, 360-381. https://doi.org/10.1017/S0266466604202055

[13] Hansen, B.E. (1995) Rethinking the Univariate Approach to Unit Root Testing, Using Covariates to Increase Power. Econometric Theory, 11, 1148-1171. https://doi.org/10.1017/S0266466600009993

[14] Hasan, M.N. and Koenker, R.W. (1997) Robust Rank Test of Unit Root Hypothesis. Econometrica, 65, 133-161. https://doi.org/10.2307/2171816

[15] Shin, D.W. and So, B.S. (1999) Unit Root Test Based on Adaptive Maximum Likelihood Estimation. Econometric Theory, 15, 1-23. https://doi.org/10.1017/S0266466699151016

[16] Koenker, R. and Xiao, Z. (2004) Unit Root Quantile Autoregression Inference. Journal of the American Statistical Association, 99, 775-787. https://doi.org/10.1198/016214504000001114

[17] Choi, I. (2015) Almost All about Unit Roots. Cambridge University Press, Cambridge. https://doi.org/10.1017/CBO9781316157824

[18] Potscher, B.M. and Prucha, I.R. (1986) A Class of Partially Adaptive One-Step M Estimators for the Nonlinear Regression Model with Dependent Observations. Journal of Econometrics, 32, 219-251. https://doi.org/10.1016/0304-4076(86)90039-4

[19] Lima, L.R. and Xiao, Z. (2010) Testing Unit Root Based on Partially Adaptive Estimation. Journal of Time Series Econometrics, 2, 1-32. https://doi.org/10.2202/1941-1928.1038

[20] Hallin, M., Van Den Akker, R. and Werker, B.J. (2011) A Class of Simple Distribution Free Rank-Based Unit Root Tests. Journal of Econometrics, 163, 200-214. https://doi.org/10.1016/j.jeconom.2011.03.007

[21] Nagahara, Y. (1999) The PDF and CF of Pearson Type IV Distributions and the ML Estimation of the Parameters. Statistics and Probability Letters, 43, 251-264. https://doi.org/10.1016/S0167-7152(98)00265-X

[22] Bhattacharyya, M. and Siddarth, M. (2012) A Comparison of VaR Estimation Procedures for Leptokurtic Equity Index Returns. Journal of Mathematical Finance, 2, 13-30. https://doi.org/10.4236/jmf.2012.21002

[23] Bhattacharyya, M., Chaudhary, A. and Yadav, G. (2008) Conditional VaR Estimation Using Pearson Type IV Distribution. European Journal of Operational Research, 191, 386-397. https://doi.org/10.1016/j.ejor.2007.07.021

[24] Bhattacharyya, M., Nityanand, M. and Bharat, K. (2009) MaxVaR for Non-Normal Heteroskedastic Returns. Quantitative Finance, 9, 925-935. https://doi.org/10.1080/14697680802595684

[25] Johnson, N.L. (1949) Systems of Frequency Curves Generated by Method of Translation. Biometrika, 36, 149-176. https://doi.org/10.1093/biomet/36.1-2.149

[26] Thompson, S.B. (2004) Robust Confidence Interval for Autoregressive Coefficients near One. In: Andrews, D. and Stock, J., Eds., Identification and Inference for Econometric Models: Essays in Honor of Thomas Rothenberg, Cambridge University Press, Cambridge, 375-402. https://doi.org/10.1017/CBO9780511614491.017

[27] Chan, N.H. and Wei, C.Z. (1987) Asymptotic Inference for Nearly Nonstationary AR (1) Process. Annals of Statistics, 15, 1050-1063. https://doi.org/10.1214/aos/1176350492

[28] Phillips, P.C.B. (1987) Towards a Unified Asymptotic Theory for Autoregression. Biometrika, 74, 535-547. https://doi.org/10.1093/biomet/74.3.535

[29] Ng, S. and Perron, P. (2001) Lag Length Selection and Construction of Unit Root Tests with Good Size and Power. Econometrica, 69, 1519-1554. https://doi.org/10.1111/1468-0262.00256

[30] Nelson, C.R. and Plosser, C.R. (1982) Trends and Random Walks in Macroeconomic Time Series: Some Evidence and Implication. Journal of Monetary Economics, 10, 139-162. https://doi.org/10.1016/0304-3932(82)90012-5


Copyright © 2024 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.