Partial Time-Varying Coefficient Regression and Autoregressive Mixed Model

Hui Li; Zhiqiang Cao

doi:10.4236/ojs.2023.134026

Open Journal of Statistics > Vol.13 No.4, August 2023

Partial Time-Varying Coefficient Regression and Autoregressive Mixed Model

Hui Li¹, Zhiqiang Cao^2*
¹School of Statistics, Beijing Normal University, Beijing, China.
²Department of Mathematics, College of Big Data and Internet, Shenzhen Technology University, Shenzhen, China.
DOI: 10.4236/ojs.2023.134026 PDF HTML XML 223 Downloads 1,200 Views

Abstract

Regression and autoregressive mixed models are classical models used to analyze the relationship between time series response variable and other covariates. The coefficients in traditional regression and autoregressive mixed models are constants. However, for complicated data, the coefficients of covariates may change with time. In this article, we propose a kind of partial time-varying coefficient regression and autoregressive mixed model and obtain the local weighted least-square estimators of coefficient functions by the local polynomial technique. The asymptotic normality properties of estimators are derived under regularity conditions, and simulation studies are conducted to empirically examine the finite-sample performances of the proposed estimators. Finally, we use real data about Lake Shasta inflow to illustrate the application of the proposed model.

Keywords

Regression and Autoregressive, Time Series, Partial Time-Varying Coefficient, Local Polynomial

Share and Cite:

Li, H. and Cao, Z. (2023) Partial Time-Varying Coefficient Regression and Autoregressive Mixed Model. Open Journal of Statistics, 13, 514-533. doi: 10.4236/ojs.2023.134026.

1. Introduction

Regression and autoregressive mixed models are widely used tools to study the correlation between random indicators of time series types and their influencing factors. The expression of the model is

$Y_{i} = α_{0} + α_{1} X_{1 i} + \dots + α_{s} X_{s i} + β_{1} Y_{i - 1} + \dots + β_{p} Y_{i - p} + ε_{i},$ (1)

where time series ${Y_{i}}$ , $i = p + 1, \dots, n$ , is the dependent variable; time series ${X_{j i}}, j = 1, \dots, s$ , represents s covariates; with $α_{j}$ as their regression coefficients; $α_{0}$ is the intercept term; $β_{j}, j = 1, \dots, p$ , represents p autoregressive coefficients and satisfies stationary conditions; and ${ε_{i}}$ is a white noise series, which is independent of both the covariates ${X_{j i}}$ and response variable ${Y_{i}}$ . Model (1) has been studied extensively in classical textbooks [1] [2] . Model (1) can also be regarded as a special case of an autoregressive model with exogenous variables or an autoregressive distributed lag model [3] , which has a wide range of applications in finance, econometrics and biomedicine.

The unknown parameters in model (1) are assumed to be constant, indicating that the effects between the dependent variable and the covariates will not change with the observation time. However, in practice, complex nonlinear correlations between dependent variable and covariates are sometimes encountered. The varying-coefficient (or functional coefficient) time series model is an effective tool to handle data with such characteristics. The so-called varying-coefficient is that the parameters in the model are not constants but functions of some variables such as observation time or some delay components of time series. If the expression of the coefficient function in the model is known, it is called a parametric varying-coefficient model. Otherwise, it is a nonparametric model. It is well known that a parametric model will fit better than a nonparametric model if the coefficient function is known. However, if we make a mistake in the expression of the coefficient function, then it will cause serious deviations in the estimation and even result in misleading outcomes. In this case, the nonparametric model has more advantages. Nonparametric varying-coefficient models in time series were studied extensively, see [4] - [16] . In addition, there is a long history of models in the Bayesian realm to track the issue, where the time-variability of coefficients is (semi-) automatically shrunk to constancy in various ways [17] [18] [19] .

It is noted that many scholars have carried out research on model extension for the time-varying coefficient model (i.e., coefficients are functions of time) proposed by Robinson [4] , see [20] - [29] . Since the model in Robinson [4] is the basis of many studies and the model proposed in this article is an extension of the time-varying coefficient model of Robinson [4] and Cai [9] , we first give the model expression:

$Y_{i} = β_{0} (t_{i}) + β_{1} (t_{i}) X_{1 i} + \dots + β_{d} (t_{i}) X_{d i} + ε_{i}, i = 1, \dots, n,$ (2)

where time series ${Y_{i}}$ is the dependent variable; time series ${X_{j i}}, j = 1, \dots, d$ , represent d covariates; $β_{j} (\cdot)$ , $j = 0, \dots, d$ , are $d + 1$ coefficient functions of observation time whose form are completely unknown and satisfy a certain degree of smoothness; and ${ε_{i}}$ is an i.i.d (independent and identical distribution) series and is independent of ${X_{j i}}$ . It should be noted that the independent variable of the coefficient function is $t_{i} = i / n$ instead of i. This is a constraint that we need to satisfy when deducing large-sample properties of nonparametric smooth estimators of coefficients in a time-varying coefficient model, and the specific reasons can be seen in [4] [5] [9] .

To avoid the “curse of dimensionality”, many researchers have studied nonparametric estimation methods for varying-coefficient models. For example, Robinson [4] proposed a pseudo-local maximum likelihood estimation method to study the local estimates of functional coefficients in the time-varying coefficient time series model (2). Chen and Tsay [6] and Hastie and Tibshirani [30] proposed a kernel smooth nonparametric estimation method for functional coefficients in autoregressive models. Ramsay and Silverman [31] and Brumback and Rice [32] estimated the parameters of the varying-coefficient regression model by using the B-spline method. Cai et al. [8] studied local polynomial estimates for functional-coefficient regression models. Huang et al. [33] proposed a B-spline estimation for a varying-coefficient regression model based on repeated measures data. Aneiros-Perez and Vieu [34] used Nadaraya-Watson-type weights to estimate the semi-functional partial linear regression model (SFPLR model) and derived the corresponding asymptotic properties of the estimators. Li et al. [20] and Fan et al. [22] applied local polynomial expansion techniques to explore parameter estimation for time series models with partial time dependencies. Liu et al. [10] used a local linear approach to estimate the nonparametric trend and seasonal effect functions. Cai et al. [11] adopted a nonparametric generalized method of moments to estimate a new class of semiparametric dynamic panel data models. Chen et al. [23] applied a local linear method with cross-validation bandwidth selection to estimate the time-varying coefficients of the heterogeneous autoregressive model. By combining B-spline smoothing and the local linear method, Hu et al. [27] proposed a two-step estimation method for a time-varying additive model. Tu and Wang [14] applied adaptive kernel weighted least squares to estimate functional coefficient cointegration models. Li et al. [28] used classical kernel smoothing methods to estimate the coefficient functions in nonlinear cointegrating models. Karmakar et al. [29] applied local linear M-estimation to estimate the time-varying coefficients of a general class of nonstationary time series, among others.

For time series data with complex correlations between sample components, sometimes neither the constant coefficient regression and autoregressive mixed model (1) nor the time-varying coefficient regression model (2) can fit the data well. Motivated by this, in this article, we introduce the idea of a time-varying coefficient into a simple mixed model (1) and propose a partial time-varying coefficient regression and autoregressive mixed model. The proposed model can not only estimate the constant effects of some covariates on the dependent variable but also characterize the non-constant effects of other covariates, which greatly increases the flexibility and scope of the models (1) and (2). To the best of our knowledge, explicitly introducing part of the delay terms of the dependent variable into time-varying regression models as covariates is less actively studied. However, for time series with a strong correlation between components, sometimes introducing the delay term of the dependent variable as part of covariates can make full use of the data information and improve the model fitting degree. We will study this problem from the way of Cai [9] in this article, although the Bayesian parallel stream can be incorporated and put into perspective.

The remaining parts of this article are organized as follows. In Section 2, we introduce a partial time-varying coefficient regression and autoregressive mixed model, give the estimation method of model parameters and derive the large sample properties of the proposed estimators. We conduct a simulation study in Section 3 to examine the finite-sample performances of the proposed estimators. In Section 4, we use Shasta Lake inflow data to illustrate the application value of the model. Finally, we conclude with a discussion in Section 5.

2. Model and Estimation

2.1. Proposed Model

As mentioned in Section 1, many researchers have studied different types of varying-coefficient regression models, but research on partial time-varying coefficient regression and autoregressive mixed models is less actively studied. To fill this gap and study the correlation effects of dependent variables with covariates and delay components of partial dependent variables, we propose the following partial time-varying coefficient regression and autoregressive mixed model:

$Y_{i} = α_{1}^{T} (t_{i}) X_{i, s_{1}} + α_{2}^{T} X_{i, s_{2}} + β_{1}^{T} (t_{i}) Y_{i, p_{1}} + β_{2}^{T} Y_{i, p_{2}} + ε_{i},$ (3)

where $α_{1} (\cdot) = {α_{1, 1} (\cdot), \dots, α_{1, s_{1}} (\cdot)}^{T}$ with the superscript T as transposition, $t_{i} = i / n$ with $i = p_{1} + p_{2} + 1, \dots, n$ , $α_{2} = {(α_{2,1}, \dots, α_{2, s_{2}})}^{T}$ , $β_{1} (\cdot) = {β_{1, 1} (\cdot), \dots, β_{1, p_{1}} (\cdot)}^{T}$ , $β_{2} = {(β_{2,1}, \dots, β_{2, p_{2}})}^{T}$ , $X_{i, s_{1}} = {(X_{i, 1}, \dots, X_{i, s_{1}})}^{T}$ , $X_{i, s_{2}} = {(X_{i, (s_{1} + 1)}, \dots, X_{i, (s_{1} + s_{2})})}^{T}$ , $Y_{i, p_{1}} = {(Y_{i - i_{1}}, \dots, Y_{i - i_{p_{1}}})}^{T}$ , $Y_{i, p_{2}} = {(Y_{i - i_{p_{1} + 1}}, \dots, Y_{i - i_{p_{1} + p_{2}}})}^{T}$ with $i_{1}, \dots, i_{p_{1} + p_{2}}$ as a permutation of time indicator $1, \dots, p_{1} + p_{2}$ . For the convenience of derivation, in this article, we set $i_{1}, \dots, i_{p_{1} + p_{2}}$ to be the order from 1 to $(p_{1} + p_{2})$ ; $s = s_{1} + s_{2}$ and $p = p_{1} + p_{2}$ represent the number of regression covariates and the order of autoregression, respectively; and ${ε_{i}}$ is assumed to be a white noise series. Model (3) is an iterative formula for the sequence ${Y_{i}}$ . To make the time series ${Y_{i}}$ have a solution, certain constraints on the model parameters should be made, such as the autoregressive coefficient satisfies the stationary condition and $E (ε_{i} Y_{j}) = 0$ when $j < i$ . Since the autoregressive coefficients in model (3) are not necessarily constant, the corresponding constraints are stricter; for example, the time series in the model satisfies the α-mixing condition. We will give the conditions that need to be satisfied in Section 2.3.

Model (3) enhances the generalizability of time series regression models and contains many existing statistical models as specials. For example, when $α_{1} (\cdot) = 0$ , $β_{1} (\cdot) = 0$ , $β_{2} = 0$ , model (3) is the traditional linear regression model; when $α_{2} = 0$ , $β_{1} (\cdot) = 0$ , $β_{2} = 0$ , model (3) reduces to the time-varying coefficient linear regression model; when $β_{1} (\cdot) = 0$ , $β_{2} = 0$ , model (3) is a partial time-varying coefficient linear regression model; when $α_{1} (\cdot) = 0$ , $α_{2} = 0$ , $β_{1} (\cdot) = 0$ , model (3) is the autoregressive time series model; when $α_{1} (\cdot) = 0$ , $α_{2} = 0$ , $β_{2} = 0$ , model (3) is the time-varying coefficient autoregressive time series model; when $α_{1} (\cdot) = 0$ , $β_{1} (\cdot) = 0$ , model (3) becomes the traditional constant coefficient regression and autoregressive mixed model.

Model (3) is a semi-parametric model since some of its parameters are unknown functions, and we need to apply non-parametric estimation methods, such as local polynomial expansion methods [35] and spline techniques [36] . In this article, we combine the local polynomial expansion technique with the least squares estimation method to estimate unknown model parameters.

2.2. Estimation Method

First, we briefly introduce the local polynomial expansion method. The main idea of this method is that for a function whose form is completely unknown but satisfies certain smoothness conditions at a fixed local point, we apply Taylor expansion to approximate the function as a polynomial about the local point. By using this method, the estimation of the function can be transformed into a parameter estimation problem of local polynomial coefficients. In this article, we elucidate the estimation method by using the first-order Taylor expansion. The higher-order Taylor expansion only increases the number of parameters to be estimated, and there is no essential difference in the algorithm. Specifically, the coefficient functions $α_{1} (\cdot)$ and $β_{1} (\cdot)$ in model (3) are assumed to have a second continuous derivative, denoted as ${α^{'}}_{1} (\cdot)$ , ${α^{″}}_{1} (\cdot)$ , ${β^{'}}_{1} (\cdot)$ and ${β^{″}}_{1} (\cdot)$ , respectively. $\forall t_{0} \in (0,1)$ , by Taylor expansion, we have

$α_{1} (t_{i}) \approx α_{1} (t_{0}) + {α^{'}}_{1} (t_{0}) (t_{i} - t_{0}), β_{1} (t_{i}) \approx β_{1} (t_{0}) + {β^{'}}_{1} (t_{0}) (t_{i} - t_{0}) .$

Then, in the local neighborhood of $t_{0}$ , model (3) can be approximated as follows:

$\begin{matrix} Y_{i} \approx X_{i, s_{1}}^{T} α_{1} (t_{0}) + Y_{i, p_{1}}^{T} β_{1} (t_{0}) + X_{i, s_{2}}^{T} α_{2} (t_{0}) + Y_{i, p_{2}}^{T} β_{2} (t_{0}) \\ + X_{i, s_{1}}^{T} {α^{'}}_{1} (t_{0}) (t_{i} - t_{0}) + Y_{i, p_{1}}^{T} {β^{'}}_{1} (t_{0}) (t_{i} - t_{0}) + ε_{i} \\ = γ^{T} Z_{i} + ε_{i}, i = 1, \dots, n, \end{matrix}$

where

$γ = γ (t_{0}) = {α_{1}^{T} (t_{0}), β_{1}^{T} (t_{0}), α_{2}^{T} (t_{0}), β_{2}^{T} (t_{0}), {({α^{'}}_{1} (t_{0}))}^{T}, {({β^{'}}_{1} (t_{0}))}^{T}}^{T},$

and

$Z_{i} = Z_{i} (t_{0}) = {X_{i, s_{1}}^{T}, Y_{i, p_{1}}^{T}, X_{i, s_{2}}^{T}, Y_{i, p_{2}}^{T}, X_{i, s_{1}}^{T} (t_{i} - t_{0}), Y_{i, p_{1}}^{T} (t_{i} - t_{0})}^{T} .$

We can obtain a locally weighted least squares estimate of the coefficient function in the model (3) by taking the minimum of the weighted sum of squared errors as follows:

$\sum_{i = p + 1}^{n} K_{h} (t_{i} - t_{0}) {(Y_{i} - γ^{T} Z_{i})}^{2},$

where $K (\cdot)$ is the kernel function, , and h is the bandwidth. Using the least squares estimation theory, it is not difficult to obtain the estimation of parameter $γ$ , denoted as $\hat{γ}$ , which has the following expression:

$\hat{γ} = \hat{γ} (t_{0}) = {(Z^{T} K Z)}^{- 1} Z^{T} K Y,$ (4)

where $K = diag {K_{h} (t_{p + 1} - t_{0}), \dots, K_{h} (t_{n} - t_{0})}$ , $Y = {(Y_{p + 1}, \dots, Y_{n})}^{T}$ , and $Z = Z (t_{0})$ is a $(n - p) \times (p + s + s_{1} + p_{1})$ matrix with the kth row as ${Z_{k + p} (t_{0})}^{T}, k = 1, \dots, n - p$ .

For the residual term $ε_{i}$ in model (3), its variance can be estimated locally as:

${\hat{σ}}^{2} = {\hat{σ}}^{2} (t_{0}) = \frac{1}{n - p} \sum_{i = p + 1}^{n} {(Y_{i} - {\hat{γ}}^{T} Z_{i})}^{2} .$

2.3. Asymptotic Normality

For the convenience of description, we introduce the following definitions and notations. Let $H$ be a diagonal matrix of order $(p + s + p_{1} + s_{1})$ ; the first $p + s$ elements on the main diagonal are 1, and the other $p_{1} + s_{1}$ elements of the main diagonal are h. Let $μ_{j} = \int u^{j} K (u) d u$ , $ν_{j} = \int u^{j} K^{2} (u) d u$ for $j = 0, 1, 2, 3$ . Denote $G_{i} = {(X_{i, s_{1}}^{T}, Y_{i, p_{1}}^{T}, X_{i, s_{2}}^{T}, Y_{i, p_{2}}^{T})}^{T}$ , $V_{i} = {(X_{i, s_{1}}^{T}, Y_{i, p_{1}}^{T})}^{T}$ , $i = p + 1, \dots, n$ and $Ω_{1} = Ω_{1} (t_{0}) = E {G_{i} G_{i}^{T} | t = t_{0}}$ , $Ω_{2} = Ω_{2} (t_{0}) = E {G_{i} V_{i}^{T} | t = t_{0}}$ , $Ω_{3} = Ω_{3} (t_{0}) = E {V_{i} V_{i}^{T} | t = t_{0}}$ . Define

$D = D (t_{0}) = (\begin{matrix} D_{1} & D_{2} \\ D_{2}^{T} & D_{3} \end{matrix}), Σ = Σ (t_{0}) = (\begin{matrix} Σ_{1} & Σ_{2} \\ Σ_{2}^{T} & Σ_{3} \end{matrix}),$

where $D_{i} = D_{i} (t_{0}) = μ_{i - 1} Ω_{i} (t_{0}), i = 1,2,3$ , and the definition of $Σ_{i} = Σ_{i} (t_{0})$ is provided in the Appendix.

Throughout the derivation, we impose the following conditions:

C1 The measurable functions $α_{1, i} (\cdot) (i = 1, \dots, s_{1})$ and $β_{1, j} (\cdot) (j = 1, \dots, p_{1})$ have second continuous derivatives in the interval $[0,1]$ .

C2 The kernel function $K (\cdot)$ is symmetrically bounded and has a compact support on the interval $[- 1,1]$ .

C3 $(Y_{i}, X_{i, s_{1}}, X_{i, s_{2}}, i = 1, \dots, n)$ is a strictly stationary α-mixing process, there exists such that $E {| Y_{i} |}^{2 (2 + δ)} < \infty$ , and the mixing coefficient $α (i) = O (i^{- τ})$ , where $τ = (2 + δ) (1 + δ) / δ$ .

C4 The bandwidth satisfies $n h^{1 + 4 / δ} \to \infty$ .

C5 For $t_{0} \in (0,1)$ , the matrix $D (t_{0})$ is full rank.

C6 For $t_{0} \in (0,1)$ , the matrix $Σ (t_{0})$ is positively definite.

Note that C1, C2 and C4 are the conditions that the local polynomial expansion method needs to satisfy, C3 is the condition required for the moment of random variables, and C5 and C6 are requirements to be satisfied by the large sample properties of the estimators. Conditions C1-C4 are similar to those in Robinson [4] and Cai [9] .

Theorem 1. Under Conditions C1-C6, when $h \to 0$ and $n h \to \infty$ , we have

$\sqrt{n h} H (\hat{γ} - γ_{0} - \frac{h^{2}}{2} b) \overset{D}{\to} N (0, Σ_{γ}),$

where $b = b (t_{0}) = (\begin{matrix} Ω_{1}^{- 1} Ω_{2} μ_{2} η^{'} (t_{0}) \\ 0 \end{matrix})$ and $η^{'} (t_{0}) = (\begin{matrix} {α^{″}}_{1} (t_{0}) \\ {β^{″}}_{1} (t_{0}) \end{matrix})$ , $Σ_{γ} = Σ_{γ} (t_{0}) = D^{- 1} Σ D^{- 1}$ .

Therefore, we obtain

$\sqrt{n h} {\hat{ξ} (t_{0}) - ξ (t_{0}) - \frac{h^{2}}{2} Ω_{1}^{- 1} Ω_{2} μ_{2} η^{'} (t_{0})} \overset{D}{\to} N (0, Σ_{ξ}),$

where $ξ_{0} = ξ (t_{0}) = {α_{1}^{T} (t_{0}), α_{2}^{T} (t_{0}), β_{1}^{T} (t_{0}), β_{2}^{T} (t_{0})}^{T}$ and $Σ_{ξ} = Σ_{ξ} (t_{0}) = D_{1}^{- 1} Σ_{1} D_{1}^{- 1}$ .

In practice, the variance-covariance matrix of $(\hat{ξ} (t_{0}) - ξ_{0})$ can be calculated by using the following formula:

${(n h)}^{- 1} D_{n,1}^{- 1} Σ_{n,1} D_{n,1}^{- 1},$ (5)

where

$D_{n,1} = D_{n,1} (t_{0}) = \frac{1}{n - p} \sum_{i = p + 1}^{n} G_{i} G_{i}^{T} K_{h} (t_{i} - t_{0}),$

$Σ_{n,1} = \frac{h}{(n - p) ν_{0}} (\sum_{i = p + 1}^{n} {\hat{ε}}_{i} G_{i} K_{h} (t_{i} - t_{0})) {(\sum_{j = p + 1}^{n} {\hat{ε}}_{j} G_{j} K_{h} (t_{j} - t_{0}))}^{T},$

and ${\hat{ε}}_{i}, i = p + 1, \dots, n$ , are residuals of model (3). The proof of Theorem 1 is provided in the Appendix.

For constant coefficients in the model (3), the corresponding estimators obtained from (4) are not $\sqrt{n}$ consistent. To improve its convergence rate, we use the following formula (6) to obtain a global estimator, that is,

$(\begin{matrix} {\tilde{α}}_{2} \\ {\tilde{β}}_{2} \end{matrix}) = \int_{T} Γ (t_{0}) (\begin{matrix} {\hat{α}}_{2} (t_{0}) \\ {\hat{β}}_{2} (t_{0}) \end{matrix}) d t_{0},$ (6)

where the weight matrix $Γ (t_{0})$ satisfies $\int_{T} Γ (t_{0}) d t_{0} = I_{s \times s}$ , and $I$ is an identity matrix. Typically, we choose $Γ (t_{0})$ to be the standardized inverse covariance matrix [37] of ${\hat{α}}_{2} (t_{0})$ and ${\hat{β}}_{2} (t_{0})$ , which can be obtained from (5). Then, the finite sample variances of the global estimators ${({\tilde{α}}_{2}^{T}, {\tilde{β}}_{2}^{T})}^{T}$ can be calculated by the following formula (7):

$\frac{1}{n - p} \sum_{i = p + 1}^{n} Γ (t_{i}) A D_{n}^{- 1} (t_{i}) Σ_{n}^{- 1} (t_{i}) D_{n}^{- 1} (t_{i}) A^{T} Γ^{T} (t_{i}),$ (7)

where $A$ is a $(s_{2} + p_{2}) \times (p + s + s_{1} + p_{1})$ matrix with $a_{i j}$ as the $(i, j)$ th element of $A$ ; furthermore, $a_{i j} = 1$ for $i = 1, \dots, s_{2} + p_{2}$ , $j = i + s_{1} + p_{1}$ ; otherwise, $a_{i j} = 0$ . Definitions of $D_{n}$ and $Σ_{n}$ can be seen in the Appendix.

2.4. Selection Best Bandwidth and Order

When using estimator (4) to obtain $\hat{γ}$ , bandwidth h and order p ( $p = p_{1} + p_{2}$ ) must be determined. Cheng et al. [26] proposed a Bayesian approach to determine bandwidth selection for local constant estimators of time-varying coefficients, Chen et al. [23] applied the cross-validation (CV) bandwidth selection to estimate the time-varying coefficient heterogeneous autoregressive model. Inspired by the idea of the average mean squared (AMS) criterion proposed by Cai et al. [8] , we slightly modify this criterion to simultaneously select the best bandwidth h and the optimal order p of model (3). The procedure is as follows: First, we divide the sample data into m groups and denote the sample size of the qth group as $n_{q}$ , $q = 1, \dots, m$ . Second, given the values of h and p, and based on data of removing the qth group, we estimate $γ$ by using (4), denoted as ${\hat{γ}}_{q}$ . Third, we calculate the estimated mean square error for sample data of the qth group, denoted as ${AMS}_{q} (h, p)$ , and then we obtain the total mean squared error when q goes from 1 to m. Finally, the optimal h (i.e., $h_{opt}$ ) and p (i.e., $p_{opt}$ ) that minimize the (AMS) error has the following form:

${(h_{opt}, p_{opt})}^{T} = \underset{h, q}{\arg \min} \sum_{q = 1}^{m} {AMS}_{q} (h, p),$

where for $q = 1, \dots, m$ ,

${AMS}_{q} (h, p) = \frac{1}{n_{q} - p} \sum_{i =1 + (q - 1) n_{q} + p}^{1 + q n_{q}} {Y_{i} - {\hat{γ}}^{T} Z_{i}}^{2} .$

Note that in the second step, except for all covariates of $X_{i, s_{1}}$ and $X_{i, s_{2}}$ , we can also try to use subsets of $X_{i, s_{1}}$ and $X_{i, s_{2}}$ in $Z_{i}$ when we estimate $γ$ by using (4). Obviously, different subsets of covariates of $X$ may result in different best combinations of h and p, and we take the combination of h and p corresponding to the minimum value of AMS for all subsets of covariates of interest as the optimal bandwidth and the order of the autoregressive part. Therefore, AMS can also be used to conduct variable selection of model (3).

3. Simulation Studies

We perform simulation studies to examine the finite-sample performances of the proposed estimators and consider the following partial time-varying coefficient regression and autoregressive mixed model:

$Y_{i} = α_{1} (t_{i}) X_{i, 1} + α_{2} X_{i, 2} + β_{1} (t_{i}) Y_{i - 1} + β_{2} Y_{i - 2} + ε_{i}, i = 3, 4, \dots, n,$ (8)

where $α_{1} (t_{i}) = \sin (t_{i})$ , $α_{2} = 0.3$ , $β_{1} (t_{i}) = 0.5 \exp (t_{i})$ , $β_{2} = - 0.1$ ; $X_{i,1}$ follows the autoregressive moving average model with $p = 1$ and $q = 1$ , i.e., ARMA(1,1); and $X_{i,2}$ is an autoregressive (AR) model of order 1, i.e., AR(1). Typically, $X_{i, 1} = 0.1 X_{i - 1, 2} + e_{i, 1} + 0.3 e_{i - 1, 1}$ and $X_{i, 2} = 0.2 X_{i - 1, 2} + e_{i, 2}$ . Both $e_{i,1}$ and $e_{i,2}$ are standard white noise series with mean 0 and variance 1, i.e., $e_{i,1} ~ W N (0,1)$ and $e_{i,2} ~ W N (0,1)$ . Also, $ε_{i} ~ W N (0,0.04)$ and that is independent of both $X_{i,1}$ and $X_{i,2}$ . We run 500 Monte Carlo simulations for 9 grid points at $t_{0} = 0.1, \dots, 0.9$ in the time interval $[0,1]$ and examine the average of 500 simulations of the parameters in model (8) at these 9 points. We apply the Gaussian kernel function with standard error 2 to the estimation of $γ$ . Once $\hat{γ}$ is obtained by (4), then ${\hat{ξ}}_{0}$ is immediately known and its variance estimation can be calculated by (5). For constant parameter vector $α_{2}$ and $β_{2}$ , we obtain their global estimators ${\tilde{α}}_{2}$ and ${\tilde{β}}_{2}$ by using formula (6), and compute the corresponding variance estimation of ${\tilde{α}}_{2}$ and ${\tilde{β}}_{2}$ by (7). For each simulation, we consider the sample size as $n = 300$ and $n = 800$ , respectively. According to simulation results under sample size 300 and 800, we find that estimation results are good when bandwidth is between 0.02 and 0.08. Thus, three bandwidths of h are chosen, that is, $h = 0.03, 0.05$ and 0.07. Finally, we report the following quantities under each h and n for ${\hat{ξ}}_{0}$ in Table 1 & Table 2: Monte Carlo average estimators of ${\hat{ξ}}_{0}$ , Monte Carlo standard deviation of estimates (ESD), Monte Carlo average of estimated standard errors (ASE) and coverage probability of nominal 95% confidence intervals (CP).

Table 1. Estimation results for time-varying coefficient $α_{1} (t) = \sin (t)$ and $β_{1} (t) = 0.5 e^{- t}$ under different scenarios.

Table 1 shows that both estimators of $α_{1} (\cdot)$ and $β_{1} (\cdot)$ are approximate to their true values. The ASEs are close to the ESDs, which demonstrates the good performance of the variance estimation by Theorem 1. 95% confidence interval coverage probabilities are reasonably accurate, matching the nominal level. According to the estimation results from Table 2, biases of ${\tilde{α}}_{2}$ and ${\tilde{β}}_{2}$ can be ignored, ASEs are still approximate to the ESDs and CPs of both estimators are close to 95%, which means good variance estimation of global estimators ${\tilde{α}}_{2}$ and ${\tilde{β}}_{2}$ . Furthermore, when bandwidth h increases from 0.03 to 0.07, as expected, biases of ${\hat{α}}_{1} (\cdot)$ and ${\hat{β}}_{1} (\cdot)$ tend to be larger, and corresponding ASEs and ESDs tend to be smaller. When the sample size n increases from 300 to 800, the ASEs and ESDs of all estimators decrease, which indicates that increasing the sample size can improve our proposed estimators.

Figure 1 & Figure 2 present the true and estimated time-varying coefficients

Table 2. Estimation results for constant coefficients $α_{2} = 0.3$ and $β_{2} = - 0.1$ under different scenarios.

Figure 1. Estimated curves of $α_{1} (t) = \sin (t)$ with bandwidth $h = 0.05$ . (a) Sample size $n = 300$ ; (b) sample size $n = 800$ . The solid lines are the true functions of $α_{1} (t)$ , the dashed lines are the estimates of $α_{1} (t)$ , and the dash-dotted lines are the 95% confidence intervals.

Figure 2. Estimated curves of $β_{1} (t) = 0.5 e^{- t}$ with bandwidth $h = 0.05$ . (a) sample size $n = 300$ ; (b) sample size $n = 800$ . The solid lines are the true functions of $β_{1} (t)$ , the dashed lines are the estimates of $β_{1} (t)$ , and the dash-dotted lines are the 95% confidence intervals.

of model (8) as well as 95% pointwise confidence intervals. It can be intuitively seen that the estimated curves of ${\hat{α}}_{1} (\cdot)$ and ${\hat{β}}_{1} (\cdot)$ are very close to the true curves, which again confirms the good performance of our proposed estimators.

4. Real Data Analysis

In this Section, we use Lake Shasta inflow data [38] to illustrate the application value of our proposed model. The data includes 454 months of measured values for several climatic variables: air temperature (Temp), dew point (DewPt), cloud cover (CldCvr), wind speed (WndSpd), precipitation (Precip), and inflow (Inflow) at Lake Shasta, California, USA. We are interested in building models to predict Lake Shasta inflow based on these climate variables. We treat the lake water inflow (Inflow) as a time series, but it has both autocorrelation and heteroscedasticity through the Box-Ljung autocorrelation test and Lagrange Multiplier test. We let the dependent variable be the inflow of lake water. According to the suggestion by Shumway and Stoffer [38] , it is better to model this variable after logarithmic transformation. Therefore, we set $Y_{i} = \log ({Inflow}_{i}), i = 1, \dots,454$ and then use the proposed partial time-varying coefficient regression and autoregressive mixed model (3) to model the relationship between inflows and the other five climate variables.

Before using (4) to obtain estimators, we need to determine the optimal bandwidth h, the autoregressive order p and which covariates have time-varying coefficient effects as well as which covariates have constant coefficient effects. In other words, we need to conduct variable selection. As we mentioned in Section 2.4, the criterion AMS can not only determine the optimal bandwidth h and the autoregressive order p, but also conduct variable selection. In this analysis, when choosing a combination of covariates, i.e., which variables should be put into $X_{i, s_{1}}$ and which variables are put into $X_{i, s_{2}}$ , as well as which lag terms of the dependent variable are put into $Y_{i, p_{1}}$ and $Y_{i, p_{2}}$ , respectively, one rule we follow is that the vectors $X_{i, s_{1}}$ and $Y_{i, p_{1}}$ cannot be empty simultaneously. Otherwise, model (3) will reduce to the traditional regression and autoregressive mixed model. The autoregressive order p we consider starts at 1 and increases sequentially. A rough criterion for p is that there is no autocorrelation and heteroscedasticity in the residuals of the model. When p is given, we determine the optimal combination of bandwidth h with a combination of covariates according to the value of AMS. If the estimator of a constant effect covariate selected by the AMS is insignificant, then we delete this covariate and calculate AMS again. By using this strategy, we can guarantee that all covariates with constant coefficients selected by the AMS are significant.

The procedure of using the proposed model to analyze real data is as follows: First, we let Temp, DewPt, CldCvr, WndSpd and Precip be five candidate covariates. When calculating AMS, according to the suggestion of Cai et al. [8] , we choose $m = 4$ and $n_{q} = [0.1 n]$ , where [ ] denotes a rounding function, and the optimal bandwidth satisfies $h \propto n^{- 1 / 5}$ . Since the sample size of Shasta Lake inflow data is 454, we set the selection range of bandwidth h from 0.08 to 0.50. Combined with other rules mentioned in the previous paragraph, that is, the $X_{i, s_{1}}$ and $Y_{i, p_{1}}$ cannot be empty simultaneously, the estimation for covariates selected by the model with constant coefficients should be significant. Then, we consider that model (3) contains only 1, 2, 3, 4 and 5 of the candidate covariates and calculate the corresponding values of AMS based on given p and h. Next, by comparing AMS under all various combinations, we find the minimum value of AMS occurring at $h = 0.15$ , $p = 4$ , and the covariate combination is $X_{i, s_{1}} = {({CldCvr}_{i}, {WndSpd}_{i})}^{T}$ , $X_{i, s_{2}} = {Precip}_{i}$ , $Y_{i, p_{1}} = {(\log ({Inflow}_{i - 1}), \log ({Inflow}_{i - 2}), \log ({Inflow}_{i - 3}))}^{T}$ and $Y_{i, p_{2}} = \log ({Inflow}_{i - 4})$ . Thus, $(s_{1}, s_{2}, p_{1}, p_{2}) = (2,1,3,1)$ . Figure 3(a) shows the AMS changes with different h under this combination in model (3).

Therefore, the final model to be estimated is:

$\begin{matrix} Y_{i} = α_{1,1} (t_{i}) {CldCvr}_{i} + α_{1,2} (t_{i}) {WndSpd}_{i} + α_{2,1} {Precip}_{i} + β_{1,1} (t_{i}) Y_{i - 1} \\ + β_{1,2} (t_{i}) Y_{i - 2} + β_{1,3} (t_{i}) Y_{i - 3} + β_{2,1} Y_{i - 4} + ε_{i}, \end{matrix}$ (9)

where $Y_{i} = \log ({Inflow}_{i})$ . Once $X_{i, s_{1}}$ , $X_{i, s_{2}}$ , $Y_{i, p_{1}}$ and $Y_{i, p_{2}}$ are determined, we then use (4) to obtain local estimators and (6) to get global estimators of constant coefficients. To evaluate the prediction performance of model (9), we use the first 451 sample data to estimate the unknown parameters and obtain ${\hat{ε}}_{i} ~ W N (0,0.049)$ . The estimated time-varying coefficients ${\hat{α}}_{1,1} (\cdot)$ , ${\hat{α}}_{1,2} (\cdot)$ , ${\hat{β}}_{1,1} (\cdot)$ , ${\hat{β}}_{1,2} (\cdot)$ and ${\hat{β}}_{1,3} (\cdot)$ are presented in Figure 3(b), and the estimated constant coefficients ${\tilde{α}}_{2,1}$ and ${\tilde{β}}_{2,1}$ are reported in Table 3. Interestingly, considering the time-varying coefficient estimation results, it seems that the prediction

Figure 3. (a) AMS results under different bandwidths h for model (9); (b) time-varying coefficient estimation results of model (9).

Table 3. Constant coefficient estimation results of model (9).

effect of lag order 2 of $Y_{i}$ (i.e., $Y_{i - 2}$ ) increases with time $t_{i}$ while the lag order 1 of $Y_{i}$ (i.e., $Y_{i - 1}$ ) has no such characteristic, and effect of the lag order 3 of $Y_{i}$ (i.e., $Y_{i - 3}$ ) is close to zero. The time-varying effects from both ${CldCvr}_{i}$ and ${WndSpd}_{i}$ decrease with $t_{i}$ .

The last three sample values are used to calculate the relative prediction error (RPE) defined as:

$RPE = | \frac{{\hat{Y}}_{i} - Y_{i}}{Y_{i}} |,$

where ${\hat{Y}}_{i} = \log ({\hat{Inflow}}_{i}) (i = 452,453,454)$ are predicted by model (9).

Table 4 shows the relative prediction error results of model (9) for the three forward steps. It can be seen that the average RPE of model (9) is only 2.6%. For comparison, Table 4 also shows the 3-step forward prediction results of the lake inflow based on the FAR(p) model. According to the AMS criterion of Cai et al. [8] , the optimal fitted FAR(p) model is obtained when $h = 0.17$ , $p = 3$ and $d = 1$ , that is,

$Y_{i} = {\hat{α}}_{0} (Y_{i - 1}) + {\hat{α}}_{1} (Y_{i - 1}) Y_{i - 1} + {\hat{α}}_{2} (Y_{i - 1}) Y_{i - 2} + {\hat{α}}_{3} (Y_{i - 1}) Y_{i - 3},$ (10)

In addition, we also apply classical regression and autoregressive models to analyze the lake water inflow and find that residuals have heteroskedasticity. Therefore, we combine AR (p) model with the autoregressive conditional

Table 4. 3-step forward relative prediction error of $\log ({Inflow}_{t})$ under different models.

heteroskedasticity (ARCH) model, i.e., AR(p)-ARCH(q) model, to fit the residuals. The final optimal AR(p)-ARCH(q) model is:

$\begin{matrix} Y_{i} = 1.335 + 0.460 Y_{i - 1} + 0.159 Y_{i - 2} + 0.415 {CldCvr}_{i} \\ + 0.212 {WndSpd}_{i} + 0.002 {Precip}_{i} + R_{i}, \end{matrix}$ (11)

$R_{i} = - 0.093 R_{i - 1} - 0.088 R_{i - 2} - 0.161 R_{i - 3} + δ_{i},$

$δ_{i} = ϕ_{i} η_{i}, η_{i} ~ W N (0,1),$

$ϕ_{i}^{2} = 0.040 + 0.078 δ_{i}^{2} .$

By comparing the prediction results of the three models in Table 4, our proposed model (9) performs better than the other two models. Compared to model (10), model (9) takes into account the effects of covariates on lake inflow; compared to model (11), model (9) uses the advantages of the time-varying coefficient. Since our proposed model combines the strengths of the FAR(p) model and the classical regression and autoregressive mixed model, it has advantages on modeling time series data with complex correlations between sample components.

5. Conclusions and Future Works

In this article, we propose a partial time-varying coefficient regression and autoregressive mixed model, which can be regarded as an extension of traditional regression and autoregressive mixed model and time-varying coefficient regression model. The proposed model is very flexible and can handle both complex correlations between time series components and effects of other covariates, so it can improve model fitting when building relationships between complex time series and covariates. We apply the local polynomial expansion technique and the least squares estimation method to obtain local estimates of the parameter functions in the model. At the same time, we also propose a global estimator algorithm for the constant coefficients in the model. In addition, we derive the asymptotic normality of the proposed estimators and conduct simulation studies to examine their finite-sample performances, and find our proposed estimators perform well. Finally, we apply the model to analyze the relationships between the water inflow of Shasta Lake and the other five climatic variables.

As we mentioned in Section 1, local polynomial technique is widely used to estimate varying-coefficient models, including time-varying coefficient time series models. One advantage of this method is it has solid theory so that we can derive out explicit biases of corresponding estimators. Furthermore, local polynomial method is easily implemented. Based on the local estimators, we can obtain global estimators for constant coefficients by using formula (6). Note that it is needed to select the best bandwidth when using local polynomial method. However, if we apply other smooth methods such as spline approach to estimate time-varying coefficients in (3), we also need to determine some turning parameters, e.g., knots and degree.

Although we can apply the local polynomial expansion technique and the least squares estimation method to obtain the local estimators as well as corresponding variances, how to test these estimated time-varying coefficients is another important problem. For example, the AMS criterion suggests that model (9) includes five time-varying coefficients, that is, $α_{1,1} (\cdot)$ , $α_{1,2} (\cdot)$ , $β_{1,1} (\cdot)$ , $β_{1,2} (\cdot)$ and $β_{1,3} (\cdot)$ , but Figure 3(b) shows that some estimated curves seem to be flat across time. Thus, it makes us wonder whether those estimated time-varying coefficients can be treated as constant. One way to assess this is to construct the confidence band to see whether a time-varying coefficient is a function or a constant. This method may be realized by obtaining weak Bahadur representations of estimators [29] . Another way is to develop a more objective and theoretical test following the simulation-assisted hypothesis testing procedure proposed in Zhang and Wu [21] . This will be our future research work.

Acknowledgements

Dr. Cao’s research is supported by Natural Science Foundation of Top Talent of SZTU (grant no. GDRC202135).

Appendix: Proof of Theorem 1

We first introduce some notations for the convenience. Denote

$D_{n} = D_{n} (t_{0}) = (\begin{matrix} D_{n,1} & D_{n,2} \\ D_{n,2}^{T} & D_{n,3} \end{matrix}), Σ_{n} = Σ_{n} (t_{0}) = (\begin{matrix} Σ_{n,1} & Σ_{n,2} \\ Σ_{n,2}^{T} & Σ_{n,3} \end{matrix}),$

where

$D_{n, 1} = D_{n, 1} (t_{0}) = \frac{1}{n - p} \sum_{i = p + 1}^{n} G_{i} G_{i}^{T} K_{h} (t_{i} - t_{0}),$

$D_{n, 2} = D_{n, 2} (t_{0}) = \frac{1}{n - p} \sum_{i = p + 1}^{n} G_{i} V_{i}^{T} (\frac{t_{i} - t_{0}}{h}) K_{h} (t_{i} - t_{0}),$

$D_{n, 3} = D_{n, 3} (t_{0}) = \frac{1}{n - p} \sum_{i = p + 1}^{n} V_{i} V_{i}^{T} {(\frac{t_{i} - t_{0}}{h})}^{2} K_{h} (t_{i} - t_{0}),$

$Σ_{n, 1} = Σ_{n, 1} (t_{0}) = \frac{h}{(n - p) ν_{0}} (\sum_{i = p + 1}^{n} {\hat{ε}}_{i} G_{i} K_{h} (t_{i} - t_{0})) (\sum_{j = p + 1}^{n} {\hat{ε}}_{j} G_{j}^{T} K_{h} (t_{j} - t_{0})),$

$Σ_{n, 2} = Σ_{n, 2} (t_{0}) = \frac{h}{(n - p) ν_{0}} (\sum_{i = p + 1}^{n} {\hat{ε}}_{i} G_{i} K_{h} (t_{i} - t_{0})) (\sum_{j = p + 1}^{n} {\hat{ε}}_{j} V_{j}^{T} (\frac{t_{j} - t_{0}}{h}) K_{h} (t_{j} - t_{0})),$

$Σ_{n, 3} = Σ_{n, 3} (t_{0}) = \frac{h}{(n - p) ν_{0}} (\sum_{i = p + 1}^{n} {\hat{ε}}_{i} V_{i} (\frac{t_{i} - t_{0}}{h}) K_{h} (t_{i} - t_{0})) (\sum_{j = p + 1}^{n} {\hat{ε}}_{j} V_{j}^{T} (\frac{t_{j} - t_{0}}{h}) K_{h} (t_{j} - t_{0})) .$

Define

$T_{n} = T_{n} (t_{0}) = (\begin{matrix} T_{n, 0} (t_{0}) \\ T_{n, 1} (t_{0}) \end{matrix}), T_{n}^{*} = T_{n}^{*} (t_{0}) = (\begin{matrix} T_{n, 0}^{*} (t_{0}) \\ T_{n, 1}^{*} (t_{0}) \end{matrix}),$

where

$T_{n, 0} = T_{n, 0} (t_{0}) = \frac{1}{n - p} \sum_{i = p + 1}^{n} G_{i} K_{h} (t_{i} - t_{0}) Y_{i},$

$T_{n, 1} = T_{n, 1} (t_{0}) = \frac{1}{n - p} \sum_{i = p + 1}^{n} V_{i} (\frac{t_{i} - t_{0}}{h}) K_{h} (t_{i} - t_{0}) Y_{i},$

$T_{n, 0}^{*} = T_{n, 0}^{*} (t_{0}) = \frac{1}{n - p} \sum_{i = p + 1}^{n} G_{i} K_{h} (t_{i} - t_{0}) ε_{i},$

$T_{n, 1}^{*} = T_{n, 1}^{*} (t_{0}) = \frac{1}{n - p} \sum_{i = p + 1}^{n} V_{i} (\frac{t_{i} - t_{0}}{h}) K_{h} (t_{i} - t_{0}) ε_{i} .$

Noting the definitions of $T_{n}$ and $T_{n}^{*}$ , it is easy to know

$T_{n,0} - T_{n,0}^{*} = D_{n,1} ξ_{0} + h D_{n,2} η_{0} + \frac{h^{2}}{2} D_{n,1}^{*} {η^{'}}_{0} + o_{p} (h^{2}),$

$T_{n,1} - T_{n,1}^{*} = D_{n,2}^{T} ξ_{0} + h D_{n,3} η_{0} + \frac{h^{2}}{2} D_{n,2}^{*} {η^{'}}_{0} + o_{p} (h^{2}),$

$ξ_{0} = ξ (t_{0}) = {α_{1}^{T} (t_{0}), α_{2}^{T} (t_{0}), β_{1}^{T} (t_{0}), β_{2}^{T} (t_{0})}^{T},$

$η_{0} = η (t_{0}) = {({({α^{'}}_{1} (t_{0}))}^{T}, {({β^{'}}_{1} (t_{0}))}^{T})}^{T},$

${η^{'}}_{0} = η^{'} (t_{0}) = {({({α^{″}}_{1} (t_{0}))}^{T}, {({β^{″}}_{1} (t_{0}))}^{T})}^{T},$

$D_{n, 1}^{*} = D_{n, 1}^{*} (t_{0}) = \frac{1}{n - p} \sum_{i = p + 1}^{n} G_{i} V_{i}^{T} {(\frac{t_{i} - t_{0}}{h})}^{2} K_{h} (t_{i} - t_{0}),$

$D_{n, 2}^{*} = D_{n, 2}^{*} (t_{0}) = \frac{1}{n - p} \sum_{i = p + 1}^{n} V_{i} V_{i}^{T} {(\frac{t_{i} - t_{0}}{h})}^{3} K_{h} (t_{i} - t_{0}) .$

After a simple combination, we can obtain

$T_{n} - T_{n}^{*} = D_{n} H γ_{0} + \frac{h^{2}}{2} (\begin{matrix} D_{n,1}^{*} η^{'} (t_{0}) \\ D_{n,2}^{*} η^{'} (t_{0}) \end{matrix}) + o_{p} ({(t - t_{0})}^{2}) .$ (A.1)

Note that the equivalent form of formula (4) is:

$\hat{γ} = H^{- 1} D_{n}^{- 1} T_{n} .$ (A.2)

By (A.1) and (A.2), it is easy to know

$H (\hat{γ} - γ_{0}) = D_{n}^{- 1} T_{n}^{*} + \frac{h^{2}}{2} D_{n}^{- 1} (\begin{matrix} D_{n,1}^{*} η^{'} (t_{0}) \\ D_{n,2}^{*} η^{'} (t_{0}) \end{matrix}) + o_{p} (h^{2}) .$ (A.3)

Next, we discuss the large sample properties of $T_{n}^{*}$ . To facilitate the derivation, the following notation is introduced: $σ^{2} (t_{0}, g) = Var (Y | t = t_{0}, G = g)$ , $u_{t} = ε_{t} / σ (t, G_{t})$ . For $p + 1 \leq k, l \leq n$ , $t, s \in [0,1]$ , let

$R_{l, k}^{(1)} (t, s) = Cov (G_{l} u_{l} σ (t, G_{l}), G_{k} u_{k} σ (s, G_{k})),$

$R_{l, k}^{(2)} (t, s) = Cov (G_{l} u_{l} σ (t, G_{l}), V_{k} u_{k} σ (s, G_{k})),$

$R_{l, k}^{(3)} (t, s) = Cov (V_{l} u_{l} σ (t, G_{l}), V_{k} u_{k} σ (s, G_{k})) .$

It can be seen from the above that for a stationary process $R_{l, l + k}^{(i)} (t, t) = R_{k}^{(i)} (t), i = 1, 2, 3$ . Using a similar derivation from Cai (2007), we obtain

$\begin{array}{l} n h Var (T_{n, 0}^{*} (t_{0})) \\ = \frac{n h}{{(n - p)}^{2}} \sum_{p + 1 \leq i, j \leq n} Cov (G_{i} u_{i} σ (t_{i}, G_{i}), G_{j} u_{j} σ (t_{j}, G_{j})) K_{h} (t_{i} - t_{0}) K_{h} (t_{j} - t_{0}) \\ \approx \frac{h}{n - p} \sum_{i = p + 1}^{n} E (G_{i} G_{i}^{T} u_{i}^{2} σ^{2} (t_{i}, G_{i})) K_{h}^{2} (t_{i} - t_{0}) \\ + 2 \frac{h}{n - p} \sum_{p + 1 \leq i < j \leq n} Cov (G_{i} u_{i} σ (t_{i}, G_{i}), G_{j} u_{j} σ (t_{j}, G_{j})) K_{h} (t_{i} - t_{0}) K_{h} (t_{j} - t_{0}) \end{array}$

$\begin{array}{l} \approx \frac{h}{n - p} \sum_{i = p + 1}^{n} R_{i, i}^{(1)} (t_{i}, t_{i}) K_{h}^{2} (t_{i} - t_{0}) \\ + \frac{2 h}{n - p} \sum_{p + 1 \leq i < j \leq n} R_{i, j}^{(1)} (t_{i}, t_{j}) K_{h} (t_{i} - t_{0}) K_{h} (t_{j} - t_{0}) \\ \approx ν_{0} R_{0}^{(1)} (t_{0}) + 2 ν_{0} \sum_{k = 1}^{\infty} R_{k}^{(1)} (t_{0}) . \end{array}$ (A.4)

Thus, we have $n h Var (T_{n, 0}^{*} (t_{0})) \to Σ_{1} (t_{0}) = (R_{0}^{(1)} (t_{0}) + 2 \sum_{k = 1}^{\infty} R_{k}^{(1)} (t_{0})) ν_{0}$ . Similarly, we obtain

$n h Var (T_{n, 1}^{*} (t_{0})) \to Σ_{3} (t_{0}) = (R_{0}^{(3)} (t_{0}) + 2 \sum_{k = 1}^{\infty} R_{k}^{(3)} (t_{0})) ν_{0},$

$n h Cov (T_{n, 0}^{*} (t_{0}), T_{n, 1}^{*} (t_{0})) \to Σ_{2} (t_{0}) = (R_{0}^{(2)} (t_{0}) + 2 \sum_{k = 1}^{\infty} R_{k}^{(2)} (t_{0})) ν_{0} .$ (A.5)

Next, we derive the properties of $D_{n} (t_{0})$ . From the approximation of the Riemann summation of definite integrals, it is known that when $0 < t_{0} < 1$ , we have

$D_{n,1} (t_{0}) = \frac{1}{n - p} \sum_{i = p + 1}^{n} G_{i} G_{i}^{T} K_{h} (t_{i} - t_{0}) \approx Ω_{1} \int_{t_{0} / h}^{(1 - t_{0}) / h} K (u) d u \approx Ω_{1} μ_{0},$

$D_{n,2} (t_{0}) = \frac{1}{n - p} \sum_{i = p + 1}^{n} G_{i} V_{i}^{T} (\frac{t_{i} - t_{0}}{h}) K_{h} (t_{i} - t_{0}) \approx Ω_{2} \int_{t_{0} / h}^{(1 - t_{0}) / h} u K (u) d u \approx Ω_{2} μ_{1},$

$D_{n,3} (t_{0}) = \frac{1}{n - p} \sum_{i = p + 1}^{n} V_{i} V_{i}^{T} {(\frac{t_{i} - t_{0}}{h})}^{2} K_{h} (t_{i} - t_{0}) \approx Ω_{3} \int_{t_{0} / h}^{(1 - t_{0}) / h} u^{2} K (u) d u \approx Ω_{3} μ_{2},$

$D_{n,1}^{*} (t_{0}) = \frac{1}{n - p} \sum_{i = p + 1}^{n} G_{i} V_{i}^{T} {(\frac{t_{i} - t_{0}}{h})}^{2} K_{h} (t_{i} - t_{0}) \approx Ω_{2} \int_{t_{0} / h}^{(1 - t_{0}) / h} u^{2} K (u) d u \approx Ω_{2} μ_{2},$

$D_{n,2}^{*} (t_{0}) = \frac{1}{n - p} \sum_{i = p + 1}^{n} V_{i} V_{i}^{T} {(\frac{t_{i} - t_{0}}{h})}^{3} K_{h} (t_{i} - t_{0}) \approx Ω_{3} \int_{t_{0} / h}^{(1 - t_{0}) / h} u^{3} K (u) d u \approx Ω_{3} μ_{3} .$ (A.6)

Since the kernel function $K (\cdot)$ is symmetrical, $μ_{0} = 1$ , $μ_{1} = μ_{3} = 0$ . Therefore,

$B_{n} (t_{0}) = \frac{h^{2}}{2} D_{n}^{- 1} (\begin{matrix} D_{n,1}^{*} η^{'} (t_{0}) \\ D_{n,2}^{*} η^{'} (t_{0}) \end{matrix}) \to \frac{h^{2}}{2} (\begin{matrix} Ω_{1}^{- 1} Ω_{2} μ_{2} η^{'} (t_{0}) \\ 0 \end{matrix}) = b = b (t_{0}),$ (A.7)

when $n \to \infty$ . Combining the results of (A.3)-(A.7), it is not difficult to obtain the conclusion of Theorem 1.

Conflicts of Interest

The authors declare no conflicts of interest.

References

[1]	Hamilton, J.D. (1994) Time Series Analysis. Princeton University Press, Princeton.
[2]	Box, G.E.P., Jenkins, G.M. and Reinsel, G.C. (2008) Time Series Analysis: Forecasting and Control. 4th Edition, John Wiley & Sons, New York. [CrossRef]
[3]	Gujarati, D.N. and Porter, D.C. (2010) Essentials of Econometrics. 4th Edition, McGraw-Hill/Irwin, New York.
[4]	Robinson, P.M. (1989) Nonparametric Estimation of Time-Varying Parameters. In: Hackl, P., Eds., Statistical Analysis and Forecasting of Economic Structural Change, Springer-Verlag, Berlin, 253-264. [CrossRef]
[5]	Robinson, P.M. (1989) Time-Varying Nonlinear Regression. In: Hackl, P. and Westlund, A.H., Eds., Economic Structure Change, Springer-Verlag, Berlin, 179-190. [CrossRef]
[6]	Chen, R. and Tsay, R.S. (1993) Functional-Coefficient Autoregressive Models. Journal of the American Statistical Association, 88, 298-308. [CrossRef]
[7]	Hoover, D.R., Rice, J.A., Wu, C.O. and Yang, L.P. (1998) Nonparametric Smoothing Estimates of Time-Varying Coefficient Models with Longitudinal Data. Biometrika, 85, 809-822. [CrossRef]
[8]	Cai, Z., Fan, J. and Yao, Q. (2000) Functional-Coefficients Regression Models for Nonlinear Time Series. Journal of the American Statistical Association, 95, 941-956. https://www.tandfonline.com/doi/abs/10.1080/01621459.2000.10474284 [CrossRef]
[9]	Cai, Z. (2007) Trending Time-Varying Coefficient Time Series Models with Serially Correlated Errors. Journal of Econometrics, 136, 163-188. [CrossRef]
[10]	Liu, X., Cai, Z. and Chen, R. (2007) Functional Coefficient Seasonal Time Series Models with an Application of Hawaii Tourism Data. Computational Statistics, 30, 719-744. [CrossRef]
[11]	Cai, Z., Chen, L. and Fang, Y. (2015) Semiparametric Estimation of Partially Varying-Coefficient Dynamic Panel Data Models. Econometric Reviews, 34, 695-719. [CrossRef]
[12]	Cai, Z., Fang, Y. and Tian, D. (2018) Assessing Tail Risk Using Expectile Regressions with Partially Varying Coefficients. Journal of Management Science and Engineering, 3, 183-213. [CrossRef]
[13]	Chen, Y., Chua, W.S. and Koch, T. (2018) Forecasting Day-Ahead High-Resolution Natural-Gas Demand and Supply in Germany. Applied Energy, 228, 1091-1100. [CrossRef]
[14]	Tu, Y. and Wang, Y. (2019) Functional Coefficient Cointegration Models Subject to Time-Varying Volatility with an Application to the Purchasing Power Parity. Oxford Bulletin of Economics and Statistics, 86, 1401-1423. [CrossRef]
[15]	Cai, Z., Fang, Y., Lin, M. and Su, J. (2019) Inferences for a Partially Varying Coefficient Model with Endogenous Regressors. Journal of Business & Economic Statistics, 37, 158-170. [CrossRef]
[16]	Yousuf, K. and Ng, S. (2021) Boosting High Dimensional Predictive Regressions with Time Varying Parameters. Journal of Econometrics, 224, 60-87. [CrossRef]
[17]	Kalli, M. and Griffin, J.E. (2014) Time-Varying Sparsity in Dynamic Regression Models. Journal of Econometrics, 178, 779-793. [CrossRef]
[18]	Feldkircher, M., Huber, F. and Kastner, G. (2017) Sophisticated and Small versus Simple and Sizeable: When Does It Pay off to Introduce Drifting Coefficients in Bayesian VARs? Department of Economics Working Paper Series 260, WU Vienna University of Economics and Business.
[19]	Bitto, A. and Fruhwirth-Schnatter, S. (2019) Achieving Shrinkage in a Time-Varying Parameter Model Framework. Journal of Econometrics, 210, 75-97. [CrossRef]
[20]	Li, D., Chen, J. and Lin, Z. (2011) Statistical Inference in Partially Time-Varying Coefficient Models. Journal of Statistical Planning and Inference, 141, 995-1013. [CrossRef]
[21]	Zhang, T. and Wu, W.B. (2012) Inference of Time-Varying Regression Models. Annals of Statistics, 40, 1376-1402. [CrossRef]
[22]	Fan, G.L., Liang, H.Y. and Wang, J.F. (2013) Statistical Inference for Partially Time-Varying Coefficient Errors-in-Variables Models. Journal of Statistical Planning and Inference, 143, 505-519. [CrossRef]
[23]	Chen, X.B., Gao, J., Li, D. and Silvapulle, P. (2018) Nonparametric Estimation and Forecasting for Time-Varying Coefficient Realized Volatility Models. Journal of Business & Economic Statistics, 36, 88-100. [CrossRef]
[24]	Kim, S., Zhao, Z. and Xiao, Z. (2018) Efficient Estimation for Time-Varying coEFficient Logitudinal Models. Journal of Nonparametric Statistics, 30, 680-702. [CrossRef]
[25]	Cuaresma, J.C., Doppelhofer, G., Feldkircher, M. and Huber, F. (2019) Spillovers from US Monetary Policy: Evidence from a Time Varying Parameter Global Vector Autoregressive Model. Journal of the Royal Statistical Society Series A: Statistics in Society, 182, 831-861. [CrossRef]
[26]	Cheng, T., Gao, J. and Zhang, X. (2019) Bayesian Bandwidth Estimation in Nonparametric Time-Varying Coefficient Models. Journal of Business & Economic Statistics, 37, 1-12. [CrossRef]
[27]	Hu, L., Huang, T. and You, J. (2019) Two-Step Estimation of Time-Varying Additive Model for Locally Stationary Time Series. Computational Statistics & Data Analysis, 130, 94-110. [CrossRef]
[28]	Li, D., Phillips, P.C.B. and Gao, J. (2019) Kernel-Based Inference in Time-Varying Coefficient Cointegrating Regression. Journal of Econometrics, 215, 607-632. [CrossRef]
[29]	Karmakar, S., Richter, S. and Wu, W.B. (2022) Simultaneous Inference for Time-Varying Models. Journal of Econometrics, 227, 408-428. [CrossRef]
[30]	Hastie, T. and Tibshirani, R. (1993) Varying-Cofficient Models. Journal of the Royal Statistical Society: Series B (Methodological), 55, 757-796. [CrossRef]
[31]	Ramsay, J.O. and Silverman, B.W. (1997) Functional Data Analysis. Springer-Verlag, New York. [CrossRef]
[32]	Brumback, B.A. and Rice, J.A. (1998) Smoothing Spline Models for the Analysis of Nested and Crossed Samples of Curves. Journal of the American Statistical Association, 93, 961-976. [CrossRef]
[33]	Huang, J., Wu, C.O. and Zhou, L. (2004) Polynomial Spline Estimation and Inference for Varying Coefficient Models with Longitudinal Data. Statistica Sinica, 14, 763-788.
[34]	Aneiros-Perez, G. and Vieu, P. (2008) Nonparametric Time Series Prediction: A Semi-Functional Partial Linear Modeling. Journal of Multivariate Analysis, 99, 834-857. [CrossRef]
[35]	Fan, J. and Gijbels, I. (1996) Local Polynomial Modelling and Its Applications. Chapman & Hall, London.
[36]	De Boor, C (2001) A Practical Guide to Splines. Springer-Verlag, New York.
[37]	Tian, L., Zucker, D. and Wei, L.J. (2005) On the Cox Model with Time-Varying Regression Coefficients. Journal of the American Statistical Association, 100, 172-183. [CrossRef]
[38]	Shumway, R.H. and Stoffer, D. (2017) Time Series Analysis and Its Applications. 4th Edition, Springer-Verlag, New York. [CrossRef]

Journals Menu

Follow SCIRP

	customer@scirp.org
	+86 18163351462(WhatsApp)
	1655362766

	Paper Publishing WeChat

Journals Menu

Home

About SCIRP

Service

Policies