Smoothed Empirical Likelihood Inference for Nonlinear Quantile Regression Models with Missing Response

Honghua Dong; Xiuli Wang

doi:10.4236/ojapps.2023.136074

Open Journal of Applied Sciences > Vol.13 No.6, June 2023

Smoothed Empirical Likelihood Inference for Nonlinear Quantile Regression Models with Missing Response

Honghua Dong, Xiuli Wang^*
School of Mathematics and Statistics, Shandong Normal University, Jinan, China.
DOI: 10.4236/ojapps.2023.136074 PDF HTML XML 78 Downloads 292 Views

Abstract

In this paper, three smoothed empirical log-likelihood ratio functions for the parameters of nonlinear models with missing response are suggested. Under some regular conditions, the corresponding Wilks phenomena are obtained and the confidence regions for the parameter can be constructed easily.

Keywords

Nonlinear Model, Quantile Regression, Smoothed Empirical Likelihood, Missing at Random

Share and Cite:

Dong, H. and Wang, X. (2023) Smoothed Empirical Likelihood Inference for Nonlinear Quantile Regression Models with Missing Response. Open Journal of Applied Sciences, 13, 921-933. doi: 10.4236/ojapps.2023.136074.

1. Introduction

Quantile regression (QR) proposed by Koenker and Bassett [1] has become a popular alternative to least squares method for providing comprehensive description of the response distribution and robustness against heavy-tailed error distributions. Because of these significant advantages, QR has become an effective method for statistical research. There are many literatures on the estimation of quantile regression models; among them, Koenker [2] is a monograph worth studying. As for some papers, see for example Kim [3] , Cai and Xu [4] , Wu et al. [5] , Cai and Xiao [6] and among others.

In recent years, quantile regression with missing data has attracted scholars’ considerable attention. There are several methods, such as complete-case (CC) analysis method, inverse probability weighted method (IPW) and imputation method to handle the missing data. For example, Wei et al. [7] proposed a multiple imputation estimator for parameter estimation in linear QR with missing covariates. Sherwood et al. [8] suggested the inverse probability weighted (IPW) method for linear QR when the covariates are missing at random. Chen et al. [9] also examined the estimation of linear QR model based on nonparametric inverse probability weighted, estimating equations projection, and a combination of both when observations are missing at random. Sherwood [10] investigated the variable selection for the additive partially linear quantile regression with missing covariates. Zhao et al. [11] studied several IPW estimators for parameters in QR when covariates or responses are subject to missing not at random.

It is well known that empirical likelihood (EL) method, introduced by Owen [12] [13] , has many advantages in constructing confidence intervals. For example, it does not need to construct a pivot quantity, and the confidence regions shape and orientation are determined entirely by the data itself. Some scholars have used this method to QR, and some good theoretical results have been obtained under this framework. See for example, Chen and Hall [14] , Wang and Zhu [15] , Tang and Leng [16] , Zhao et al. [17] , Zhao and Tang [18] , Luo and Pang [19] , Zhao and Zhou [20] , Zhou et al. [21] . However, estimation equations based on quantile regression models are not differentiable at parameter points, such that EL method fails to achieve the higher order accuracy. To achieve the higher-order asymptotic refinements, Whang [22] proposed to smooth the estimating equations for the empirical likelihood under the linear QR models. Later, Lv and Li [23] proposed the smoothed empirical likelihood (SEL) for partially linear quantile regression models with missing response, and the SEL statistics for the parameters and the nonparametric part were defined, and the asymptotic Chi-squared distributions were shown. Recently, for the linear QR models with missing response at random, Luo et al. [24] proposed three SEL ratios for the regression parameter, and the asymptotical distributions were shown to be standard $χ^{2}$ distribution under some conditions. Linear quantile regression models offer a flexible approach in many applications. It is also of considerable interest to investigate nonlinear quantile models. As far as we know, there is little work done for nonlinear quantile models with missing responses at random. Just like mentioned in Koenker [2] and other literature about nonlinear models, the computation of the entire process in the nonlinear case is considerably more challenging than the linear case where the computation task is quite easy. So it is not directly to extend the work of Whang [22] and Luo et al. [24] to nonlinear quantile models because of the complexity of nonlinear models with missing responses at random. Therefore, the main purpose of this paper is to develop the smoothed EL inferences on $β$ with missing responses at random. The rest of this paper is organized as follows. In Section 2, the smoothed empirical likelihood ratios for the parametric vector are proposed, and the asymptotic properties of the proposed empirical log-likelihood ratios are investigated in Section 3. Section 4 is the proofs of the main results. Conclusions are given in Section 5.

2. Methodology

In this paper, we consider the nonlinear quantile regression model

$Y_{i} = f (X_{i}, β_{τ}) + ε_{i}, i = 1, 2, \dots, n,$ (2.1)

where $X_{i}$ is a d-dimensional covariate and $Y_{i}$ is a response variable, $β_{τ}$ is a p-dimensional parameter vector, and $ε_{i}$ is an independent and identically distributed random variable, which satisfies $P (ε_{i} \leq 0 | X_{i}) = τ$ , where the quantile level $τ \in (0,1)$ . For simplicity, we write $β_{τ}$ as $β$ throughout this paper. For model (2.1), we focus on the case where all values of $X_{i}$ are completely observed, but some values of response $Y_{i}$ are missing. That is, we have the incomplete observations $(δ_{i}, X_{i}, Y_{i}), i = 1, 2, \dots, n$ from model (2.1), where $δ_{i}$ is an indicator variable, when $Y_{i}$ can be observed, then $δ_{i} = 1$ , and when $Y_{i}$ is missing with $δ_{i} = 0$ . Throughout this paper, we assume that $Y_{i}$ is missing at random (MAR), and the MAR assumption satisfies

$P {δ_{i} = 1 | Y_{i}, X_{i}} = P {δ_{i} = 1 | X_{i}} = π (X_{i}) .$ (2.2)

The formula (2.2) implies that $δ_{i}$ is conditionally independent with $Y_{i}$ for given $X_{i}$ , and $π (X_{i})$ is called the propensity score or selection probability function.

2.1. Smoothed Quantile Empirical Likelihood with Complete-Case Data

In Koenker [2] , if $(X_{i}, Y_{i}), i = 1, 2, \dots, n$ can be observed, the quantile estimator $\hat{β}$ of the parameter $β$ in model (2.1) is obtained by minimizing the following objective function

$\sum_{i = 1}^{n} ρ_{τ} (Y_{i} - f (X_{i}, β)),$ (2.3)

where $ρ_{τ} (u) = u (τ - I (u < 0))$ is the quantile loss function and $I (\cdot)$ is the indicator function. $\hat{β}$ can be obtained by solving the following equation, which is the optimal condition corresponding to (2.3), i.e.

$\sum_{i = 1}^{n} \nabla f_{i} ψ (Y_{i}, X_{i}, β) = 0,$ (2.4)

where $\nabla f_{i} = [\partial f (X_{i}, β) / \partial β]$ , $ψ (Y_{i}, X_{i}, β) = I (f (X_{i}, β) - Y_{i} \geq 0) - τ$ is the quantile score function, and $E (ψ (Y_{i}, X_{i}, β)) = 0$ if $β$ is the true value.

Since some values of response $Y_{i}$ from model (2.1) are missing, with MAR assumption we can prove that

$E (δ_{i} \nabla f_{i} ψ (Y_{i}, X_{i}, β)) = 0.$ (2.5)

So based on complete-case data, the quantile estimator ${\hat{β}}_{Q}$ of the parameter $β$ is the solution of

$\sum_{i = 1}^{n} δ_{i} \nabla f_{i} ψ (Y_{i}, X_{i}, β) = 0.$ (2.6)

As pointed out by Whang [22] , the function $ψ (Y_{i}, X_{i}, β)$ in (2.6) is not differentiable at point $β$ . This will cause some difficulties in higher-order asymptotic analysis, since most of the empirical likelihood-based research is based on a smooth function of sample moments. Then following Whang [22] , let $K (\cdot)$ denote a bounded kernel function that is compactly supported on $[- 1,1]$ and integrated to one, define $G (x) = \int_{u < x} K (u) d u$ , $G_{h} (x) = G (x / h)$ , where h is a positive bandwidth, then a smooth function of $ψ$ is defined as

$ψ_{h} (Y_{i}, X_{i}, β) = G_{h} (f (X_{i}, β) - Y_{i}) - τ .$ (2.7)

It can be proved that $ψ_{h} (Y_{i}, X_{i}, β)$ is asymptotically unbiased.

Introducing the auxiliary random vector

$η_{i, C} (β) = δ_{i} \nabla f_{i} ψ_{h} (Y_{i}, X_{i}, β) .$ (2.8)

According to the above discussion, we know that $E [η_{i, C} (β)] = 0$ , so the smoothed empirical log-likelihood ratio function of parameter $β$ with complete-case data can be defined as

$R_{C} (β) = - 2 \max {\sum_{i = 1}^{n} \log (n p_{i}) | p_{i} \geq 0, \sum_{i = 1}^{n} p_{i} = 1, \sum_{i = 1}^{n} p_{i} η_{i, C} (β) = 0},$ (2.9)

where log(.) is the logarithmic fuction based on e. If zero is inside the convex hull of $(η_{1, C} (β), \dots, η_{n, C} (β))$ , then a unique value for $R_{C} (β)$ exists. Using the Lagrange multiplier method and some simple calculations, $R_{C} (β)$ can be written as

$R_{C} (β) = 2 \sum_{i = 1}^{n} \log (1 + λ^{T} η_{i, C} (β)),$ (2.10)

where $λ$ is a Lagrange multiplier which is determined by

$\sum_{i = 1}^{n} \frac{η_{i, C} (β)}{1 + λ^{T} η_{i, C} (β)} = 0.$ (2.11)

2.2. Smoothed Weighted Quantile Empirical Likelihood

Similar to Section 2.1, we introduce the following auxiliary random vector

$η_{i, W} (β) = \frac{δ_{i}}{π (X_{i})} \nabla f_{i} ψ_{h} (Y_{i}, X_{i}, β) .$ (2.12)

Using the MAR assumption, we can prove that $E (η_{i, W} (β)) = 0$ if $β$ is the true value, thus the smoothed weighted quantile empirical log-likelihood ratio function for $β$ can be defined accordingly.

Since (2.12) contains an unknown function $π (X_{i})$ which needs to be estimated first. We can use a kernel smoothing method to estimate. Specifically, $π (X_{i})$ can be defined by

$\hat{π} (X_{i}) = \frac{\sum_{j = 1}^{n} L_{a} (X_{j} - X_{i}) δ_{j}}{\sum_{j = 1}^{n} L_{a} (X_{j} - X_{i})},$ (2.13)

where $L_{a} (\cdot) = L (\cdot / a) / a^{d}$ is the d-dimensional kernel function, and a is the bandwidth.

When the dimension of the covariate X is very high, the nonparametric estimation will encounter with the curse of dimensionality. In this case, a parametric approach might be more feasible for the estimation of $π (X_{i})$ . A commonly used model is the logistic regression given by

$π (X_{i}, γ) = \frac{\exp (γ_{0} + X_{i}^{T} γ_{1})}{1 + \exp (γ_{0} + X_{i}^{T} γ_{1})} = \frac{\exp (Γ_{i}^{T} γ)}{1 + \exp (Γ_{i}^{T} γ)},$ (2.14)

where $Γ_{i} = {(1, X_{i}^{T})}^{T}$ , $γ = {(γ_{0}, γ_{1}^{T})}^{T} \in Θ$ is $(d + 1)$ -dimensional unknown parameter vector. Here $γ$ can be estimated by maximizing the log-likelihood function

$L (γ) = \sum_{i = 1}^{n} {δ_{i} \log π (X_{i}, γ) + (1 - δ_{i}) \log (1 - π (X_{i}, γ))} .$ (2.15)

Let $\hat{γ}$ be the maximum likelihood estimation of $γ$ , then the parameter estimator of $π (X_{i})$ can be written as $π (X_{i}, \hat{γ})$ . If the parametric model for $π (\cdot)$ is correctly specified, the inverse probability weighted method is consistent and feasible.

For convenience, we use $\hat{π} (X_{i})$ to represent the estimation of $π (X_{i})$ , it can be the estimator estimated by the parameter method or by nonparametric method. Denote

${\hat{η}}_{i, W} (β) = \frac{δ_{i}}{\hat{π} (X_{i})} \nabla f_{i} ψ_{h} (Y_{i}, X_{i}, β),$ (2.16)

the smoothed weighted quantile empirical log-likelihood ratio function of parameter $β$ is

$R_{W} (β) = - 2 \max {\sum_{i = 1}^{n} \log (n p_{i}) | p_{i} \geq 0, \sum_{i = 1}^{n} p_{i} = 1, \sum_{i = 1}^{n} p_{i} {\hat{η}}_{i, W} (β) = 0} .$ (2.17)

2.3. Smoothed Imputed Quantile Empirical Likelihood

From above discussion, we can see neither approach makes full use of the information contained in the data. As pointed out in Xue [25] , discarding the missing data may lead to incorrect conclusion when there are a lot of missing values in the considered data set. To resolve the issue, we first use the nonlinear quantile imputation to impute $Y_{i}$ by $f (X_{i}, {\hat{β}}_{Q})$ , with ${\hat{β}}_{Q}$ obtained by (2.6), this kind of imputation is also used by Zhao and Tang [18] and Zhou et al. [21] . With the imputed value in hand, and then using the inverse probability weighted technique, we define the final imputed value by

${\hat{Y}}_{i} = \frac{δ_{i}}{\hat{π} (X_{i})} Y_{i} + (1 - \frac{δ_{i}}{\hat{π} (X_{i})}) f (X_{i}, {\hat{β}}_{Q}),$ (2.18)

where $\hat{π} (X_{i})$ is given in Section 2.2. Then the imputation based auxiliary random vector is

${\hat{η}}_{i, I} (β) = \nabla f_{i} ψ_{h} ({\hat{Y}}_{i}, X_{i}, β),$ (2.19)

Accordingly, the smoothed imputed quantile empirical log-likelihood ratio function for $β$ is defined as

$R_{I} (β) = - 2 \max {\sum_{i = 1}^{n} \log (n p_{i}) | p_{i} \geq 0, \sum_{i = 1}^{n} p_{i} = 1, \sum_{i = 1}^{n} p_{i} {\hat{η}}_{i, I} (β) = 0} .$ (2.20)

The ratio is more appropriate, because it sufficiently uses the information contained in the data.

3. Asymptotic Properties

In this section, we will give the asymptotic distributions for the three smoothed quantile empirical log-likelihood ratios proposed in Section 2.1-2.3. Firstly, we give some symbols and assumptions that needed in proof.

Assuming the probability density function of X is $p (x)$ , let $g (\cdot | x)$ and $F (\cdot | x)$ be the density and distribution function of error $ε$ conditional on $X = x$ , respectively.

C₁. ${Y_{i}, X_{i}}, i = 1, \dots, n$ , are independent and identically distributed random vectors.

C₂. Both $π (x)$ and $p (x)$ have bounded derivatives up to order r almost surely, and $\inf_{x} π (x) > 0$ .

C₃. $L (x)$ is a kernel function of order r, and there is a constant $C_{1}, C_{2}, ρ$ , such that $C_{1} I (‖ x ‖ \leq ρ) \leq L (x) \leq C_{2} I (‖ x ‖ \leq ρ)$ .

C₄. The kernel function $K (x)$ has bounded and compactly supported on $[- 1,1]$ , and for a constant $C_{K} \neq 0$ , it satisfies

$\begin{array}{l} \int u^{j} K (u) d u = {\begin{array}{l} 1, & j = 0, \\ 0, & 1 \leq j \leq r - 1, \\ C_{K}, & j = r . \end{array} \end{array}$

C₅. For $L \geq 1$ , let $\tilde{G} (u) = {(G (u), G^{2} (u), \dots, G^{L + 1} (u))}^{T}$ , where $G (u) = \int_{v < u} K (v) d v$ . For any $θ \in R^{L + 1}$ satisfying $‖ θ ‖ = 1$ , there is a partition on $[- 1,1]$ : $- 1 = b_{0} < b_{1} < \dots < b_{L + 1} = 1$ , such that $θ^{T} \tilde{G} (u)$ is either strictly positive or strictly negative on $(b_{l - 1}, b_{l}), l = 1, 2, \dots, L + 1$ .

C₆. The bandwidth h satisfies $n h^{2 r} \to 0, n h / \log (n) \to \infty$ as $n \to \infty$ .

C₇. The matrices $A_{i}, i = 1, 2$ and $B_{i}, i = 1, 2, 3$ defined in Lemma 3 of Section 4 are non-singular.

C₈. $P (‖ X_{n} ‖ > M_{n}) = o (n^{- 1 / 2})$ , where $0 < M_{n} \to \infty$ .

C₉. The bandwidth a satisfies $n a^{2 d} M_{n}^{- 2 d} \to 0, n a^{4 r} \to 0$ .

C₁₀. The maximum likelihood estimation $\hat{γ}$ of $γ$ is $\sqrt{n}$ -consistent and satisfies the regularity condition of asymptotic normality.

The following Theorem states the asymptotic distribution of $R (β)$ .

Theorem 1. Suppose that conditions in C₁-C₁₀ hold, and $β$ is the true value of the parameter, then

$R (β) \overset{D}{\to} χ_{d}^{2},$

where $R (β)$ can be $R_{C} (β), R_{W} (β), R_{I} (β)$ , $χ_{d}^{2}$ is a chi-square distribution with d degrees of freedom, and $\overset{D}{\to}$ represents convergence in distribution.

According to the above theorem, the confidence region of the parameter $β$ can be constructed. More specifically, for a given $α$ with $0 < α < 1$ , let $χ_{d}^{2} (1 - α)$ satisfies $P (χ_{d}^{2} \leq χ_{d}^{2} (1 - α)) = 1 - α$ , then the approximate $(1 - α)$ confidence region for $β$ can be defined as

$C_{α} (β) = {β | R (β) \leq χ_{d}^{2} (1 - α)} .$ (3.1)

4. Proofs

Before giving the proof of the main theorem, some lemmas are useful for proving the main theorem.

Lemma 1 Suppose conditions C₂, C₃ and C₈ hold, then

$E {[\hat{π} (X_{i}) - π (X_{i})]}^{2} = O ({(n a^{d})}^{- 1} M_{n}^{d}) + O (a^{2 r}) + o (n^{- 1 / 2}) .$

Lemma 1 is the Lemma 2 in Xue [25] , so the proof is omitted.

Lemma 2 Suppose that conditions C₂, C₁₀ hold and the $π (x) = π (x, γ)$ is correctly specified, then

$\hat{γ} - γ = O_{p} (n^{- 1 / 2}), \max_{1 \leq i \leq n} | π (x, \hat{γ}) - π (x, γ) | = o_{p} (1),$

where $γ$ is the true value.

Lemma 2 is the lemma A.2 of Tang and Zhao [26] , please see the proof details in Tang and Zhao [26] .

Lemma 3. Suppose conditions C₁-C₁₀ hold, then as $n \to \infty$ , we have

1) $E [\frac{\partial η_{i} (β)}{\partial β}] = A + o (1);$

2) $E [η_{i} (β) η_{i}^{T} (β)] = B + o (1),$

When $η_{i} (β) = η_{i, C} (β)$ , $A = A_{1} = E [π (X) g (0 | X) \nabla f_{1} {(\nabla f_{1})}^{T}]$ , $B = B_{1} = τ (1 - τ) E [π (X) \nabla f_{1} {(\nabla f_{1})}^{T}]$ ;

When $η_{i} (β) = {\hat{η}}_{i, W} (β)$ , $A = A_{2} = E [g (0 | X) \nabla f_{1} {(\nabla f_{1})}^{T}]$ , $B = B_{2} = τ (1 - τ) E [\frac{1}{π (X)} \nabla f_{1} {(\nabla f_{1})}^{T}]$ ;

When $η_{i} (β) = {\hat{η}}_{i, I} (β)$ , $A = A_{2} = E [g (0 | X) \nabla f_{1} {(\nabla f_{1})}^{T}]$ , $B = B_{3} = τ (1 - τ) E [\nabla f_{1} {(\nabla f_{1})}^{T}]$ .

Proof a) We first prove the lemma when $η_{i} (β) = η_{i, C} (β)$ .

For result 1), By a change of variable, we have

$η_{i, C} (β) = δ_{i} \nabla f_{i} \int [F (- h u | X_{i}) - F (0 | X_{i})] K (u) d u,$

and then

$\begin{matrix} E [\frac{\partial η_{i, C} (β)}{\partial β^{T}}] = E [π (X_{i}) g (0 | X_{i}) \nabla f_{i} \nabla f_{i}^{T}] \\ + E [π (X_{i}) \nabla f_{i} \nabla f_{i}^{T} \int [g (- h u | X_{i}) - g (0 | X_{i})] K (u) d u] . \end{matrix}$ (4.1)

Obviously, the first term on the right-hand of Equation (4.1) is $A_{1} = E [π (X_{i}) g (0| X_{i}) \nabla f_{i} \nabla f_{i}^{T}]$ , and then applying Taylor expansion to the second term on the right-hand of Equation (4.1) can obtain result 1).

For result 2), noticing that

$\begin{array}{l} E [η_{i, C} (β) η_{i, C}^{T} (β)] \\ = τ (1 - τ) E [π (X_{i}) \nabla f_{i} \nabla f_{i}^{T}] \\ + 2 E {π (X_{i}) \nabla f_{i} \nabla f_{i}^{T} \int [F (- h u | X_{i}) - F (0| X_{i})] [G (u) - τ] K (u) d u} . \end{array}$ (4.2)

Obviously, the first term on the right-hand of Equation (4.2) is $B_{1} = τ (1 - τ) E [π (X_{i}) \nabla f_{i} \nabla f_{i}^{T}]$ , and again applying Taylor expansion to the second term on the right-hand of Equation (4.2) can obtain result 2).

b) When $η_{i} (β)= {\hat{η}}_{i, W} (β)$ , by direct calculation, we can derive

$\begin{matrix} {\hat{η}}_{i, W} (β) = \frac{δ_{i}}{\hat{π} (X_{i})} \nabla f_{i} [G_{h} (f (X_{i}, β) - Y_{i}) - τ] \\ = \frac{δ_{i}}{π (X_{i})} \nabla f_{i} [G_{h} (f (X_{i}, β) - Y_{i}) - τ] \\ + \frac{π (X_{i}) - \hat{π} (X_{i})}{\hat{π} (X_{i}) π (X_{i})} δ_{i} \nabla f_{i} [G_{h} (f (X_{i}, β) - Y_{i}) - τ] \\ = η_{i, W} (β) + \frac{π (X_{i}) - \hat{π} (X_{i})}{\hat{π} (X_{i}) π (X_{i})} δ_{i} \nabla f_{i} [G_{h} (f (X_{i}, β) - Y_{i}) - τ] . \end{matrix}$ (4.3)

In addition, we can prove

$\frac{1}{\sqrt{n}} \sum_{i = 1}^{n} \frac{δ_{i}}{\hat{π} (X_{i}) π (X_{i})} \nabla f_{i} [G_{h} (f (X_{i}, β) - Y_{i}) - τ] = O_{p} (1) .$ (4.4)

According to Lemma 1 and Lemma 2, we can obtain that $\sup_{x} | \hat{π} (x) - π (x) | = o_{p} (1)$ for both nonparametric estimator and parameter estimator of $π (x)$ . Using this result and (4.4), we can derive

${\hat{η}}_{i, W} (β) = \frac{δ_{i}}{π (X_{i})} \nabla f_{i} [G_{h} (f (X_{i}, β) - Y_{i}) - τ] + o_{p} (1) .$ (4.5)

Further derivation leads to

$\begin{matrix} {\hat{η}}_{i, W} (β) = \frac{δ_{i}}{π (X_{i})} \nabla f_{i} [G (- ε_{i} / h) - F (0 | X_{i})] + o_{p} (1) \\ = \frac{δ_{i}}{π (X_{i})} \nabla f_{i} \int [F (- h u | X_{i}) - F (0 | X_{i})] K (u) d u + o_{p} (1) . \end{matrix}$ (4.6)

Similar to the proof in the case of a), it can be seen that

$\begin{matrix} E [\frac{\partial {\hat{η}}_{i, W} (β)}{\partial β^{T}}] = E [g (0 | X_{i}) \nabla f_{i} \nabla f_{i}^{T}] \\ + E {\nabla f_{i} \nabla f_{i}^{T} \int [g (- h u | X_{i}) - g (0 | X_{i})] K (u) d u} . \end{matrix}$ (4.7)

Obviously, the first term on the right-hand of the Equation (4.7) is $A_{2} = E [g (0| X_{i}) \nabla f_{i} \nabla f_{i}^{T}]$ , and using Taylor expansion to the second term on the right hand of (4.7), and result 1) is proved. Similarly, result 2) is also obtained.

c) When $η_{i} (β) = {\hat{η}}_{i, I} (β)$ , direct calculation yield

$\begin{array}{l} f (X_{i}, β) - {\hat{Y}}_{i} \\ = (1 - \frac{δ_{i}}{\hat{π} (X_{i})}) [f (X_{i}, β) - f (X_{i}, {\hat{β}}_{Q})] - \frac{δ_{i}}{\hat{π} (X_{i})} [Y_{i} - f (X_{i}, β)] \\ = (1 - \frac{δ_{i}}{\hat{π} (X_{i})}) [f (X_{i}, β) - f (X_{i}, {\hat{β}}_{Q})] - \frac{π (X_{i}) - \hat{π} (X_{i})}{\hat{π} (X_{i}) π (X_{i})} δ_{i} ε_{i} - \frac{δ_{i}}{π (X_{i})} ε_{i} . \end{array}$ (4.8)

Using Taylor expansion to $f (X_{i}, {\hat{β}}_{Q})$ at $β$ , we have

$\begin{matrix} f (X_{i}, β) - {\hat{Y}}_{i} = (1 - \frac{δ_{i}}{\hat{π} (X_{i})}) \nabla f_{i}^{T} ({\hat{β}}_{Q} - β) \\ - \frac{π (X_{i}) - \hat{π} (X_{i})}{\hat{π} (X_{i}) π (X_{i})} δ_{i} ε_{i} - \frac{δ_{i}}{π (X_{i})} ε_{i} + o_{p} (1) . \end{matrix}$

Noticing that $E (\frac{1}{n} \sum_{i = 1}^{n} (1 - \frac{δ_{i}}{π (X_{i})}) \nabla f_{i}^{T}) = 0$ and combined with ${\hat{β}}_{Q} - β = O_{p} (n^{- 1 / 2})$ , we can derive

$\frac{1}{\sqrt{n}} \sum_{i = 1}^{n} (1 - \frac{δ_{i}}{\hat{π} (X_{i})}) \nabla f_{i}^{T} ({\hat{β}}_{Q} - β) = o_{p} (1) .$

Using $\sup_{x} | \hat{π} (x) - π (x) | = o_{p} (1)$ and $\frac{1}{\sqrt{n}} \sum_{i = 1}^{n} \frac{δ_{i}}{\hat{π} (X_{i}) π (X_{i})} ε_{i} = O_{p} (1)$ , then we get

$f (X_{i}, β) - {\hat{Y}}_{i} = - \frac{δ_{i}}{π (X_{i})} (Y_{i} - f (X_{i}, β)) + o_{p} (1) .$ (4.9)

So we obtain

$\begin{matrix} {\hat{η}}_{i, I} (β) = \nabla f_{i} [G_{h} (f (X_{i}, β) - {\hat{Y}}_{i}) - τ] \\ = \nabla f_{i} {G_{h} (\frac{δ_{i}}{π (X_{i})} (f (X_{i}, β) - Y_{i})) - τ} + o_{p} (1) \\ = \nabla f_{i} [G_{h} (\frac{δ_{i}}{π (X_{i}) h} ε_{i}) - F (0 | X_{i})] + o_{p} (1) \\ = \nabla f_{i} \int {F (- \frac{h π (X_{i})}{δ_{i}} u | X_{i}) - F (0 | X_{i})} K (u) d u + o_{p} (1), \end{matrix}$ (4.10)

and then

$\begin{matrix} E [\frac{\partial {\hat{η}}_{i, I} (β)}{\partial β^{T}}] = E [g (0 | X_{i}) \nabla f_{i} \nabla f_{i}^{T}] \\ + E {\nabla f_{i} \nabla f_{i}^{T} \int [g (- \frac{h π (X_{i})}{δ_{i}} u | X_{i}) - g (0 | X_{i})] K (u) d u} . \end{matrix}$ (4.11)

Similar to the proof in case a), it is obvious that the first term on the right-hand of Equation (4.11) is $A_{2} = E [g (0| X_{i}) \nabla f_{i} \nabla f_{i}^{T}]$ . Similarly, we can obtain result 2) with $B_{3} = τ (1 - τ) E [\nabla f_{i} \nabla f_{i}^{T}]$ .

Lemma 4 Suppose conditions C₁-C₁₀ hold, then as $n \to \infty$ ,

1) $\frac{1}{n} \sum_{i = 1}^{n} η_{i} (β) = O (d_{n});$

2) $\frac{1}{n} \sum_{i = 1}^{n} η_{i} (β) η_{i}^{T} (β) = B + o (1),$

uniformly in $β \in Θ_{n} \equiv {β : ‖ β - β_{0} ‖ \leq d_{n}}$ , with $d_{n} = n^{- 1 / 3 - ζ}, 0 < ζ < 1 / 6$ , where $B = B_{1}$ when $η_{i} (β) = η_{i, C} (β)$ ; $B = B_{2}$ when $η_{i} (β) = {\hat{η}}_{i, W} (β)$ ; $B = B_{3}$ when $η_{i} (β) = {\hat{η}}_{i, I} (β)$ , and $B_{1}, B_{2}$ , and $B_{3}$ are defined in Lemma 3.

Proof: By Taylor expansion, we derive

$\frac{1}{n} \sum_{i = 1}^{n} η_{i} (β) = \frac{1}{n} \sum_{i = 1}^{n} [η_{i} (β_{0}) - E (η_{i} (β_{0}))] + E (η_{i} (β_{0})) + R_{n c} (β),$ (4.12)

where $R_{n c} (β) = \frac{1}{n} \sum_{i = 1}^{n} \frac{\partial η_{i} (β^{*})}{\partial β} (β - β_{0})$ , and $β^{*}$ lies between $β$ and $β_{0}$ .

a) This lemma will be proved first for the case of $η_{i} (β) = η_{i, C} (β)$ .

Similar to the proof of Lemma 2 of Whang [22] , using Cauchy-Schwartz inequality, triangle inequality and arguments similar to the proof of Lemma 3, we have

$\sup_{β \in Θ_{n}} ‖ R_{n c} (β) ‖ = O (d_{n}) a . s .$ (4.13)

with $R_{n c} (β) = \frac{1}{n} \sum_{i = 1}^{n} \frac{\partial η_{i, C} (β^{*})}{\partial β} (β - β_{0})$ . Therefore, according to (4.12), (4.13), law of iterated logarithm, Lemma 3 and condition $C_{6}$ , it holds that

$\sup_{β \in Θ_{n}} ‖ \frac{1}{n} \sum_{i = 1}^{n} η_{i, C} (β) ‖ = O (n^{- 1 / 2} {(\log (\log n))}^{1 / 2}) + O (h^{r}) + O (d_{n}) = O (d_{n}) a . s .$

The proof for the first result is completed. The second result can be proved in a similar way, here we omit the details.

b) When $η_{i} (β) = {\hat{η}}_{i, W} (β), {\hat{η}}_{i, I} (β)$ , according to the Equations (4.5) and (4.10) respectively, and by the similar arguments for $η_{i} (β) = η_{i, C} (β)$ , we can derive the two results, here we omit the details.

Proof of Theorem 1 By the Lagrange multiplier method, $R (β)$ can be represented as

$R (β) = 2 \sum_{i = 1}^{n} \log (1 + λ^{T} η_{i} (β)),$ (4.14)

where $λ$ is the solution of the following equation

$\sum_{i = 1}^{n} \frac{η_{i} (β)}{1 + λ^{T} η_{i} (β)} = 0.$ (4.15)

Similar to the proof in Owen [13] , we can prove that

$λ = {(\sum_{i = 1}^{n} η_{i} (β) η_{i}^{T} (β))}^{- 1} \sum_{i = 1}^{n} η_{i} (β) + o_{p} (n^{- 1 / 2} + h^{r}) = O_{p} (n^{- 1 / 2} + h^{r}) .$ (4.16)

Using Taylor expansion for (4.14), combined with Lemma 4 and (4.16), we obtain

$\begin{matrix} R (β) = 2 \sum_{i = 1}^{n} [λ^{T} η_{i} (β) - {(λ^{T} η_{i} (β))}^{2} / 2] + o_{p} (1) \\ = {(\frac{1}{\sqrt{n}} \sum_{i = 1}^{n} η_{i} (β))}^{T} {(\frac{1}{n} \sum_{i = 1}^{n} η_{i} (β) η_{i}^{T} (β))}^{- 1} (\frac{1}{\sqrt{n}} \sum_{i = 1}^{n} η_{i} (β)) + o_{p} (1) . \end{matrix}$ (4.17)

When $η_{i} (β)$ is $η_{i, C} (β), {\hat{η}}_{i, W} (β)$ and ${\hat{η}}_{i, I} (β)$ , together with the results of Lemma 3 and Lemma 4 respectively, the asymptotic distribution of the smoothed empirical log-likelihood ratio can be proved to be a chi-square distribution with d degrees of freedom.

5. Conclusion

In this paper, we propose three smoothed empirical log-likelihood ratio functions for nonlinear model parameters with missing responses. We obtain the corresponding Wilks phenomenon under some regular conditions, and can easily construct the confidence interval of the parameters. For the type of data missing, we only consider the case where the covariate data is complete and the response variable data is missing. In addition, there are cases where the covariate data is missing and the response variable data is complete. Therefore, the smooth empirical likelihood inference for nonlinear quantile regression models with missing covariates is also worth studying.

Funding

Wang’s researches are supported by NSF project (ZR2021MA077) of Shandong Province of China.

Conflicts of Interest

The authors declare no conflicts of interest regarding the publication of this paper.

References

[1]	Koenker, R. and Bassett, J.G. (1978) Regression Quantiles. Econometrica, 46, 33-50. https://doi.org/10.2307/1913643
[2]	Koenker, R. (2005) Quantiles Regression. Cambridge University Press, Cambridge.
[3]	Kim, M. (2007) Quantile Regression with Varying Coefficients. The Annals of Statistics, 35, 92-108. https://doi.org/10.1214/009053606000000966
[4]	Cai, Z. and Xu, X. (2008) Nonparametric Quantile Estimations for Dynamic Smooth Coefficient Models. Journal of the American Statistical Association, 103, 1595-1608. https://doi.org/10.1198/016214508000000977
[5]	Wu, T., Yu, K. and Yu, Y. (2010) Single-Index Quantile Regression. Journal of Multivariate Analysis, 101, 1607-1621. https://doi.org/10.1016/j.jmva.2010.02.003
[6]	Cai, Z. and Xiao, Z. (2012) Semiparametric Quantile Regression Estimation in Dynamic Models with Partially Varying Coefficients. Journal of Econometrics, 167, 413-425. https://doi.org/10.1016/j.jeconom.2011.09.025
[7]	Wei, Y., Ma, Y. and Carroll, R. (2012) Multiple Imputation in Quantile Regression. Biometrika, 99, 423-438. https://doi.org/10.1093/biomet/ass007
[8]	Sherwood, B., Wang, L. and Zhou, X. (2013) Weighted Quantile Regression for Analyzing Health Care Cost Data with Missing Covariates. Statistics in Medicine, 32, 4967-4979. https://doi.org/10.1002/sim.5883
[9]	Chen, X., Wan, A.T.K. and Zhou, Y. (2015) Efficient Quantile Regression Analysis with Missing Observations. Journal of the American Statistical Association, 110, 723-741. https://doi.org/10.1080/01621459.2014.928219
[10]	Sherwood, B. (2015) Variable Selection for Additive Partial Linear Quantile Regression with Missing Covariates. Journal of Multivariate Analysis, 152, 206-223. https://doi.org/10.1016/j.jmva.2016.08.009
[11]	Zhao, P.Y., Tang, N.S. and Jiang, D.P. (2017) Efficient Inverse Probability Weighting Method for Quantile Regression with Nonignorable Missing Data. Statistics, 51, 363-386. https://doi.org/10.1080/02331888.2016.1268615
[12]	Owen, A. (1988) Empirical Likelihood Ratio Confidence Intervals for a Single Functional. Biometrika, 75, 237-249. https://doi.org/10.1093/biomet/75.2.237
[13]	Owen, A. (1990) Empirical Likelihood Ratio Confidence Regions. The Annals of Statistics, 18, 90-120. https://doi.org/10.1214/aos/1176347494
[14]	Chen, S.X. and Hall, P. (1993) Smoothed Empirical Likelihood Confidence Intervals for Quantiles. The Annals of Statistics, 21, 1166-1181. https://doi.org/10.1214/aos/1176349256
[15]	Wang, H. and Zhu, Z.H. (2011) Empirical Likelihood for Quantile Regression Models with Longitudinal Data. Journal of Statistical Planning and Inference, 141, 1603-1615. https://doi.org/10.1016/j.jspi.2010.11.017
[16]	Tang, C. and Leng, C. (2012) An Empirical Likelihood Approach to Quantile Regression with Auxiliary Information. Statistics and Probability Letters, 82, 29-36. https://doi.org/10.1016/j.spl.2011.09.003
[17]	Zhao, P., Zhou, X. and Lin, L. (2015) Empirical Likelihood for Composite Quantile Regression Modeling. Journal of Applied Mathematics and Computing, 48, 321-333. https://doi.org/10.1007/s12190-014-0804-3
[18]	Zhao, P.X. and Tang, X.R. (2016) Imputation Based Statistical Inference for Partially Linear Quantile Regression Models with Missing Responses. Metrika, 79, 991-1009. https://doi.org/10.1007/s00184-016-0586-8
[19]	Luo, S. and Pang, S. (2017) Empirical Likelihood for Quantile Regression Models with Response Data Missing at Random. AStA Advances in Statistical Analysis, 15, 317-330. https://doi.org/10.1515/math-2017-0028
[20]	Zhao, P. and Zhou, X. (2018) Robust Empirical Likelihood for Partially Linear Models via Weighted Composite Quantile Regression. Computational Statistics, 33, 659-674. https://doi.org/10.1007/s00180-018-0793-z
[21]	Zhou, X.S., Zhao, P.X. and Gai, Y.J. (2022) Imputation Based Empirical Likelihood Inferences for Partially Nonlinear Quantile Regression Models with Missing Responses. AStA Advances in Statistical Analysis, 106, 705-722. https://doi.org/10.1007/s10182-022-00441-z
[22]	Whang, Y.J. (2006) Smoothed Empirical Likelihood Methods for Quantile Regression Models. Economic Theory, 22, 173-205. https://doi.org/10.1017/S0266466606060087
[23]	Lv, X.F. and Li, R. (2013) Smoothed Empirical Likelihood Analysis of Partially Linear Quantile Regression Models with Missing Response Variables. AStA Advances in Statistical Analysis, 97, 317-347. https://doi.org/10.1007/s10182-013-0210-4
[24]	Luo, S.H., Mei, C.L. and Zhang, C.Y. (2017) Smoothed Empirical Likelihood for Quantile Regression Models with Response Data Missing at Random. AStA Advances in Statistical Analysis, 101, 95-116. https://doi.org/10.1007/s10182-016-0278-8
[25]	Xue, L. (2009) Empirical Likelihood for Linear Models with Missing Responses. Journal of Multivariate Analysis, 100, 1353-1366. https://doi.org/10.1016/j.jmva.2008.12.009
[26]	Tang, N.S. and Zhao, P.Y. (2013) Empirical Likelihood-Based Inference in Nonlinear Regression Models with Missing Responses at Random. Statistics, 47, 1141-1159. https://doi.org/10.1080/02331888.2012.658807

Journals Menu

Follow SCIRP

	+1 323-425-8868
	customer@scirp.org
	+86 18163351462(WhatsApp)
	1655362766

	Paper Publishing WeChat

Journals Menu

Home

About SCIRP

Service

Policies