1. Introduction
Consider the following semi-parametric errors-in-variables (EV) model
(1.1)
where
are the response variables,
are design points,
are the potential variables observed with measurement errors
,
,
are random errors with
.
is an unknown parameter that needs to be estimated.
is an unknown function defined on the closed interval
,
is a known function defined on
satisfying
(1.2)
where
are also design points.
Model (1.1) and its special forms have gained much attention in recent years. When
,
are observed exactly, model (1.1) reduces to the general semi-parametric model, which was first introduced by Engle et al. [1]. In many applications, however, the covariates are measured with error, so EV models are often more practical than the ordinary regression model. In addition, when
are completely observed and
, model (1.1) reduces to the usual linear EV model, which has been studied by Liu and Chen [2], Miao et al. [3], Miao and Liu [4], Fan et al. [5] and others. For complete data, model (1.1) itself has also been studied by many authors; see Cui and Li [6], Liang et al. [7], Zhou et al. [8] and others. In recent years, semi-parametric EV models have attracted wide attention.
On the other hand, we often encounter incomplete data in the practical application of the models. In particular, some response variables may be missing, by design or by happenstance. For example, the responses
may be very expensive to measure, so that only some of
are available. In fact, missing responses are very common in opinion polls, socio-economic investigations, market research surveys and so on. Therefore, we focus on the case in which missing data occur only in the response variables. When
can be fully observed, model (1.1) reduces to the usual semi-parametric model, which has been studied by many scholars; see Wang et al. [9], Wang and Sun [10], Bianco et al. [11].
To deal with missing data, one approach is to impute a plausible value for each missing datum and then analyze the results as if they were complete. In regression problems, common imputation approaches include linear regression imputation by Healy and Westmacott [12], nonparametric kernel regression imputation by Cheng [13], and semi-parametric regression imputation by Wang et al. [9] and Wang and Sun [10], among others. Here we extend these methods to the estimation of
and
under the semi-parametric EV model (1.1). We propose two approaches to estimating
and
with missing responses and study the strong consistency of the estimators.
In this paper, suppose we obtain a random sample of incomplete data
from the model (1.1), where
if
is missing, otherwise
. Throughout this paper, we assume that
is missing at random. The assumption implies that
and
are independent. That is,
. This is a common assumption in statistical analysis with missing data and is reasonable in many practical situations.
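To make the sampling scheme concrete, the following Python sketch simulates one incomplete sample. It assumes that model (1.1) has the usual partially linear EV form, with response y = ξβ + g(t) + ε and surrogate covariate x = ξ + μ, and that the latent covariate is a known function of t plus a design perturbation, as in (1.2). The particular functions, variances, function name and missingness probability are hypothetical choices for illustration only. The indicator depends on the observed design point t and never on the response, which is one simple way to satisfy a missing-at-random mechanism.

```python
import math
import random

def simulate_sample(n, beta=2.0, sigma_mu=0.3, seed=0):
    """Simulate (y or None, delta, x, t) under an assumed partially linear EV model."""
    rng = random.Random(seed)
    rows = []
    for i in range(1, n + 1):
        t = i / n                                  # fixed design point in (0, 1]
        xi = t * t + math.cos(3.0 * i)             # latent covariate: assumed h(t) plus perturbation
        x = xi + rng.gauss(0.0, sigma_mu)          # surrogate observed with measurement error
        y = xi * beta + math.sin(2.0 * math.pi * t) + rng.gauss(0.0, 0.5)
        p_obs = 1.0 / (1.0 + math.exp(-(1.0 + t)))  # depends on t only, never on y
        delta = 1 if rng.random() < p_obs else 0   # delta = 0 marks a missing response
        rows.append((y if delta == 1 else None, delta, x, t))
    return rows

sample = simulate_sample(200)
```

Each tuple stores `None` in place of a missing response, so downstream code cannot use an unobserved value by accident.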
The paper is organized as follows. In Section 2, we list some assumptions. The main results are given in Section 3. Some preliminary lemmas are stated in Section 4. Proofs of the main results are provided in Section 5.
2. Assumptions
In this section, we list some assumptions which will be used in the main results. Here
means
for every
,
means
as
, and a.s. stands for almost surely.
(A0) Let
,
and
be sequences of independent random variables satisfying
i)
,
,
,
is known.
ii)
,
for some
.
iii)
,
,
are independent of each other.
(A1) Let
in (1.2) be a sequence satisfying
i)
.
ii)
, where
is a permutation of
.
iii)
.
(A2)
and
are continuous functions satisfying the first-order Lipschitz condition on the closed interval
.
(A3) Let
be weight functions defined on [0, 1] that satisfy
i)
a.s.
ii)
a.s. for any
.
iii)
a.s.
(A4) The probability weight functions
are defined on
and satisfy
i)
.
ii)
, for any
.
iii)
.
Remark 2.1. Conditions (A0)-(A4) are standard regularity conditions that are commonly used in the literature; see Härdle et al. [14], Gao et al. [15] and Chen [16].
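As an illustration of the kind of weight function such conditions are usually meant to cover, the following Python sketch builds k-nearest-neighbour weights over the design points whose responses are observed. This is a common choice in the partially linear literature, not a construction taken from this paper, and whether it meets (A3) depends on how k grows with the sample size. The function name and the toy design are hypothetical.

```python
def knn_weights(t0, design, delta, k):
    """Uniform weights on the k observed design points closest to t0.

    Points with delta == 0 (missing response) always receive weight zero.
    """
    observed = [j for j, d in enumerate(delta) if d == 1]
    nearest = sorted(observed, key=lambda j: abs(design[j] - t0))[:k]
    weights = [0.0] * len(design)
    for j in nearest:
        weights[j] = 1.0 / len(nearest)   # equal share for each selected neighbour
    return weights

# Toy design: 20 equally spaced points, every second response missing.
design = [i / 20 for i in range(1, 21)]
delta = [1, 0] * 10
w = knn_weights(0.5, design, delta, 4)
```

The weights are non-negative, sum to one over the observed cases, and the largest weight is 1/k, so it shrinks as k grows.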
3. Main Results
For model (1.1), we seek estimators of
and
. The most natural idea is to delete all cases with missing responses, which yields the model
. If
can be observed, we can apply the least squares estimation method to estimate the parameter
. If the parameter
is known, using the complete data
, we can define the estimator of
to be
where
are weight functions satisfying (A3). On the other hand, for the semi-parametric EV model, Liang et al. [7] improved the least squares estimator (LSE) of the usual partially linear model and chose the estimator of the parameter
to minimize the following formula:
Therefore, we obtain the modified LSE of
as follows:
(3.1)
where
,
. We substitute (3.1) into
, then
(3.2)
Clearly, the estimators
and
do not use all of the sample information. Hence, to compensate for the missing data, we apply an imputation method from Wang and Sun [10] and let
(3.3)
Therefore, using the complete data
, similar to (3.1)-(3.2), one can obtain further estimators of
and
, that is
(3.4)
(3.5)
where
,
,
are weight functions satisfying (A4).
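The two-stage construction can be sketched in Python as follows. The displayed formulas (3.1)-(3.5) are not reproduced here, so the exact forms below are assumptions modelled on the corrected least squares idea of Liang et al. [7] and the imputation idea of Wang and Sun [10]: remove a smooth in t from x and y, subtract the known measurement-error variance in the denominator, then impute each missing response by its fitted value and re-estimate on the full sample. In the second stage the variance correction is applied only to observed cases, since an imputed response is built from x itself and already carries the same error. Kernel weights, the bandwidth, all function names and the simulated data are hypothetical.

```python
import math
import random

def kernel_weights(t0, design, mask, bandwidth):
    """Normalized Epanechnikov weights at t0 over the points with mask == 1."""
    raw = []
    for tj, m in zip(design, mask):
        u = (t0 - tj) / bandwidth
        raw.append(m * 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0)
    total = sum(raw)
    return [r / total for r in raw] if total > 0.0 else raw

def smooth(values, weights):
    """Weighted sum of values."""
    return sum(w * v for w, v in zip(weights, values))

def corrected_slope(x_res, y_res, include, correct, sigma_mu2):
    """Least squares slope with the error variance removed from the denominator."""
    num = sum(m * xr * yr for xr, yr, m in zip(x_res, y_res, include))
    den = sum(m * xr * xr for xr, m in zip(x_res, include)) - sigma_mu2 * sum(correct)
    return num / den

def two_stage_estimates(y, delta, x, t, sigma_mu2, bandwidth):
    n = len(t)
    everyone = [1] * n
    # Stage 1: complete-case estimators, using observed responses only.
    w_obs = [kernel_weights(ti, t, delta, bandwidth) for ti in t]
    x_res = [x[i] - smooth(x, w_obs[i]) for i in range(n)]
    y_res = [y[i] - smooth(y, w_obs[i]) for i in range(n)]
    beta_c = corrected_slope(x_res, y_res, delta, delta, sigma_mu2)
    partial = [y[j] - x[j] * beta_c for j in range(n)]
    g_c = [smooth(partial, w_obs[i]) for i in range(n)]
    # Stage 2: impute missing responses by fitted values, then use all cases.
    u = [y[i] if delta[i] == 1 else x[i] * beta_c + g_c[i] for i in range(n)]
    w_all = [kernel_weights(ti, t, everyone, bandwidth) for ti in t]
    x_res2 = [x[i] - smooth(x, w_all[i]) for i in range(n)]
    u_res = [u[i] - smooth(u, w_all[i]) for i in range(n)]
    beta_i = corrected_slope(x_res2, u_res, everyone, delta, sigma_mu2)
    partial2 = [u[j] - x[j] * beta_i for j in range(n)]
    g_i = [smooth(partial2, w_all[i]) for i in range(n)]
    return beta_c, g_c, beta_i, g_i

# Simulated data under an assumed model; missing responses are stored as 0.0.
rng = random.Random(1)
n, beta_true, sigma_mu = 300, 2.0, 0.3
t = [(i + 1) / n for i in range(n)]
xi = [ti * ti + math.cos(3.0 * (i + 1)) for i, ti in enumerate(t)]
x = [v + rng.gauss(0.0, sigma_mu) for v in xi]
delta = [1 if rng.random() < 0.8 else 0 for _ in range(n)]
y = [d * (v * beta_true + math.sin(2.0 * math.pi * ti) + rng.gauss(0.0, 0.5))
     for d, v, ti in zip(delta, xi, t)]
beta_c, g_c, beta_i, g_i = two_stage_estimates(y, delta, x, t, sigma_mu ** 2, 0.1)
```

Here `beta_c` and `g_c` play the role of the complete-case estimators and `beta_i` and `g_i` the role of the imputation estimators, with the nonparametric parts evaluated at the design points.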
Based on the estimators for
and
, we have the following results.
Theorem 3.1 Suppose that (A0)-(A3) are satisfied. For every
, we have
a)
b)
Theorem 3.2 Suppose that (A0)-(A4) are satisfied. For every
, we have
a)
b)
4. Preliminary Lemmas
In the sequel, let
be finite positive constants whose values are unimportant and may change from one appearance to the next. We now introduce several lemmas that will be used in the proofs of the main results.
Lemma 4.1 (Baek and Liang [17], Lemma 3.1) Let
,
be independent random variables with
. Assume that
is a triangular array of numbers with
and
. If
for some
, then
Lemma 4.2 (Härdle et al. [14], Lemma A.3) Let
be independent random variables with
, finite variances and
. Assume that
is a sequence of numbers such that
for some
and
for
. Then
for
.
Lemma 4.3
a) Let
, where
or
. Let
, where
or
. Then, (A0)-(A4) imply that
and
b) (A0)-(A4) imply that
,
,
and
c) (A0)-(A4) imply that
and
Lemma 4.4 Suppose that (A0)-(A4) are satisfied. Then one can deduce that
Lemma 4.3 follows easily from (A0)-(A4). The proof of Lemma 4.4 is analogous to the proof of Theorem 3.1(b).
5. Proof of Main Results
First, we introduce some notation that will be used in the proofs below.
Proof of Theorem 3.1(a). From (3.1), one can write that
(5.1)
Thus, to prove
a.s., we only need to verify that
and
for
.
Step 1. We prove
Note that
By Lemma 4.3(a), we have
a.s. Hence, it suffices to verify that
a.s. for
. Applying (A0), taking
,
,
in Lemma 4.2, we can verify that
(5.2)
where
is a sequence of independent random variables satisfying
and
. Therefore, we obtain
from (A0) and (5.2). On the other hand, taking
in Lemma 4.1, we have
(5.3)
where
is a sequence of independent random variables satisfying
and
. By (A0) and Lemma 4.3, taking
,
,
in Lemma 4.2, one can also deduce that
(5.4)
Note that, from Lemma 4.3(a), (5.2) and (5.3), we have
(5.5)
(5.6)
(5.7)
Therefore, from (5.2)-(5.7), one can deduce that
, which yields that
Therefore, by Lemma 4.3(b), we obtain
Step 2. We verify that
for
. From (A0), we see that
is a sequence of independent random variables with
,
, for some
. Similar to (5.4), we deduce that
Meanwhile, from (A0)-(A3), Lemma 4.3, (5.2) and (5.3), one can obtain
The proof of
for
is analogous. Thus, the proof of Theorem 3.1(a) is completed.
Proof of Theorem 3.1(b). From (3.2), for every
, one can write that
Therefore, we only need to prove that
a.s. for every
and
. From (A0)-(A3), Theorem 3.1(a), Lemma 4.3, (5.2) and (5.3), for every
and any
, one can get
Thus, the proof of Theorem 3.1(b) is completed.
Proof of Theorem 3.2(a). From (3.3)-(3.4), write that
Using a similar approach as step 1 in the proof of Theorem 3.1(a), one can get
Therefore, we only need to verify that
for
. From (A0)-(A4), Lemmas 4.2-4.4, Theorem 3.1(a), (5.2)-(5.4), we have
In the same way, from (A0)-(A4), Lemmas 4.2-4.4, (5.2) and (5.3), one can similarly deduce that
for
. Thus, the proof of Theorem 3.2(a) is completed.
Proof of Theorem 3.2(b). From (3.5), one can write
Therefore, we only need to prove that
a.s. for every
and
. From (A0)-(A4), Lemmas 4.3-4.4, (5.2) and (5.3), one can get
Meanwhile, the proof of
for every
and
is analogous. Thus, the proof of Theorem 3.2(b) is completed.
Acknowledgements
The authors greatly appreciate the constructive comments and suggestions of the Editor and referee. This research was supported by the National Natural Science Foundation of China (11701368).