A Regression Type Estimator with Two Auxiliary Variables for Two-Phase Sampling


This paper is an extension of the Hanif, Hamad and Shahbaz estimator [1] for two-phase sampling. The aim is to develop a regression type estimator with two auxiliary variables for two-phase sampling when no information about the auxiliary variables is available at the population level. To avoid multi-collinearity, it is assumed that the two auxiliary variables have minimum correlation with each other. The mean square error and bias of the proposed estimator in two-phase sampling are derived. The mean square error of the proposed estimator shows an improvement over other well-known estimators under the same case.

N. Hamad, M. Hanif and N. Haider, "A Regression Type Estimator with Two Auxiliary Variables for Two-Phase Sampling," Open Journal of Statistics, Vol. 3 No. 2, 2013, pp. 74-78. doi: 10.4236/ojs.2013.32010.

1. Introduction

It is well known that the precision of estimators of the mean of the study variable “y” is increased by proper use of highly correlated auxiliary variables. When auxiliary information is available at the population level and the cost per unit of collecting the study variable “y” is affordable, single-phase sampling is more appropriate. When prior information on the auxiliary variables is lacking, however, it is neither practical nor economical to conduct a census to obtain it. The appropriate technique for estimating those auxiliary variables from samples is two-phase sampling. In such cases a large preliminary sample is taken, and the auxiliary variables are estimated from it. The main sample is then independently sub-sampled from that large sample.
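The sampling scheme described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function name `two_phase_sample`, the toy population and the sizes 50 and 10 are hypothetical choices made for the example and are not taken from the paper.

```python
import random

def two_phase_sample(population, n1, n2, seed=None):
    """Draw a two-phase simple random sample without replacement.

    population: list of (y, x) pairs; y is treated as unobserved in phase 1.
    Returns (first_phase_x, second_phase_pairs).
    """
    if not (0 < n2 <= n1 <= len(population)):
        raise ValueError("need 0 < n2 <= n1 <= N")
    rng = random.Random(seed)
    # Phase 1: large preliminary sample; only the auxiliary variable is recorded.
    first = rng.sample(population, n1)
    first_phase_x = [x for _, x in first]
    # Phase 2: sub-sample of the first-phase units; y is now measured as well.
    second = rng.sample(first, n2)
    return first_phase_x, second

# Hypothetical population of N = 200 units with y = 2x + 1.
population = [(2.0 * i + 1.0, float(i)) for i in range(1, 201)]
first_x, second = two_phase_sample(population, n1=50, n2=10, seed=1)
print(len(first_x), len(second))
```

Because the second-phase units are drawn from the first-phase sample, every auxiliary value observed in phase 2 also appears in phase 1.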

Two-phase sampling is a powerful technique that was first introduced by Neyman [2] for the purpose of stratification. It is based on a sampling design in which the nature (specifically the size) of the sampling units does not differ between phases. “Two-phase sampling is generally employed when number of units, required to give the desired precision on different items, is widely different. This technique is employed to utilize the information collected at the first phase in order to improve the precision of the information to be collected at the second phase” [3].

In two-phase sampling, regression and ratio estimation techniques are used to estimate the finite population mean. The ratio estimator incorporates prior information closely related to the study variable, and the regression technique is used when the relation between the study variable and the auxiliary variable(s) is linear. The regression estimator is generally more efficient than the ratio estimator, except when the regression line passes through the origin; in that case the two estimators are almost equally efficient and the analyst has to choose between them by judgment.
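For a single auxiliary variable, the textbook two-phase ratio and regression estimates of the population mean can be computed as in the sketch below. These are the standard forms, ȳ2·x̄1/x̄2 and ȳ2 + b(x̄1 − x̄2), and not the numbered equations of this paper; the function names and the toy data are hypothetical, and b is assumed to be the least-squares slope from the second-phase sample.

```python
from statistics import mean

def ratio_estimate(x_first, y_second, x_second):
    """Two-phase ratio estimate: ybar2 * xbar1 / xbar2."""
    return mean(y_second) * mean(x_first) / mean(x_second)

def regression_estimate(x_first, y_second, x_second):
    """Two-phase regression estimate: ybar2 + b * (xbar1 - xbar2)."""
    xbar2, ybar2 = mean(x_second), mean(y_second)
    sxy = sum((x - xbar2) * (y - ybar2) for x, y in zip(x_second, y_second))
    sxx = sum((x - xbar2) ** 2 for x in x_second)
    b = sxy / sxx  # least-squares slope from the second-phase sample
    return ybar2 + b * (mean(x_first) - xbar2)

# Toy data (hypothetical): y = 2x + 1, so the line does not pass through the origin.
x_first = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]   # auxiliary values from phase 1
x_second = [1.0, 2.0, 3.0]                 # auxiliary values from phase 2
y_second = [3.0, 5.0, 7.0]                 # study variable from phase 2

print(ratio_estimate(x_first, y_second, x_second))       # 8.75
print(regression_estimate(x_first, y_second, x_second))  # 8.0
```

In this toy case the mean of y over the six first-phase units is 2(3.5) + 1 = 8, which the regression estimate recovers exactly while the ratio estimate does not. With y = 2x, a line through the origin, both functions return 7.0 on the same x values.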

Let the population consist of N units, on each of which three characters are recorded: the variable of interest “y”, the main auxiliary variable and the second auxiliary variable. The two auxiliary variables are highly correlated with the variable of interest. Let a first-phase sample of size n1 be drawn from the population of size N by simple random sampling without replacement, and let the sample means of the two auxiliary variables be observed. Let a second-phase sample of size n2 be drawn from the first-phase sample, and let the study variable and the two auxiliary variables be observed on it. The notations used in this paper are:


Cochran [4] appears to have been the first to use auxiliary information in a ratio estimator, for the case of high positive correlation between the study variable and the auxiliary variable. Hansen and Hurwitz [5] were the first to suggest the use of auxiliary information in selecting sampling units with varying probabilities. Robson [6] introduced the product estimator for the case of high negative correlation. The two-phase sampling version of [6] is:

, (1.2)


Sukhatme [3] used auxiliary variable in his ratio type estimators for two-phase sampling. One of his estimators was:

, (1.4)


Raj [7] proposed a method of using information on several variates to achieve higher precision in two-phase sampling. The two-phase sampling version of [7] is:


where and “w” is a suitably chosen constant.


Mohanty [8] demonstrated that the precision of the estimate of the study-variable mean in two-phase sampling can be increased by combining the regression and ratio estimators using two auxiliary variables:

, (1.8)


Srivastava [9] developed the following class of ratio type estimators:

, (1.10)

. (1.11)

Mukerjee et al. [10] developed three regression type estimators. One was for the situation when no auxiliary information was available.

, (1.12)


Samiuddin and Hanif [11] developed a two-phase sampling version of the regression estimator of Sukhatme et al. [12] for the case in which the population means of the auxiliary variables are not known.




and is the partial correlation coefficient of given and.

Singh and Espejo [13] extended their own single-phase ratio-product estimator, proposed in 2003, to two-phase sampling.

, (1.16)


where and.

Hanif et al. [14] developed regression type estimators of population mean in two-phase sampling. One of those estimators was:

, (1.18)


2. Proposed Regression Type Estimator for Two-Phase Sampling

We propose the following estimator using two auxiliary variables for two-phase sampling when no information on the auxiliary variables is available, i.e. when both of their population means are unknown.


Putting the notations of (1.1) in (2.1), squaring and taking expectation, we obtain the mean square error as:


In order to get the optimum values of K1 and K2, we differentiate (2.2) with respect to K1 and, equating to zero, we get:

. (2.3)

Putting the value of (2.3) in (2.2) and differentiating with respect to K1, we get:


where is the partial regression coefficient of y on keeping constant.
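The differentiate-and-solve step can be illustrated on a simpler, one-auxiliary-variable analogue. The sketch below is not the paper's (2.2) to (2.4): it uses the textbook difference estimator with a single constant K, and the symbols \theta_i = 1/n_i - 1/N, S_y^2, S_x^2, S_{xy} and \rho (population variances, covariance and correlation of y and a single auxiliary variable x) are notation assumed for this illustration only.

```latex
\begin{align*}
t_K &= \bar{y}_2 + K(\bar{x}_1 - \bar{x}_2), \\
\operatorname{MSE}(t_K) &= \theta_2 S_y^2 + (\theta_2 - \theta_1)\left(K^2 S_x^2 - 2K S_{xy}\right), \\
\frac{\partial \operatorname{MSE}(t_K)}{\partial K} &= 2(\theta_2 - \theta_1)\left(K S_x^2 - S_{xy}\right) = 0
  \quad\Longrightarrow\quad K_{\text{opt}} = \frac{S_{xy}}{S_x^2}, \\
\operatorname{MSE}_{\min}(t_K) &= S_y^2\left[\theta_2(1 - \rho^2) + \theta_1 \rho^2\right].
\end{align*}
```

In this simplified case the optimum constant is the population regression coefficient of y on x, which parallels the appearance of a partial regression coefficient as the optimum in the two-constant case.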

Putting the value of (2.4) in (2.3) we get:


Putting the values of (2.4) and (2.5) in (2.2) and on simplification we have:


Expressing the proposed estimator in terms of (1.1) and taking the assumption that is very small and expanding and up to second degree, we obtain bias of above estimator as follows


Putting (2.4) and (2.5) in (2.7) and after simplification, the optimized bias is


3. Mathematical Comparison of Proposed Estimator over Other Estimators

In this section, the improvement of our proposed estimator over well-known estimators for two-phase sampling is shown. In each case no information about the population characteristics of the auxiliary variables is available. The mathematical comparison shows that our proposed estimator is at least as efficient as the other estimators. We have compared our estimator with the estimators of [3,6-11,13,14]. The mathematical efficiency of our proposed estimator is given as:

a) Comparison with Robson [6] Estimator


b) Comparison with Sukhatme [3] Estimator


c) Comparison with Raj [7] Estimator


d) Comparison with Mohanty [8] Estimator


e) Comparison with Srivastava [9] Estimator


f) Comparison with Mukerjee et al. [10] Estimator


g) Comparison with Samiuddin and Hanif [11] Estimator

Our proposed estimator gives a result identical to [11] because


But our estimator is preferable to [11] if we have the estimate of, because in this way we have to find only one unknown value, whereas in the estimator of [11] we have to find two unknown values. The following special cases give another reason for the suitability of our estimator. Our estimator:

1) becomes classical ratio estimator for and;

2) converts into Robson [6] estimator for and;

3) emerges into Mohanty [8] estimator for and;

4) reduces to estimator given by Singh and Espejo [13] for;

5) turns into Hanif et al. [14] estimator for.

h) Comparison with Singh and Espejo [13] Estimator


i) Comparison with Hanif et al. [14] Estimator


4. Conclusion

In this paper we have proposed a regression type estimator for two-phase sampling when we do not have any advance knowledge of the auxiliary variables. The estimators of [6,8,13,14] are special cases of our estimator. From Equations (3.1) to (3.9) one can readily see that our proposed estimator is more precise than the competing estimators discussed in Section 1, with the exception of [11], to which it is identical; it therefore provides estimates of the population parameters that are at least as accurate.
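A precision claim of this kind can also be examined empirically by Monte Carlo simulation. The sketch below only illustrates the idea: it does not implement the proposed estimator, but compares the textbook two-phase ratio and regression estimators with one auxiliary variable on a synthetic population. The population model (y = 200 + 2x + noise), the sizes N = 500, n1 = 100, n2 = 30, the number of replications and all function names are assumptions made for the example.

```python
import random

def avg(values):
    return sum(values) / len(values)

def slope(xs, ys):
    """Least-squares slope of ys on xs."""
    xbar, ybar = avg(xs), avg(ys)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    return sxy / sxx

def empirical_mse(pop, n1, n2, reps=2000, seed=0):
    """Empirical MSE of the two-phase ratio and regression estimators."""
    rng = random.Random(seed)
    true_mean = avg([y for y, _ in pop])
    err_ratio = err_reg = 0.0
    for _ in range(reps):
        first = rng.sample(pop, n1)       # phase 1: auxiliary variable only
        second = rng.sample(first, n2)    # phase 2: sub-sample, y observed
        xbar1 = avg([x for _, x in first])
        ys = [y for y, _ in second]
        xs = [x for _, x in second]
        ybar2, xbar2 = avg(ys), avg(xs)
        t_ratio = ybar2 * xbar1 / xbar2
        t_reg = ybar2 + slope(xs, ys) * (xbar1 - xbar2)
        err_ratio += (t_ratio - true_mean) ** 2
        err_reg += (t_reg - true_mean) ** 2
    return err_ratio / reps, err_reg / reps

# Synthetic population (hypothetical): linear relation with a large intercept.
gen = random.Random(42)
pop = []
for _ in range(500):
    x = gen.uniform(10.0, 100.0)
    pop.append((200.0 + 2.0 * x + gen.gauss(0.0, 5.0), x))

mse_ratio, mse_reg = empirical_mse(pop, n1=100, n2=30)
print(f"ratio: {mse_ratio:.2f}  regression: {mse_reg:.2f}")
```

Because the regression line in this synthetic population is far from the origin, the regression estimator is expected to show the smaller empirical mean square error. The same harness could be extended with any other estimator whose formula is available.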

Conflicts of Interest

The authors declare no conflicts of interest.


[1] M. Hanif, N. Hamad and M. Q. Shahbaz, “A Modified Regression Type Estimator in Survey Sampling,” World Applied Sciences Journal, Vol. 7, No. 12, 2009, pp. 1559-1561.
[2] J. Neyman, “Contribution to the Theory of Sampling Human Populations,” Journal of the American Statistical Association, Vol. 33, No. 201, 1938, pp. 101-116. doi:10.1080/01621459.1938.10503378
[3] B. V. Sukhatme, “Some Ratio Type Estimators in Two Phase Sampling,” Journal of the American Statistical Association, Vol. 57, No. 299, 1962, pp. 628-632. doi:10.1080/01621459.1962.10500551
[4] W. G. Cochran, “The Estimation of the Yields of the Cereal Experiments by Sampling for the Ratio of Grain to Total Produce,” Journal of Agricultural Science, Vol. 30, No. 2, 1940, pp. 262-275. doi:10.1017/S0021859600048012
[5] M. H. Hansen and W. N. Hurwitz, “On the Theory of Sampling from Finite Populations,” The Annals of Mathematical Statistics, Vol. 14, No. 4, 1943, pp. 333-362. doi:10.1214/aoms/1177731356
[6] D. S. Robson, “Application of Multivariate Polykays to the Theory of Unbiased Ratio Type Estimators,” Journal of the American Statistical Association, Vol. 52, No. 280, 1957, pp. 511-522. doi:10.1080/01621459.1957.10501407
[7] D. Raj, “On a Method of Using Multi-Auxiliary Information in Sample Surveys,” Journal of the American Statistical Association, Vol. 60, No. 309, 1965, pp. 270-277. doi:10.1080/01621459.1965.10480789
[8] S. Mohanty, “Combination of Regression and Ratio Estimate,” Journal of Indian Statistical Association, Vol. 5, 1967, pp. 16-19.
[9] S. K. Srivastava, “A Generalized Estimator for the Mean of a Finite Population Using Multi Auxiliary Information,” Journal of the American Statistical Association, Vol. 66, No. 334, 1971, pp. 404-407. doi:10.1080/01621459.1971.10482277
[10] R. Mukerjee, T. J. Rao and K. Vijayan, “Regression Type Estimators Using Multiple Auxiliary Information,” Australian Journal of Statistics, Vol. 29, No. 3, 1987, pp. 244-254. doi:10.1111/j.1467-842X.1987.tb00742.x
[11] M. Samiuddin and M. Hanif, “Estimation of Population Mean in Single Phase and Two-Phase Sampling with or without Additional Information,” Pakistan Journal of Statistics, Vol. 23, No. 2, 2007, pp. 99-118.
[12] P. V. Sukhatme, B. V. Sukhatme, S. Sukhatme and C. Asok, “Sampling Theory of Surveys with Applications,” Iowa State University Press, Ames, 1984.
[13] H. P. Singh and M. R. Espejo, “Double Sampling Ratio Product Estimator of a Finite Population Mean in Sample Surveys,” Journal of Applied Statistics, Vol. 34, No. 1, 2007, pp. 71-85. doi:10.1080/02664760600994562
[14] M. Hanif, N. Hamad and M. Q. Shahbaz, “Some New Regression Type Estimators in Two-Phase Sampling,” World Applied Sciences Journal, Vol. 8, No. 7, 2010, pp. 799-803.

Copyright © 2024 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.