A Universal Selection Method in Linear Regression Models

Abstract

In this paper we consider a linear regression model with fixed design. We introduce a new rule for selecting a relevant submodel, based on tests of the model parameters. A particular feature of the rule is that a subjective grading of model complexity can be incorporated. We provide bounds on the mis-selection error, and simulations show that the proposed selection rule controls the mis-selection error uniformly.
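To make the idea concrete, the following is a minimal sketch of generic test-based submodel selection in a fixed-design linear model: fit the full model by least squares, compute coefficient-wise t-statistics, and retain regressors whose statistics exceed a cutoff that can be graded by a subjective complexity weight. This is an illustration only, not the paper's rule; the function select_submodel and the parameters alpha and complexity_weight are assumptions introduced here.

    # Hedged sketch: test-based submodel selection in a fixed-design linear model.
    # Names and thresholds below (select_submodel, alpha, complexity_weight) are
    # illustrative assumptions, not the selection rule analyzed in the paper.
    import numpy as np
    from scipy import stats

    def select_submodel(X, y, alpha=0.05, complexity_weight=None):
        """Keep columns of X whose t-statistic survives a (possibly graded) cutoff."""
        n, p = X.shape
        beta, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        sigma2 = resid @ resid / (n - p)              # residual variance estimate
        cov = sigma2 * np.linalg.inv(X.T @ X)         # covariance of the OLS estimator
        t_stat = beta / np.sqrt(np.diag(cov))         # coefficient-wise t-statistics

        # Optional subjective grading: larger weights make a regressor harder to keep.
        w = np.ones(p) if complexity_weight is None else np.asarray(complexity_weight)
        cutoff = stats.t.ppf(1 - alpha / 2, df=n - p) * w
        return np.flatnonzero(np.abs(t_stat) > cutoff)  # indices of selected regressors

    # Toy usage with a fixed design: only the first two regressors are relevant.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))
    y = X @ np.array([2.0, -1.5, 0.0, 0.0, 0.0]) + rng.normal(size=200)
    print(select_submodel(X, y))  # typically [0, 1]

In this sketch the complexity grading simply rescales the critical value per regressor; the paper's rule and its mis-selection bounds are derived for a specific construction of such thresholds.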

Share and Cite:

E. Liebscher, "A Universal Selection Method in Linear Regression Models," Open Journal of Statistics, Vol. 2 No. 2, 2012, pp. 153-162. doi: 10.4236/ojs.2012.22017.

Conflicts of Interest

The author declares no conflicts of interest.

