Efficiency of Selecting Important Variable for Longitudinal Data


Variable selection with a large number of predictors is a challenging and important problem in educational and social research. However, relatively little attention has been paid to variable selection in longitudinal data with application to education. Using longitudinal educational data from the Test of English for International Communication (TOEIC), this study compares multiple regression, backward elimination, group least absolute shrinkage and selection operator (LASSO), and linear mixed models in terms of their performance in variable selection. The results show that the four statistical methods retain different sets of predictors in their models. The linear mixed model (LMM) yields the smallest number of predictors (4 of 19). In addition, LMM is the only one of the four methods appropriate for repeated measurements, and it is the best method with respect to the principle of parsimony. The conclusion also interprets the model selected by LMM using marginal R².
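
As an illustration of the marginal R² mentioned above, the sketch below uses one widely used definition for a random-intercept model (Nakagawa and Schielzeth's): the share of total variance attributed to the fixed effects. The abstract does not state which definition the authors used, so this choice is an assumption, and the function names and variance values are hypothetical rather than taken from the study.

```python
def marginal_r2(var_fixed: float, var_random: float, var_residual: float) -> float:
    """Share of total variance attributed to the fixed effects alone."""
    total = var_fixed + var_random + var_residual
    if total <= 0:
        raise ValueError("total variance must be positive")
    return var_fixed / total


def conditional_r2(var_fixed: float, var_random: float, var_residual: float) -> float:
    """Share of total variance attributed to fixed and random effects together."""
    total = var_fixed + var_random + var_residual
    if total <= 0:
        raise ValueError("total variance must be positive")
    return (var_fixed + var_random) / total


# Hypothetical variance components, NOT results from the TOEIC data:
# variance of the fixed-effect predictions, random-intercept variance,
# and residual variance.
m = marginal_r2(var_fixed=2.0, var_random=1.0, var_residual=1.0)
c = conditional_r2(var_fixed=2.0, var_random=1.0, var_residual=1.0)
print(m, c)
```

Under this definition, a marginal R² well below the conditional R² would suggest that much of the explained variance comes from between-subject differences rather than from the selected predictors.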

Ra, J. and Rhee, K. (2014) Efficiency of Selecting Important Variable for Longitudinal Data. Psychology, 5, 6-11.

Conflicts of Interest

The authors declare no conflicts of interest.


