Determining Sufficient Number of Imputations Using Variance of Imputation Variances: Data from 2012 NAMCS Physician Workflow Mail Survey

How many imputations are sufficient in multiple imputation? Answers from different researchers range from as few as 2 to 3 to as many as hundreds. Perhaps no single number of imputations fits all situations. In this study, η, the minimally sufficient number of imputations, was determined from the relationship between m, the number of imputations, and ω, the standard error of imputation variances, using the 2012 National Ambulatory Medical Care Survey (NAMCS) Physician Workflow mail survey. Five variables with differing value ranges, variances, and percentages of missing data were tested. For all variables tested, ω decreased as m increased. The value of m above which the cost of a further increase in m would outweigh the benefit of reducing ω was taken as η. This method has the potential to be used by anyone to determine an η that fits his or her own data.
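The relationship between m and ω can be illustrated with a minimal Python sketch. It is not the authors' procedure: the abstract does not give the exact formula for ω, so the sketch assumes ω(m) is the sample standard deviation of the first m imputation variances divided by √m. The imputation step is a simple random draw from observed values, used here only as a stand-in. The function names, the synthetic data, and the threshold rule for picking η are all hypothetical.

```python
import random
import statistics


def impute_once(values, rng):
    """Fill each missing entry (None) with a random draw from the observed values."""
    observed = [v for v in values if v is not None]
    return [v if v is not None else rng.choice(observed) for v in values]


def imputation_variances(values, m, rng):
    """Estimated variance of the sample mean on each of m completed data sets."""
    n = len(values)
    return [statistics.variance(impute_once(values, rng)) / n for _ in range(m)]


def omega_curve(values, max_m, rng):
    """Assumed omega(m) = stdev of the first m imputation variances / sqrt(m)."""
    variances = imputation_variances(values, max_m, rng)
    return {
        m: statistics.stdev(variances[:m]) / m ** 0.5
        for m in range(2, max_m + 1)
    }


def minimally_sufficient_m(curve, threshold):
    """Hypothetical rule: smallest m with omega(k) <= threshold for all k >= m.

    Returns None if omega at the largest m is still above the threshold.
    """
    eta = None
    for m in sorted(curve, reverse=True):
        if curve[m] > threshold:
            break
        eta = m
    return eta


if __name__ == "__main__":
    rng = random.Random(2012)
    # Synthetic data: roughly 20% of entries are missing (None).
    data = [rng.gauss(50, 10) if rng.random() > 0.2 else None for _ in range(200)]
    curve = omega_curve(data, max_m=50, rng=rng)
    for m in (2, 5, 10, 20, 50):
        print(m, curve[m])
    print("eta:", minimally_sufficient_m(curve, threshold=0.5 * curve[5]))
```

The threshold here stands in for the cost-benefit judgment described in the abstract. In practice it would be chosen by inspecting how quickly ω flattens for the variable at hand.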

Conflicts of Interest

The authors declare no conflicts of interest.

Cite this paper

Pan, Q., Wei, R., Shimizu, I. and Jamoom, E. (2014) Determining Sufficient Number of Imputations Using Variance of Imputation Variances: Data from 2012 NAMCS Physician Workflow Mail Survey. Applied Mathematics, 5, 3421-3430. doi: 10.4236/am.2014.521319.
