Testing Rating Scale Unidimensionality Using the Principal Component Analysis (PCA)/t-Test Protocol with the Rasch Model: The Primacy of Theory over Statistics

DOI: 10.4236/ojs.2014.46044


Psychometric theory requires unidimensionality, i.e., scale items should represent a common latent variable. One advocated approach to testing unidimensionality within the Rasch model is to identify two item sets from a Principal Component Analysis (PCA) of residuals, estimate separate person measures from each item set, compare the two estimates person by person using t-tests, and determine the proportion of cases that differ significantly at the 0.05 level. If ≤5% of the tests are significant, or if the lower bound of a binomial 95% confidence interval (CI) for the observed proportion falls at or below 5%, it is suggested that strict unidimensionality can be inferred; otherwise the scale is considered multidimensional. Given its proposed significance and potential implications, this procedure needs detailed scrutiny. This paper explores how sample size and the method of estimating the binomial 95% CI affect the conclusions drawn under the recommended conventions. Normal approximation, “exact”, Wilson, Agresti-Coull, and Jeffreys binomial CIs were calculated for observed proportions of 0.06, 0.08 and 0.10 and sample sizes from n = 100 to n = 2500. The lower 95% CI bounds were inspected to see whether the interval covered the 5% threshold. Depending on sample size, every type of binomial 95% CI both included and excluded 5% for all three investigated proportions, with one exception: at an observed proportion of 10%, the Wilson, Agresti-Coull, and Jeffreys CIs excluded 5% at every sample size. The normal approximation CI was the most sensitive to sample size. These data illustrate that the PCA/t-test protocol should be used and interpreted like any other hypothesis testing procedure, and that its outcome depends on sample size as well as on the binomial CI estimation procedure.
The PCA/t-test protocol should not be viewed as a “definite” test of unidimensionality, and it does not replace an integrated quantitative/qualitative interpretation based on an explicit variable definition in view of the perspective, context and purpose of measurement.
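The dependence on sample size and CI method described above can be illustrated with a minimal sketch. It is not the author's code: it covers only the three closed-form intervals (normal approximation, Wilson, Agresti-Coull), because the “exact” and Jeffreys intervals need beta-distribution quantiles that the Python standard library does not provide. The function names, the chosen sample sizes, and the rounding of the number of significant tests to `round(p * n)` are all assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist

# Two-sided 95% critical value of the standard normal (about 1.96).
Z = NormalDist().inv_cdf(0.975)


def wald_lower(x, n):
    """Lower bound of the normal approximation (Wald) 95% CI for x/n."""
    p = x / n
    return p - Z * sqrt(p * (1 - p) / n)


def wilson_lower(x, n):
    """Lower bound of the Wilson score 95% CI for x/n."""
    p = x / n
    denom = 1 + Z**2 / n
    centre = (p + Z**2 / (2 * n)) / denom
    half = Z * sqrt(p * (1 - p) / n + Z**2 / (4 * n**2)) / denom
    return centre - half


def agresti_coull_lower(x, n):
    """Lower bound of the Agresti-Coull 95% CI for x/n."""
    n_adj = n + Z**2
    p_adj = (x + Z**2 / 2) / n_adj
    return p_adj - Z * sqrt(p_adj * (1 - p_adj) / n_adj)


METHODS = {
    "normal": wald_lower,
    "wilson": wilson_lower,
    "agresti-coull": agresti_coull_lower,
}

THRESHOLD = 0.05  # the 5% criterion used by the PCA/t-test protocol

# Illustrative grid (assumed): three observed proportions, a few sample sizes.
for p_obs in (0.06, 0.08, 0.10):
    for n in (100, 500, 1000, 2500):
        x = round(p_obs * n)  # hypothetical count of significant t-tests
        for name, lower in METHODS.items():
            bound = lower(x, n)
            verdict = "includes 5%" if bound <= THRESHOLD else "excludes 5%"
            print(f"p={p_obs:.2f} n={n:4d} {name:14s} lower={bound:.4f} {verdict}")
```

With the same observed proportion, the verdict can change with the sample size and with the interval method, which is the point the abstract makes about treating the protocol as an ordinary hypothesis test.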

Share and Cite:

Hagell, P. (2014) Testing Rating Scale Unidimensionality Using the Principal Component Analysis (PCA)/t-Test Protocol with the Rasch Model: The Primacy of Theory over Statistics. Open Journal of Statistics, 4, 456-465. doi: 10.4236/ojs.2014.46044.

Conflicts of Interest

The authors declare no conflicts of interest.




Copyright © 2020 by authors and Scientific Research Publishing Inc.


This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.