Applied Psychometrics: Estimator Considerations in Commonly Encountered Conditions in CFA, SEM, and EFA Practice

Abstract

The goal of this work was: 1) to present the assumptions for use of the estimators used in CFA/SEM and EFA, together with their advantages and disadvantages; and 2) to highlight that variables treated as either continuous or ordinal categorical during the estimation process should be treated consistently for the rest of the study analyses (keeping a consistent level of measurement). Two estimator groups exist: 1) Maximum Likelihood and 2) Least Squares. Robust alternatives exist in both groups. Desirable estimator properties include consistency, unbiasedness, efficiency, scale freeness, and scale invariance. Scholars propose selecting an estimator by considering: 1) measurement level; 2) non-normality; and 3) model type. Not considering these could affect parameter significance, chi-square tests, and model fit. A plausibly under-highlighted issue is the impact of the level of measurement implied during the estimation process on the level of measurement assumed for the same variables across the study. For example, when an ordinal categorical level is assumed by using categorical estimators, the same variables should be treated as ordinal categorical for the rest of the study by: 1) using ordinal reliability; 2) omitting means; and 3) using tetrachoric/polychoric correlations for the nomological network. Therefore, the selection of an estimator impacts all the analytic strategies of the study.

Share and Cite:

Kyriazos, T. and Poga-Kyriazou, M. (2023) Applied Psychometrics: Estimator Considerations in Commonly Encountered Conditions in CFA, SEM, and EFA Practice. Psychology, 14, 799-828. doi: 10.4236/psych.2023.145043.

1. Introduction

Model estimation is an essential step in Exploratory Factor Analysis (EFA), Confirmatory Factor Analysis (CFA), and Structural Equation Modeling (SEM). It pertains to every research endeavor, whether simple or complicated, and affects research results.

EFA has common elements with CFA/SEM, and estimators are one of them (Brown, 2015). Whittaker & Schumacker (2022) explain that an estimator is a discrepancy function used to estimate the unknown model parameters. Estimation minimizes the difference between the sample covariance matrix (S) and the model-implied covariance matrix (Σ). In other words, the different estimators are different weightings of the discrepancies between corresponding parts of the observed and implied covariance matrices. They have various advantages and disadvantages depending on the dataset (Loehlin & Beaujean, 2017). In both EFA and CFA/SEM, the researcher must select the most suitable estimator for the dataset analyzed (Whittaker & Schumacker, 2022). Moreover, in all three techniques, model quality and overall fit are influenced by the effectiveness of the estimator used.

An estimator is effective when it is asymptotically consistent, unbiased, and efficient (Bollen, 1989; Lei & Wu, 2012; Gana & Broc, 2019). Specifically: 1) a consistent estimator yields estimates that approach the true value of the population parameters as sample size increases; 2) an unbiased estimator has an expected value that, on average in large samples, neither overestimates nor underestimates the corresponding population parameters; and 3) an efficient estimator shows minimal variance in large samples (Lei & Wu, 2012; Wang & Wang, 2020). Several others (e.g., Raykov & Marcoulides, 2006) also suggest that an estimator should at least be consistent. Scale invariance (that is, the data scale will not affect the fit function) and scale freeness (that is, standardized parameter estimates will not change after any change in the variable scale) are equally desirable estimator properties (Kline, 2016). The Maximum Likelihood estimator (ML) became popular in Factor Analysis (e.g., Jöreskog, 2007; Brown, 2015; Stalikas & Kyriazos, 2019) and in techniques like bifactor analysis (e.g., Stalikas et al., 2018). However, ML may not always be the best choice (Schumacker & Lomax, 2016), given that multivariate normality rarely holds in psychology studies (e.g., Jöreskog, 2007; Beaujean, 2014; Byrne, 2012; Kline, 2016; Lei & Wu, 2012; Micceri, 1989; Watkins, 2021, among many others).
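The three large-sample properties listed in this paragraph can be stated a little more formally. The following is only a sketch under assumed notation that does not appear in the original text: θ denotes a population parameter, θ̂_N its estimate from a sample of size N, and θ̃_N any competing estimator.

```latex
% Assumed notation (not from the source): \theta is a population parameter,
% \hat{\theta}_N its estimate from a sample of size N, and \tilde{\theta}_N
% any competing consistent, asymptotically normal estimator.
\begin{align*}
\text{Consistency:} \quad
  & \lim_{N \to \infty} P\left( \left| \hat{\theta}_N - \theta \right| > \varepsilon \right) = 0
    \quad \text{for every } \varepsilon > 0 \\
\text{Asymptotic unbiasedness:} \quad
  & \lim_{N \to \infty} E\left( \hat{\theta}_N \right) = \theta \\
\text{Asymptotic efficiency:} \quad
  & \operatorname{AVar}\left( \hat{\theta}_N \right) \le \operatorname{AVar}\left( \tilde{\theta}_N \right)
\end{align*}
```

Read this way, consistency concerns where the estimates concentrate as N grows, unbiasedness concerns their average, and efficiency concerns their spread.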

The first goal of this work is to discuss concerns when choosing an estimator in CFA/SEM and EFA (Part A) and the impact of this choice on the overall study design and the rest of the analyses performed (Part B). Part A covers: 1) Introduction; 2) Preliminary Steps: Considerations About the Data Properties; 3) Description of the EFA and CFA/SEM Estimation Process; 4) EFA Estimator Classification and the Assumptions for Their Use; and 5) CFA/SEM Estimator Classification and the Assumptions for Their Use. Part B addresses the second goal of the study, i.e., to highlight: 6) How Choosing an Estimator Can Affect All the Study Analyses; it closes with 7) Recap and Conclusions. Part B highlights that when we choose to treat variables as either continuous or ordinal categorical during model estimation, the same variables should receive identical (consistent) treatment regarding their level of measurement for the rest of the study analyses.

Part A. Concerns when choosing an estimator in CFA/SEM, and EFA

Model estimation is a crucial step in both Confirmatory Factor Analysis (CFA) and Exploratory Factor Analysis (EFA). A well-estimated model provides evidence that the underlying factor structure is a good representation of the data, which in turn supports the validity of the constructs being measured. Several studies have highlighted the importance of model estimation in CFA and EFA. Marsh, Hau, and Wen (2004) demonstrated that proper model estimation techniques, such as maximum likelihood estimation, can improve the fit of the model to the data and lead to more accurate conclusions about the factor structure. Similarly, Thompson and Daniel (1996) emphasized the importance of selecting appropriate model estimation methods in EFA, noting that different estimation methods can produce different factor structures and affect the interpretation of the results.

2. Preliminary Steps: Considerations of the Data Properties

Generally, Schumacker and Lomax (2016) propose selecting an estimator based on the model type, the measurement level of the variables, and non-normality (Muthén & Muthén, 2006: pp. 423-426; Schumacker & Lomax, 2016: p. 245).

Moreover, model estimation often fails because of “messy data” (Schumacker & Lomax, 2016). Messy data involve missingness, outliers, and/or multicollinearity (Schumacker & Lomax, 2016; see also Tabachnick & Fidell, 2013).

Schumacker and Beyerlein (2000) explain that CFA/SEM (and EFA as well) are based on correlations; therefore, all restrictions of the correlation coefficient and of the general linear model are also pertinent, and exaggerated, in CFA/SEM and EFA. Despite the availability of asymptotic distribution-free estimation methods and robust statistics, messy data cannot be overlooked (Schumacker & Lomax, 2016).

The impact of overlooking messy data may be estimation convergence failures, Heywood cases (variables with negative variance estimates), or non-positive definite matrices (e.g., the determinant of the matrix is zero), as Schumacker and Lomax (2016) comment (Figure 1). For a detailed description of convergence failure problems and possible solutions, please refer to Brown (2015) and Byrne (2012).

An alternative course of action is to use a robust estimator (i.e., MLM or MLR) regardless of the distributional properties of the data (e.g., see Wang & Wang, 2020; Byrne, 2012 on this). Essentially, this strategy would be effective in studies applying only CFA/SEM or EFA without any use of traditional statistics, which is rather uncommon. In applied research, traditional statistical approaches and SEM are usually used in tandem. The most common example is the validation study carried out during the development of a new measure (Brown, 2015) or during its cultural adaptation. These studies use EFA, CFA (or both) together with traditional statistical approaches to compare means, or correlation analysis to test the nomological network of the validated construct. See a summary of all issues to be addressed before the estimation process in Table 1.

Figure 1. The impact of messy data on the estimation process.

The presence of many estimators for both EFA and CFA/SEM could put the researcher in the position of having two watches but never knowing the time (Loehlin & Beaujean, 2017). Therefore, researchers have to familiarize themselves with the estimation process and with the advantages and disadvantages of each estimator to make more informed selections (Loehlin & Beaujean, 2017). This is the focus of the next sections.

3. Brief Description of the EFA, and CFA/SEM Estimation Process

Factor analysis (FA) seeks to find out how many factors (latent variables) optimally represent the variance-covariance among items (observed variables or indicators). Two FA techniques exist, both based on the common factor model: Exploratory Factor Analysis (EFA) and Confirmatory Factor Analysis (CFA). Both intend to represent the relationships of an item set through a smaller factor set. They differ in the a priori specifications and restrictions they assume (Brown, 2015). EFA is a data-driven approach requiring no a priori assumptions. CFA requires a strong empirical or conceptual foundation to specify and evaluate the model (Brown, 2015). In applied research, FA psychometrically evaluates measurement instruments during their construct validation or development (Brown, 2015). Statistically, FA is correlation-based, seeking to replicate the intercorrelations among the variables (Schumacker & Beyerlein, 2000).

When the researcher has uncertain or incomplete assumptions about an underlying model structure, EFA is recommended. When the researcher can make clear assumptions about the number and the inter-relationships of an underlying structure (factor correlations, number of factors, etc.), CFA is recommended (e.g., Fabrigar & Wegener, 2012). For SEM, Whittaker and Schumacker (2022: p. 1) state that researchers use SEM to hypothesize and test theoretical models. Specifically, SEM reproduces the relationships among items (observed variables) and factors (latent variables), seeking to reproduce the theoretical models and providing hypothesis testing (Whittaker & Schumacker, 2022).

Table 1. Model Estimation Check-list: Suggestions for an effective selection of an estimator.

Source. This table was based on a similar table by Whittaker & Schumacker (2022: p. 357).

There are 4 steps for generating an EFA solution: 1) preparing the data; 2) factor extraction (that is, model estimation in the EFA context; Fabrigar & Wegener, 2012; Watkins, 2021) of the preliminary (orthogonal) factors; 3) deciding how many factors to retain; and 4) rotation of the final solution (Kim & Mueller, 1978; Brown, 2015). Factor extraction is the mathematical process of obtaining factors from a correlation matrix (Watkins, 2021). In EFA, different factor extraction methods exist (i.e., estimators; Fabrigar & Wegener, 2012) because numerous fit (discrepancy) functions exist. The goal of the extraction process in EFA is to find the minimum number of common factors that reproduce the correlations among observed variables acceptably. The criterion for stopping the extraction of common factors is the point at which the difference between the reproduced and observed correlations can be attributed to sampling variability (Kim & Mueller, 1978).

As noted earlier, EFA shares several elements with CFA, and estimators are one of them (Brown, 2015). However, to reach an EFA solution, once the researcher decides that the data are factorable, factor extraction follows (model estimation in the EFA context; Fabrigar & Wegener, 2012), before the researcher decides how many factors to retain (Kim & Mueller, 1978; Brown, 2015). In contrast, in CFA/SEM the researcher is the one who specifies all model parameters.

The steps a researcher takes in conducting a CFA or SEM involve model specification, identification, estimation, testing, and model modification. Parameters to be estimated in a CFA model include factor loadings, factor variances and covariances (when there is more than one factor), and measurement error variances (e.g., Lei & Wu, 2012). In a SEM model, structural regression paths are additionally estimated (e.g., Byrne, 2012). In model estimation, the generated parameter values seek to minimize the discrepancy between the sample covariance matrix S and the population model-implied covariance matrix Σ through a discrepancy function (Byrne, 2012), also called the fit function (Raykov & Marcoulides, 2006; Byrne, 2012, among many others).

Lei & Wu (2012) describe the estimation process as follows. Generally, EFA/CFA/SEM estimation procedures are iterative, and they use start values to begin iteration. The start values replace their corresponding unknown parameters, and an intermediate model-implied variance-covariance matrix is generated. The process iterates until the differences in parameter estimates from one iteration to the next meet the convergence criteria (the stopping rules of the iterative process). Then the parameter estimates of the last iteration replace the unknown parameters of the model (Lei & Wu, 2012). Depending on how the distance between the matrices is computed during this process, different fit functions result in different parameter estimation methods (Raykov & Marcoulides, 2006).
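The iterative logic described above (start values, an intermediate model-implied matrix, and a stopping rule) can be sketched in a few lines. This is a minimal illustration, not the algorithm of any particular software: it fits a one-factor model to three indicators with an unweighted least squares discrepancy and plain gradient descent. The correlation matrix, start values, step size, and tolerance are all hypothetical choices made for the example.

```python
# Illustrative sketch only: a one-factor model fitted by unweighted least
# squares using plain gradient descent. The correlations, start values, step
# size, and tolerance are hypothetical and chosen for demonstration.

def uls_discrepancy(r, loadings):
    """Sum of squared residuals over the off-diagonal correlations."""
    p = len(loadings)
    return sum((r[i][j] - loadings[i] * loadings[j]) ** 2
               for i in range(p) for j in range(i + 1, p))

def fit_one_factor(r, start=0.5, step=0.1, tol=1e-10, max_iter=10_000):
    """Iterate from start values until parameter changes meet the tolerance."""
    p = len(r)
    loadings = [start] * p  # start values replace the unknown parameters
    for iteration in range(1, max_iter + 1):
        # Gradient of the discrepancy with respect to each loading, using the
        # current model-implied correlations loadings[i] * loadings[j].
        gradient = [
            -2.0 * sum((r[i][j] - loadings[i] * loadings[j]) * loadings[j]
                       for j in range(p) if j != i)
            for i in range(p)
        ]
        updated = [l - step * g for l, g in zip(loadings, gradient)]
        largest_change = max(abs(u - l) for u, l in zip(updated, loadings))
        loadings = updated
        # Convergence criterion: stop when estimates barely change.
        if largest_change < tol:
            return loadings, iteration, True
    return loadings, max_iter, False

# Hypothetical correlations generated by loadings of 0.8, 0.7, and 0.6.
R = [[1.00, 0.56, 0.48],
     [0.56, 1.00, 0.42],
     [0.48, 0.42, 1.00]]

estimates, n_iter, converged = fit_one_factor(R)
print(converged, n_iter, [round(l, 3) for l in estimates])
```

Real estimation routines use more efficient optimizers and richer stopping rules, but the roles of the start values and of the convergence criterion are the same.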

4. EFA Estimator Classification and the Assumptions for Their Use

Two major estimator classifications exist (Kim & Mueller, 1978): 1) Maximum Likelihood (ML) and its variations, e.g., canonical factoring (Rao, 1955), which entails communality estimates, yielding ML factors (Gorsuch, 2015); and 2) Least Squares (LS) and its variations, e.g., principal axis factoring, WLS, ULS, GLS, or minimum residual analysis (MINRES). Note that both a non-iterated principal axis factoring (PA) and an iterated principal axis factoring version with iterated communalities (IPA; Fabrigar & Wegener, 2012) are available. Note also that some extraction methods go by different names. For instance, ULS, OLS, and MINRES refer to the same extraction method (Flora, 2018; Watkins, 2021). Moreover, an iterative principal axis (IPA) converges to an OLS solution (Briggs & MacCallum, 2003; MacCallum, 2009 cited in Watkins, 2021).

Three additional EFA extraction methods exist: Alpha factoring, Image factoring, and Principal Component Analysis (PCA). Alpha factoring (Kaiser & Caffrey, 1965) maximizes the alpha, or Kuder-Richardson, coefficient of the factors (Gorsuch, 2015). It is extracted like PA (Loehlin & Beaujean, 2017). Image Factor Analysis uses the image of each variable, including only common factor variance. Image factoring can also be extracted by ML (Gorsuch, 2015). Finally, whether PCA is an extraction method (Kim & Mueller, 1978; Mulaik, 2009) or a classification method unrelated to FA (Brown, 2015; Fabrigar & Wegener, 2012) is debatable (Costello & Osborne, 2005).

Traditionally, there was a tendency to prefer ML because it offered additional model fit statistics (Fabrigar, Wegener, MacCallum, & Strahan, 1999; Fabrigar & Wegener, 2012; Matsunaga, 2010). However, this advantage no longer holds, given that model fit statistics are currently available with all EFA extraction methods in the R environment (R Development Core Team, 2022), e.g., when using the “psych” package (Revelle, 2022); see Table 2.

Simulation studies (summarized by Watkins, 2021) compared the ML and LS extraction classes regarding factor recovery with different sample sizes and factor strengths (Briggs & MacCallum, 2003; MacCallum, Browne, & Cai, 2007). LS outperformed ML when the factors were weak (≤16% explained variance), when there was over-factoring (Brown, 2015), or when the sample was small (N = 100). Similarly, Briggs and MacCallum (2003) proposed that using OLS in EFA increases the likelihood of extracting all major common factors (p. 54). In contrast, ML performs well with larger samples, normal data, and strong factors (MacCallum, Browne, & Cai, 2007; Watson, 2017). Generally, for continuous data, the most popular methods are ML and PA (Brown, 2015). When multivariate normality is heavily violated, ML can give distorted and biased model fit and significance tests. Moreover, ML is more prone to improper solutions (Brown, 2015). Conversely, PA is free of distributional assumptions and is less prone to improper solutions (Brown, 2015; Fabrigar et al., 1999). Note that the R environment can compute goodness-of-fit statistics with all estimators (not only ML), including PA. Thus, PA is an option when the normality assumption is violated or ML generates an improper solution, provided the improper solution does not point to underlying problems such as weak factors or messy data (Brown, 2015). Osborne and Banjanovic (2016), among many other researchers (e.g., Bandalos & Gerstner, 2016; Costello & Osborne, 2005, cited by Watkins, 2021), agree that the EFA literature proposes ML when multivariate normality holds and PA or ULS in the absence of normality (Watkins, 2021). Nevertheless, both extractions (as a rule) tend to yield similar results (Tabachnick & Fidell, 2012). Generally, LS methods are popular because they make no distributional assumptions and are sensitive to weak factors, as Watkins (2021) argues (quoting also Carroll, 1993; Russell, 2002; Widaman, 2012, among others). Weak factors tend to have loadings of, e.g., <0.32 (Costello & Osborne, 2005; Tabachnick & Fidell, 2012).

Table 2. Estimators when carrying out EFA or CFA/SEM in the R environment.

Note. CFA/SEM estimators are partly based on a similar table by Beaujean (2014: p. 157). a MINRES is slightly different from OLS, ULS, and PA. The derivative of MINRES was changed, making it practically identical to PA and ULS. However, PA tends to generate slightly smaller residuals and slightly larger RMSEA values than MINRES.

The rotation methods can also be classified into two types (Brown, 2015; Kim & Mueller, 1978): orthogonal and oblique. Oblique rotations can be further subdivided into those based on the direct simplification of loadings in the factor pattern matrix and those based on the indirect simplification of the loadings on reference axes. Within each type, many variations exist (Kim & Mueller, 1978). As a rule, oblique rotation calculates factor associations more accurately in psychology, where all constructs tend to affect one another (Costello & Osborne, 2005). That is why there is a general tendency to consider oblique rotations more pertinent than orthogonal ones (Fabrigar & Wegener, 2012; Brown, 2015). In fact, some theoretical benefits of orthogonal rotations are now regarded as misconceptions, e.g., that they offer a simpler structure than oblique rotations (Fabrigar & Wegener, 2012). Oblique rotations make more accurate assumptions, providing solutions with comparable or superior simple structures (Fabrigar & Wegener, 2012). If the factors are truly uncorrelated, the oblique rotation will produce results identical to those of an orthogonal rotation, and in CFA it will reproduce the magnitude of the factor relationships more accurately (Brown, 2015).

5. CFA/SEM Estimator Classification and the Assumptions for Their Use

The two basic estimator classes already described in the EFA section are pertinent in CFA/SEM too: 1) Maximum Likelihood (ML) and 2) Least Squares (LS) (Beaujean, 2014; Lei & Wu, 2012). Additionally, following a different classification proposed by Kline (2016), estimators can be either simultaneous or single-equation. Simultaneous (full-information) estimators estimate all model parameters at the same time, requiring a fully identified model. In contrast, single-equation estimators (partial-information or limited-information) compute the equation for a single endogenous variable at a time (Kline, 2016). ML is the most popular full-information estimator. The LS class has both single-equation and full-information estimators (see Kline, 2016; Lei & Wu, 2012). Robust estimators exist in both classes. Robust estimators are less influenced by violations of normality than ML (Beaujean, 2014).

Both Loehlin and Beaujean (2017: p. 54) and Raykov and Marcoulides (2006: p. 28) agree that there are 4 fairly standard estimators in CFA/SEM: Unweighted Least Squares (ULS), Generalized Least Squares (GLS), ML, and Browne’s (1984) Asymptotically Distribution-free (ADF) estimator, also referred to as Weighted Least Squares (WLS). All four are applied by minimizing a corresponding fit function (see how to specify them in the R environment in Appendix A and their formulas in Appendix B). The estimators described next are widely used ones and do not form an exhaustive list.
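For orientation, the fit functions of these four estimators are commonly written as follows. This is a sketch of the usual textbook forms under assumed notation, and it may differ in symbols from the formulas the authors give in Appendix B: S is the sample covariance matrix, Σ(θ) the model-implied covariance matrix, p the number of observed variables, s and σ(θ) the vectors of non-duplicated elements of S and Σ(θ), and W the weight matrix.

```latex
% Commonly cited forms of the four fit functions; the notation is assumed.
% S: sample covariance matrix; \Sigma(\theta): model-implied covariance matrix;
% p: number of observed variables; s, \sigma(\theta): vectors of the
% non-duplicated elements of S and \Sigma(\theta); W: weight matrix.
\begin{align*}
F_{ULS} &= \tfrac{1}{2}\,\operatorname{tr}\!\left[ \left( S - \Sigma(\theta) \right)^{2} \right] \\
F_{GLS} &= \tfrac{1}{2}\,\operatorname{tr}\!\left[ \left( \left( S - \Sigma(\theta) \right) S^{-1} \right)^{2} \right] \\
F_{ML}  &= \ln\left| \Sigma(\theta) \right| + \operatorname{tr}\!\left( S\,\Sigma(\theta)^{-1} \right) - \ln\left| S \right| - p \\
F_{WLS} &= \left( s - \sigma(\theta) \right)^{\top} W^{-1} \left( s - \sigma(\theta) \right)
\end{align*}
```

All four equal zero when Σ(θ) reproduces S exactly; they differ in how the residuals S − Σ(θ) are weighted.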

5.1. Maximum Likelihood Estimator (ML)

ML applies to the whole range of SEM models, from non-recursive path models to models with substantive latent variables (Kline, 2016). The assumptions for using ML are (Bollen, 1989; Brown, 2015; Kline, 2016): 1) a large sample; 2) continuous observed variables; 3) observed variables with a multivariate normal distribution; 4) complete data; and 5) a correctly specified model, to prevent specification error from spreading across the model (as with all full-information methods). Regarding the sample size assumption, small samples are generally tricky with ML because the asymptotic properties of the estimates and fit tests may not hold (Lee & Song, 2004). Brown (2015) notes that when we assume that ML (or any other estimator) is appropriate for continuous data, we mean interval-type data. Note also that assumptions (2) and (3) must hold for both the observed variables and the latent variables, but not for any other added observed variables, e.g., covariates (Brown, 2015). Under the above conditions, ML is asymptotically consistent, unbiased, and efficient (Bollen, 1989; Lei & Wu, 2012; Wang & Wang, 2020). Additionally, ML is scale-free (standardized estimates will not change after any change in their scale), scale-invariant (the data scale will not affect the fit function), and asymptotically normally distributed, that is, the parameter estimates become normally distributed as sample size increases (Wang & Wang, 2020). Finally, the minimized ML fitting function multiplied by (n − 1) approximately follows a χ² distribution in large samples, so the model χ² is used to test the overall model fit (Wang & Wang, 2020).
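The relation between the ML fitting function and the model χ² can be illustrated numerically. The sketch below is only a toy example: the 2 × 2 sample matrix, the model-implied matrix, and the sample size are hypothetical, and the helper names are invented for this illustration. It evaluates the commonly used form F = ln|Σ| + tr(SΣ⁻¹) − ln|S| − p and multiplies it by (n − 1).

```python
import math

# Illustrative sketch only: the 2 x 2 matrices and the sample size below are
# hypothetical, and the helper names are invented for this example.

def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def inv2(m):
    """Inverse of a 2 x 2 matrix."""
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d],
            [-m[1][0] / d, m[0][0] / d]]

def trace_of_product(a, b):
    """Trace of the matrix product a @ b for 2 x 2 matrices."""
    return sum(a[i][k] * b[k][i] for i in range(2) for k in range(2))

def ml_fit_function(s, sigma):
    """F = ln|Sigma| + tr(S Sigma^-1) - ln|S| - p, with p = 2 here."""
    p = 2
    return (math.log(det2(sigma)) + trace_of_product(s, inv2(sigma))
            - math.log(det2(s)) - p)

def model_chi_square(s, sigma, n):
    """Model test statistic: (n - 1) times the fit function value."""
    return (n - 1) * ml_fit_function(s, sigma)

S = [[1.0, 0.5], [0.5, 1.0]]      # hypothetical sample matrix
SIGMA = [[1.0, 0.4], [0.4, 1.0]]  # hypothetical model-implied matrix

print(round(ml_fit_function(S, SIGMA), 4))
print(round(model_chi_square(S, SIGMA, n=200), 2))
```

The fit function is zero when the implied matrix equals S, and for a fixed discrepancy the statistic grows linearly with sample size, which is one reason the model χ² is sensitive to large samples.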

The principle behind ML is to find the estimates that maximize the likelihood that the observed covariances (the data) were drawn from the population. The final set of parameter estimates minimizes the discrepancy between the respective elements of the two matrices. ML, being a full-information estimator, offers standard errors for statistical significance testing and confidence intervals of factor loadings and factor correlations, and these may be some of the reasons that made ML popular (Brown, 2015).

Full Information Maximum Likelihood (FIML) is also an ML estimation technique, used for handling data missing completely at random (MCAR) or missing at random (MAR). FIML maximizes a modified log-likelihood function after raw data input. This approach is regarded as a state-of-the-art treatment for handling missingness (Beaujean, 2014; Lei & Wu, 2012; Wang & Wang, 2020). Note that MAR is a plausible assumption, permitting missingness both in the observed outcome and in the covariates (Little & Rubin, 1987 cited in Wang & Wang, 2020).

ML is also applicable with variables slightly deviating from normality (e.g., Bollen, 1989; Raykov & Widaman, 1995, among many others), especially when the primary focus is on parameter estimates (Raykov & Marcoulides, 2006), but the extent of this applicability depends on the data used and the model specified. However, the use of ML with extremely non-normal data could lead to (Brown, 2015): 1) a spuriously inflated χ² and model over-rejection (e.g., Bollen, 1989); 2) slightly underestimated TLI and CFI values, whereas RMSEA values could be overestimated (Byrne, 2012); and 3) moderately to extremely underestimated standard errors. Brown explains that biased SEs increase the risk of Type I error, that is, spuriously indicating that a parameter significantly differs from zero when this would not hold in the population (West, Finch, & Curran, 1995). Crucially, the smaller the sample size, the bigger the impact of estimator misuse may be. Additionally, there is an increased risk of non-convergence or improper solutions (see Brown, 2015 for more details). On the other hand, parameter estimates (like factor loadings) may still be accurate, unless there are extreme normality violations with floor effects, which would suggest that the model linearity assumption may not hold (Brown, 2015; Byrne, 2012). ML is also considered seriously affected by extremely kurtotic data (Brown, 2015). An ML alternative available in the “lavaan” package (Rosseel, 2012) is Pairwise Maximum Likelihood (PML).

Other ML alternatives for estimating non-normal, continuous variables are discussed in later sections.

5.2. Least Squares Estimators (LS)

Both OLS and 2SLS are noniterative, limited-information estimators, and they do not require setting start values (Kline, 2016). However, there are also full-information, iterative least squares estimators, which require start values (see Lei & Wu, 2012; Kline, 2016).

Ordinary Least Squares (OLS). Within the SEM framework, OLS can only estimate recursive path models (Kline, 2016), which means its use is limited (Lei & Wu, 2012).

Two-Stage Least Squares (2SLS). 2SLS is suitable for non-recursive models. It is essentially identical to OLS applied in 2 steps, i.e., not all parameters are estimated simultaneously (Kline, 2016). 2SLS does not make distributional assumptions (Lei & Wu, 2012), and it may be less susceptible to spreading model misspecification than full-information estimators (Bollen, Kirby, Curran, Paxton, & Chen, 2007; Kline, 2016). According to Bollen, Kirby, Curran, Paxton, and Chen (2007), 2SLS is consistent, asymptotically unbiased, normally distributed, and efficient among limited-information estimators (Lei & Wu, 2012). A 2SLS variation proposed by Jöreskog (1983) can generate start values for latent-variable models (Kline, 2016; Lei & Wu, 2012). 3SLS is a variation of 2SLS completed in 3 stages, after controlling for correlated errors (Kline, 2016). This means, as Kline explains, that 3SLS is essentially a simultaneous estimator (see Kline, 2016; Bollen, 2012).

Generalized Least Squares (GLS). GLS is an alternative to ML that assumes normal, continuous data or mild non-normality with non-extreme kurtosis (Browne, 1974; Brown, 2015). Brown (2015) comments that GLS is less computationally challenging than ML, yielding goodness of fit comparable to ML, particularly with large samples. Given the above assumptions, GLS estimates are considered consistent, unbiased, asymptotically normally distributed (i.e., the distribution of the parameter estimates approximates the normal distribution as sample size increases; Wang & Wang, 2020), and efficient (Lei & Wu, 2012). Unlike ML, GLS uses a weight matrix (W) for the residuals. In GLS, W is typically the inverse of S (Brown, 2015). An alternative for continuous, normally distributed data available in the lavaan package (Rosseel, 2012) is Distributionally-weighted Least Squares (DLS). For a brief introduction to weight matrices and matrix algebra, please refer to Whittaker and Schumacker (2022). For a more detailed description and the application of matrix algebra in the R environment, refer to Revelle (2016) or Fieller (2016).

Alternatives for estimating non-normal, continuous variables (beyond the LS variations already presented) are discussed in the following section.

5.3. CFA/SEM Estimators without Distributional Assumptions

In the presence of non-normality, a common situation in psychology research (Beaujean, 2014, among others), possible solutions include (Wang & Wang, 2020): 1) transformations of non-normally distributed variables to better approximate multivariate normality; 2) removal of outliers; 3) bootstrapping to estimate variances of parameter estimates for significance tests (Bollen & Stine, 1993); 4) Bayesian estimators (Lee & Song, 2004); 5) asymptotically distribution-free estimators like ADF (Browne, 1984); 6) adjusting the ML χ² and standard errors using rescaling (Satorra & Bentler, 1988); 7) using robust estimators, e.g., MLR or MLM (Wang & Wang, 2020); and 8) item parceling (Brown, 2015).

5.3.1. Robust ML Variations: MLM, MLR

A robust method proposed by Satorra and Bentler (1988) is to adjust ML to account for deviations from normality. This approach uses ML to estimate the model together with robust standard errors and Satorra and Bentler’s (1988) rescaled model χ² (SB χ²) for model fit evaluation (e.g., Lei & Wu, 2012; Wang & Wang, 2020). This is a popular choice for non-normal, continuous data in large samples (Brown, 2015; Beaujean, 2014). Simulation studies showed that the SB χ² outperformed both ML and ADF under non-normality; however, it showed an over-rejection tendency in small (more realistic) samples (West, Finch, & Curran, 1995).
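The rescaling idea behind the SB χ² is commonly written as a division of the ML test statistic by a scaling correction factor. The expression below is a sketch in assumed notation rather than a formula taken from the original text: T_ML is the ML model χ², df the model degrees of freedom, Γ̂ an estimate of the asymptotic covariance matrix of the sample covariances (which carries the kurtosis information), and Û a residual weight matrix that depends on the fitted model.

```latex
% Assumed notation (not from the source): T_{ML} is the ML model chi-square,
% df the model degrees of freedom, \hat{\Gamma} the estimated asymptotic
% covariance matrix of the sample covariances, and \hat{U} a model-dependent
% residual weight matrix.
\begin{align*}
T_{SB} &= \frac{T_{ML}}{\hat{c}}, &
\hat{c} &= \frac{\operatorname{tr}\!\left( \hat{U}\,\hat{\Gamma} \right)}{df}
\end{align*}
```

Under multivariate normality the correction factor is close to 1, so the rescaled and the ordinary ML statistics roughly coincide; with heavy-tailed data the factor typically exceeds 1 and shrinks the statistic.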

Furthermore, Wang and Wang (2020) summarize the most popular robust ML estimators for non-normally distributed data: 1) MLM, generating ML parameter estimates with standard errors and the mean-adjusted χ² test statistic (the SB χ²); 2) MLMV, yielding ML parameter estimates with standard errors and a mean- and variance-adjusted χ² statistic for multilevel modeling with continuous data; and 3) MLR (Asparouhov & Muthén, 2005), an ML sandwich estimator with robust standard errors and a χ² statistic that is asymptotically equivalent to the Yuan and Bentler (2000) T2 test statistic. MLM and MLMV are not appropriate for data containing missing values. In contrast, missingness is allowed with MLR. Specifically, MLR allows missingness under an assumption less strict than MCAR but stricter than MAR. Otherwise, MLM and MLR provide parameter estimates identical to those of ML, with both χ² and standard errors adjusted for non-normality in large samples (Brown, 2015). MLR uses a pseudo-ML method (Skinner, 1989) and an adjusted χ² like the SB χ² to handle non-normality or complex sampling designs (Lei & Wu, 2012), because of its capability to handle data with non-independent cases, i.e., Multilevel CFA (Wang & Wang, 2020). MLR allows multilevel analyses with unbalanced groups and random path coefficients (Byrne, 2012; Kyriazos, 2019). However, it also requires large samples (Byrne, 2012). MLMV possibilities remain unexplored (Lei & Wu, 2012).

With non-normal data, the robust ML estimators give more precise test statistics and standard errors than the corresponding standard ML statistics (Beaujean, 2014).

5.3.2. Least Squares Variations: ADF (WLS), ULS

If the observed variables are extremely non-normal or categorical, estimators such as Browne’s (1984) asymptotic distribution-free or unweighted least squares (ULS) are considered more appropriate than ML.

Asymptotic Distribution-free (ADF) estimator (WLS). ADF (WLS) is a weighted least square estimator, closely related to the GLS (Brown, 2015) . WLS, is an option for models with non-normal, continuous, or categorical data when the sample size is adequately large (e.g., Wang & Wang, 2020; Lei & Wu, 2012; Thompson, 2004 ). WLS uses a different W from GLS, deriving from the estimates of the variances/covariances of each element of S, and from 4th moments based on multivariate kurtosis (Kaplan, 2000) . Therefore, the WLS fit function is weighted by variances/covariances and kurtosis to handle multivariate normality violations, unlike GLS, (Brown, 2015) . Brown (2015) explains that in absence of kurtosis, WLS and GLS will generate an identical minimum fit function value. The WLS W matrix signifies a consistent estimate of the asymptotic variance/co-variance matrix of the sample variance/covariance matrix S (Wang & Wang, 2020) . Additionally, Bentler and Yuan (1999) proposed an adjusted ADF χ2 which is able to handle small samples (Wang & Wang, 2020) , generating corrected estimates for standard error (Lei & Wu, 2012) . It is referred to as Yuan-Bentler corrected arbitrary distribution generalized least squares (AGLS) adjusted test statistic or Yuan-Bentler AGLS F-statistic (Bentler & Yuan, 1999) .

The WLS requirement on sample size is rather strict; if it is not met, WLS may generate large amounts of bias (Lei & Wu, 2012). Specifically, the sample used with WLS should exceed b + p (where b is the number of elements of S and p is the number of observed variables) to ensure a nonsingular W. In fact, some software will carry out a WLS estimation only with N > b (Brown, 2015). Moreover, small samples are prone to extremely skewed observed variables, making W non-invertible. Even in moderate to small samples, W may be susceptible to nonpositive definite matrix errors when floor or ceiling effects are present (Brown, 2015). In absolute numbers, sample sizes > 1000 cases would be required (Thompson, 2004; Lei & Wu, 2012). Of course, ML CFA would also require large samples, even in comparison to ML EFA for the same data (MacCallum, Browne & Sugawara, 1996; Thompson, 2004). In any case, the sample size would be less crucial with an average loading ≥ |0.80| (Thompson, 2004). Moreover, WLS is computationally demanding (Wang & Wang, 2020), although this may not be a primary concern with modern computers. Brown (2015) explains that W in WLS is created from the variances and covariances of each element of S (i.e., “covariances of the covariances”), and it can grow very large when the model contains many observed variables. The storage and inversion of large W matrices during the iterative estimation process may be rather demanding in computer resources. This burden is compounded by the strict WLS requirement of very large samples (Brown, 2015). Furthermore, simulation studies on estimators for non-normally distributed continuous data suggested that WLS performs worse than MLM/MLR (e.g., Chou & Bentler, 1995), as Brown (2015) comments. Additionally, WLS estimators require pairwise deletion instead of FIML to handle missingness, because they do not assume that the data are MAR. However, they are regarded as consistent under the MARX assumption (missing at random with respect to X); i.e., missingness is allowed to depend only on observed covariates. This assumption is less restrictive than MCAR but more restrictive than MAR (Wang & Wang, 2020).
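To make the growth of W concrete, the following base R sketch tabulates its size for a few model sizes. It assumes that b counts the unique (non-redundant) elements of S, i.e., p(p + 1)/2; the function name wls_size is hypothetical and used only for illustration.

```r
# Size of the WLS weight matrix W as a function of the number of observed variables p.
# Assumption: b is the number of unique variances/covariances in S, b = p(p + 1)/2.
wls_size <- function(p) {
  b <- p * (p + 1) / 2            # unique elements of S
  data.frame(
    p        = p,
    b        = b,                 # W is a b x b matrix
    W_cells  = b^2,               # all cells of W
    W_unique = b * (b + 1) / 2,   # unique cells, because W is symmetric
    min_N    = b + p              # N should exceed this value
  )
}

wls_size(c(9, 20, 40))
```

With 20 observed variables, for instance, b = 210, so W has 210 × 210 = 44,100 cells and the b + p rule already implies N > 230, well before the absolute recommendations cited above are considered.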

Considering the above restrictions, WLS remains a possibility when data are either continuous or categorical and not normally distributed, but MLM or MLR are considered wiser options since they outperform WLS in medium to small samples (Curran, West, & Finch, 1996; Hu, Bentler, & Kano, 1992; Brown, 2015). The same is true for the WLS performance with ordinal categorical variables (e.g., binary, ordered categorical), due to the oversensitivity of χ2 and negatively biased standard errors with more complex models (Muthén & Kaplan, 1992; cited in Brown, 2015). Thus, as with non-normal continuous data, WLS may not be a wise choice with categorical data, in particular with small to moderate samples (Flora & Curran, 2004, cited in Brown, 2015). More elaborate options for ordinal categorical data are described later.

Unweighted least squares (ULS). Different weight matrices are used for the different LS estimators. When the identity matrix (I) is used as the weight matrix, the result is the ULS estimator. ULS is yet another option for model estimation with non-normal data. It is a consistent estimator (Bollen, 1989). Nevertheless, ULS assumes all observed variables to be measured on the same scale (i.e., it is not scale-invariant). So, ULS remains a possibility when data are not normally distributed, but MLM or MLR are again considered more efficient (Kline, 2016; Lei & Wu, 2012).
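The scale dependence of ULS can be illustrated with a small base R sketch. It uses the standard textbook forms of the ULS and ML discrepancy functions; the matrices S and Sigma below are hypothetical values chosen for illustration, not results from any dataset.

```r
# Standard textbook discrepancy functions (S = sample matrix, Sigma = model-implied matrix)
f_uls <- function(S, Sigma) 0.5 * sum(diag((S - Sigma) %*% (S - Sigma)))
f_ml  <- function(S, Sigma) {
  p <- nrow(S)
  log(det(Sigma)) + sum(diag(S %*% solve(Sigma))) - log(det(S)) - p
}

# Hypothetical sample and model-implied covariance matrices
S     <- matrix(c(1.0, 0.5, 0.5, 1.0), 2, 2)
Sigma <- matrix(c(1.0, 0.3, 0.3, 1.0), 2, 2)

# Rescale the second variable by a factor of 10 (e.g., a change of measurement unit)
D      <- diag(c(1, 10))
S2     <- D %*% S %*% D
Sigma2 <- D %*% Sigma %*% D

c(uls_original = f_uls(S, Sigma), uls_rescaled = f_uls(S2, Sigma2))
c(ml_original  = f_ml(S, Sigma),  ml_rescaled  = f_ml(S2, Sigma2))
```

In this example the ULS value changes from 0.04 to 4 after rescaling, whereas the ML value is unaffected, which is why ULS is best reserved for variables measured on a common scale (or for correlation matrices).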

However, with robust estimators like MLM, MLMV, MLMVS, MLR, ULSMV, WLSM, and WLSMV, the standard LR test cannot be applied directly to the model χ2 statistics, because the difference in the χ2 statistic between nested models does not follow a χ2 distribution. Therefore, testing the difference between nested models involves taking a correction factor into account (Beaujean, 2014; Wang & Wang, 2020; Brown, 2015, among others). See Table 2 for the CFA/SEM estimators available in “psych” (Revelle, 2022) and lavaan (Rosseel, 2012).
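As an illustration of such a correction, the base R sketch below implements the widely used Satorra-Bentler (2001) scaled difference test for mean-adjusted statistics (e.g., MLM, MLR). This is one common correction, not necessarily the one intended by every source cited above. The function name sb_scaled_diff and the input values are hypothetical; T0, c0, and d0 are the scaled χ2, the scaling correction factor, and the degrees of freedom of the more restricted (nested) model, and T1, c1, and d1 are those of the less restricted model.

```r
# Satorra-Bentler (2001) scaled chi-square difference test (a sketch)
# T0, T1: scaled chi-square values; c0, c1: scaling correction factors; d0, d1: df
sb_scaled_diff <- function(T0, c0, d0, T1, c1, d1) {
  stopifnot(d0 > d1)                          # model 0 must be the more restricted model
  cd <- (d0 * c0 - d1 * c1) / (d0 - d1)       # scaling correction for the difference test
  if (cd <= 0) stop("Negative scaling correction; the test is not defined for these inputs.")
  stat <- (T0 * c0 - T1 * c1) / cd            # scaled difference statistic
  df   <- d0 - d1
  list(stat = stat, df = df, cd = cd,
       p = pchisq(stat, df, lower.tail = FALSE))
}

# Hypothetical values for two nested models
res <- sb_scaled_diff(T0 = 75, c0 = 1.20, d0 = 26,
                      T1 = 50, c1 = 1.18, d1 = 24)
res
```

Note that the naive difference of the two scaled statistics (75 - 50 = 25) differs from the corrected statistic (about 21.53 on 2 df). In lavaan, the function lavTestLRT() offers scaled difference tests of this kind for fitted models, so the manual computation is mainly useful for checking reported results.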

5.4. CFA/SEM Estimators for Ordinal Categorical Data

For years, applied research treated ordinal categorical data as continuous (Byrne, 2012; Gorsuch, 1983; Schumacker & Beyerlein, 2000), both in traditional statistics (e.g., ANOVA, MANOVA) and in SEM, perhaps due to the absence of well-established estimators for ordinal categorical data (Byrne, 2012). This happened despite the work of Muthén and Kaplan (1985), who proposed a “continuous/categorical variable methodology” for SEM with any combination of dichotomous, categorical, or continuous observed variables (Flora & Curran, 2004; Byrne, 2012), in parallel to the (independent) work of Jöreskog (Jöreskog, 1994) and others (Coenders, Satorra, & Saris, 1997; Moustaki, 2001). Specifically, these methodologists suggested using the polychoric correlation coefficient for calculating the relationship between ordinal variables and the polyserial correlation coefficient for calculating the relationship between an ordered categorical and a continuous variable (Raykov & Marcoulides, 2006). For dichotomous variables, the tetrachoric correlation is used (a special case of the polychoric correlation). Likewise, the biserial correlation is the type of polyserial correlation used for dichotomous variables (Byrne, 2012). The calculations were followed by Browne’s (1984) ADF (WLS) estimator (Byrne, 2012; Raykov & Marcoulides, 2006). However, Byrne (2012) argues that these past approaches were rather impractical because their assumptions were difficult to meet. That is, 1) a continuous, normally distributed latent counterpart was assumed to underlie each categorical observed variable; 2) the sample size had to be large enough to safeguard reliable estimation of the correlation matrix; and 3) the number of observed variables had to be kept as low as possible (Byrne, 2012, quoting Bentler, 2005). In fact, Bentler (2005) commented that the above assumptions were the main weakness of the methodology (reproduced by Byrne, 2012).

The above restrictions led to the development of several approaches for testing ordinal categorical data. Three are the most popular: ULS, WLS, and Diagonally Weighted Least Squares (DWLS). Corrections to estimated means and/or means plus variances of the ULS and DWLS generated their robust alternatives (Byrne, 2012) : correction to ULS means and variances (ULSMV), correction to DWLS means (WLSM), and correction to DWLS means and variances (WLSMV).

These robust WLS alternatives are simpler than the full WLS, make no distributional assumptions, and are gaining popularity in applied research with noncontinuous data; however, apart from large samples, they require raw data input (Kline, 2016). The same asymptotic variance/covariance matrix of weights is used for WLS, WLSM, and WLSMV, but in different ways. That is, while WLS uses the whole weight matrix, WLSM and WLSMV use the whole weight matrix only for calculating standard errors and fit tests, and the diagonal of the weight matrix to estimate parameters, which are therefore identical in WLSM and WLSMV. The same is true for their standard errors, but their adjusted χ2 values differ (Wang & Wang, 2020). Their robust tests of model fit are equivalent to the SB χ2 (Byrne, 2012). Brown (2015) suggested that WLSMV is the best option for CFA with ordinal categorical data. The WLSMV estimator was designed for small and moderate samples, compared to the sample requirements of WLS (Byrne, 2012).
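The difference between using the whole weight matrix and only its diagonal can be sketched in base R with a generic weighted least squares discrepancy. All numbers below are hypothetical: s stands for the sample statistics (e.g., polychoric correlations), sigma for the model-implied values at some fixed set of parameters, and W for the estimated asymptotic covariance matrix of s. The sketch only evaluates the fit function; it does not estimate a model.

```r
# Weighted least squares discrepancy: (s - sigma)' W^{-1} (s - sigma)
# diagonal_only = TRUE keeps only the diagonal of W, as in diagonally weighted LS
ls_discrepancy <- function(s, sigma, W, diagonal_only = FALSE) {
  r <- s - sigma                                    # residual vector
  if (diagonal_only) W <- diag(diag(W), nrow = length(r))
  drop(t(r) %*% solve(W) %*% r)
}

# Hypothetical inputs for three sample statistics
s     <- c(0.50, 0.40, 0.30)
sigma <- c(0.45, 0.42, 0.33)
W     <- matrix(c(0.010, 0.004, 0.002,
                  0.004, 0.012, 0.003,
                  0.002, 0.003, 0.009), 3, 3)

full_wls <- ls_discrepancy(s, sigma, W)                        # whole weight matrix
diag_wls <- ls_discrepancy(s, sigma, W, diagonal_only = TRUE)  # diagonal only
c(full = full_wls, diagonal = diag_wls)
```

Only the diagonal version avoids inverting the full W, which is the practical reason why the diagonally weighted estimators are less demanding in sample size and computation than full WLS.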

Additionally, a full-information version of the ML (FIML) estimator for noncontinuous data is available in SEM software. In contrast to the limited-information WLS, this ML version does not fit the model to bivariate correlations. Instead, ML directly analyzes the raw data, using numerical integration methods to handle the latent response variables. That is, it estimates the probabilities of the data under the multivariate normal distribution assumed for the latent response variables (Kline, 2016).

ML estimators can handle both missingness and data non-normality more effectively than the WLS family (i.e., FIML assumes MAR, while WLSM and WLSMV do not). Therefore, it is suggested that when a model has both categorical and non-normally distributed continuous variables, estimators of the ML family may be a more efficient option (Wang & Wang, 2020). Note, however, that for ML with categorical data some software packages cannot calculate χ2 and model fit statistics. Moreover, the FIML algorithm with categorical data may be computationally challenging. So, in complex models with numerous latent variables and/or error covariances, estimators from the WLS family can be less computationally challenging than ML (Wang & Wang, 2020).

Kline (2016) also cautions that the robust WLS method in some software is called Robust Diagonally Weighted Least Squares (RDWLS). Forero, Maydeu-Olivares, and Gallardo-Pujol (2009), who proposed a ULS variation for categorical data, found that it performed adequately compared to WLS, although it is not scale-invariant (Kline, 2016). There is also a two-stage estimator for combining continuous and categorical endogenous variables proposed by Lee, Poon, and Bentler (1995). Kline (2016) explains that a special form of ML first estimates the correlations between the latent response variables, before the calculation of an asymptotic covariance matrix. The model is finally estimated with Arbitrary Generalized Least Squares (AGLS), which is the full WLS method (Kline, 2016). Note that Brown (2015) argues that this AGLS is simply WLS with a different name. Other SEM software uses Bayesian estimation with ordinal data, for which familiarity with the Bayesian approach is essential (Kline, 2016). See a list of CFA/SEM estimators in Table 2.

There are two different parameterizations for scaling the latent response variables of ordinal categorical indicators when thresholds are treated as free parameters (Kline, 2020): 1) Delta scaling, where the total variance of each latent response variable is constrained to 1, so that, with polychoric correlations as input, loadings express the change in the latent response variable given a change of 1 SD in the common factor; 2) Theta scaling, where the residual variance of each latent response variable is constrained to 1, which does not change the model fit in a single-sample analysis. The standardized theta-scaled solution equals the corresponding delta-scaled solution, which is more interpretable (Kline, 2016).
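The relation between the two parameterizations can be sketched for a single indicator in a single group. Under theta scaling the total variance of the latent response variable is λ²φ + 1 (loading squared times factor variance, plus the residual variance fixed to 1), so dividing by its square root returns the delta-scaled values. The function name theta_to_delta and the input values are hypothetical, and the sketch assumes one factor per indicator.

```r
# Convert theta-scaled estimates of one indicator to delta scaling (single group)
# lambda_theta: loading, tau_theta: threshold, phi: factor variance (assumed 1 by default)
theta_to_delta <- function(lambda_theta, tau_theta, phi = 1) {
  total_var <- lambda_theta^2 * phi + 1    # residual variance is 1 under theta scaling
  scale     <- 1 / sqrt(total_var)         # rescales the latent response to unit variance
  list(loading   = lambda_theta * scale,
       threshold = tau_theta * scale,
       residual  = scale^2)                # residual variance under delta scaling
}

res <- theta_to_delta(lambda_theta = 0.75, tau_theta = 0.50)
res
```

For these hypothetical values the delta-scaled loading is 0.60, the threshold 0.40, and the residual variance 0.64 (= 1 - 0.60² when the factor variance is 1), which matches the reading of delta-scaled loadings as standardized coefficients.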

The literature supports that with a large number of categories and normally distributed data (a rarely met condition, as noted earlier), the consequences of failing to address the ordinality of the data are likely negligible (Muthén & Kaplan, 1985; Bentler & Chou, 1987). Otherwise, treating categorical variables as continuous could have the following consequences (Raykov & Marcoulides, 2006; Byrne, 2012; Brown, 2015): 1) Test statistics and standard errors could be biased. Standard errors are especially sensitive to this effect with highly skewed variables, and the effect is maximized with skewness in opposite directions, i.e., differential skewness (Byrne, 2012; citing Finch, West, & MacKinnon, 1997); 2) The relationships (correlations) among observed variables could be underestimated, especially in the presence of floor/ceiling effects (Brown, 2015). The underestimation is greater with variables having fewer than 5 categories, when the variables are highly and/or differentially skewed (e.g., Byrne, 2012; Bollen & Barb, 1981); 3) Residual variance estimates appear to be most sensitive to underestimation with fewer than 3 categories, skewness > 1, and opposite skewness (Byrne, 2012); 4) “Pseudo factors” could emerge from item difficulty or extremeness rather than from real constructs (Brown, 2015). ML can also produce incorrect parameter estimates, for example when marked floor or ceiling effects exist in purportedly interval-level measurement scales, because the assumption of linear relationships does not hold (Brown, 2015; Byrne, 2012).

Another crucial consideration in ordinal categorical estimation is how many response categories a scale must have to be treated as ordinal categorical. Responses rated on 4 or fewer response categories are almost unanimously treated as ordinal (Kline, 2016; Gadermann, Guhn, & Zumbo, 2012). However, simulation results for the WLSMV estimator indicated better model fit and more accurate factor loadings with 2 - 3 categories compared to 4 - 6 (Byrne, 2012; citing Beauducel & Herzberg, 2006; Bentler & Chou, 1987; Gana & Broc, 2019), and some experts propose that variables with 4 response categories can also be estimated as continuous (with MLM or MLR; Gana & Broc, 2019). Nevertheless, the use of categorical estimators like WLSMV for ≥5 response categories has become more frequent in recent years (Kline, 2016). Actually, 5 to 7 response categories are essentially a “grey zone”, which some researchers propose treating as ordinal categorical (Kline, 2016; Gadermann et al., 2012), whereas others (Beaujean, 2014; Li, 2016; Raykov, 2012; Rigdon, 1998; Raykov & Marcoulides, 2006) suggest it can be treated as continuous. Research (e.g., Rigdon, 1998) has demonstrated that with 5 or more response categories, problems due to the ordinal categorical nature of responses are expected to be minimized, especially with the robust ML approaches (Raykov & Marcoulides, 2006; Byrne, 2012; Gana & Broc, 2019). See a decision tree summarizing the basic considerations when selecting estimators in Figure 2 (cf. Gana & Broc, 2019: p. 33). Beyond this debate, examining the data distribution becomes essential (Raykov & Marcoulides, 2006), as already discussed in Section 2 (Preliminary steps: Considerations about the Data Properties).

Part B. The implicit (indirect) effect of the estimator selection on the whole analytic strategy of the study

6. How Can the Estimator Selection Affect All the Study Analyses?

The next plausible question would be: What is the impact of selecting an estimator on the analytical strategy of the study if we want to treat the level of measurement of the variables consistently across the study analyses?

Figure 2. Decision tree summarizing the basic considerations when selecting estimators (partially adapted from Gana & Broc, 2019: p. 33).

Gadermann et al. (2012; see also Kline, 2020) cautioned that using Cronbach’s alpha, or any other reliability coefficient such as omega (see Kyriazos, 2017), under circumstances that violate the assumptions of the Pearson correlation coefficient (i.e., continuous data) could substantively deflate reliability estimates (Zumbo, Gadermann, & Zeisser, 2007).

Therefore, Zumbo et al. (2007) introduced a coefficient alpha for ordinal data (ordinal alpha, see the code in Appendix A) that is calculated based on the polychoric correlation matrix. Using a polychoric matrix for calculating alpha agrees with the covariance modeling approach for categorical data (Gadermann et al., 2012), e.g., as proposed by Muthén (CVM; 1984) and Jöreskog (Jöreskog, 1994). Taking this line of thought further, the issue that arises is how to treat the measurement level of variables consistently across the study. E.g., can variables estimated as ordinal categorical in a CFA/SEM/EFA model be treated as continuous for the rest of the study analyses? A unified course of action would require that the measurement level of the variables be consistently treated across the entire study, and not only accounted for during the estimation process.
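The arithmetic behind ordinal alpha can be sketched in base R: the usual alpha formula is applied to a correlation matrix, and only the type of matrix (polychoric instead of Pearson) changes. The two 3 × 3 matrices below are hypothetical and merely mimic the typical case in which Pearson correlations of coarsely categorized items are lower than the polychoric ones; the function name alpha_from_cor is also hypothetical.

```r
# Coefficient alpha computed from a correlation matrix R with k items:
# alpha = k / (k - 1) * (1 - trace(R) / sum(R))
alpha_from_cor <- function(R) {
  k <- ncol(R)
  (k / (k - 1)) * (1 - sum(diag(R)) / sum(R))
}

# Hypothetical correlation matrices for the same three items
R_polychoric <- matrix(0.5, 3, 3); diag(R_polychoric) <- 1
R_pearson    <- matrix(0.4, 3, 3); diag(R_pearson)    <- 1

c(ordinal_alpha = alpha_from_cor(R_polychoric),   # 0.75
  pearson_alpha = alpha_from_cor(R_pearson))      # about 0.667
```

In practice the polychoric matrix would be estimated from the item responses (e.g., with polychoric() in “psych”, as in Appendix A) rather than typed in by hand.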

Specifically, treating a set of variables in a measurement instrument as ordinal categorical means that the researcher cannot calculate means and standard deviations (at least when using R; e.g., “psych”; Revelle, 2022). Similarly, ordinal reliability coefficients should be used (both for alpha and omega; Gadermann et al., 2012). Finally, the nomological network of the measure should be calculated based on the polychoric, polyserial, or tetrachoric correlation matrix, for example in validation studies (e.g., Kyriazos & Stalikas, 2019b), in which CFA is commonly used together with correlation analysis. This may be a more unified course of action, ensuring that the level of measurement of the variables in the study is consistently treated either as continuous or as ordinal categorical across the entire study, and not accounted for only during the estimation process (Figure 3). The effect of this unified approach might be more accurate and stable study results. However, the comparability of results across studies might become more complicated, due to discrepancies between results coming from studies adopting the “ordinal” course of action and those adopting, e.g., classical reliability coefficients.

Figure 3. A unified course of action, ensuring that the measurement level of the variables is consistently treated across the entire study, not only accounted for during the estimation process.

7. Recap and Conclusion

Basic desirable estimator properties are consistency, unbiasedness, and efficiency. Scale invariance and scale freeness are also desirable. Scholars propose selecting an estimator based on 1) the measurement level of the variables; 2) non-normality; and 3) model type (Muthén & Muthén, 2006; Schumacker & Lomax, 2016).

Conservative Approach. Modern computing power permits testing different estimators on the same data and comparing how they affect parameter estimates and standard errors, checking whether they are inflated or biased (Schumacker & Lomax, 2016), or simply confirming that results are not artifacts (Thompson, 2004), most likely due to the absence of normality when using a non-robust approach (Byrne, 2012). This approach allows confidence that similar solutions exist with different estimators (Thompson, 2004). E.g., when ML is used in an EFA, replicating the ML results with principal axis factoring is an additional sign of robustness; otherwise, non-replicability could be indicative of model or data problems (e.g., Briggs & MacCallum, 2003; Fabrigar & Wegener, 2012).

Sample size. Fouladi (2000) reported that the SB χ2 corrections work well with samples ≥ 250. The χ2 is notoriously sensitive to sample size: as the sample grows larger (as a rule > 200), the χ2 is prone to become significant. On the contrary, as the sample size shrinks (as a rule < 100), the χ2 is prone to non-significance (Thompson, 2004, among many others). The χ2 is additionally biased by deviations from the multivariate normality assumption of the observed variables (e.g., Schumacker & Lomax, 2016). Of particular note here is the extremely large sample required with ordinal categorical data to get robust estimates. That is, as Byrne (2012) reminds, Jöreskog and Sörbom (1996) proposed a minimum sample of (q + 1)(q + 2)/2, where q is the number of observed variables in the model (refer to Schumacker & Lomax, 2016 for a similar approach). Alternatively, Raykov and Marcoulides (2006) recommended a sample 10 times the number of estimated model parameters (Byrne, 2012). Simulation research confirmed that WLSMV generates precise test statistics, parameter estimates, and standard errors with normal and non-normal latent response distributions across varying samples (100 - 1000) with CFA models (see Flora & Curran, 2004; Byrne, 2012; Brown, 2015) or CFA MTMM models (Kyriazos, 2018). However, there is no consensus on what constitutes a large sample (Raykov & Marcoulides, 2006) in complex procedures like scale development (e.g., Kyriazos & Stalikas, 2019a), especially before model-based power analysis (e.g., MacCallum, Browne, & Sugawara, 1996) became available in SEM software (e.g., R; R Development Core Team, 2022).
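The two rules of thumb above are simple enough to compute directly. The base R sketch below applies them to a hypothetical three-factor CFA with 9 indicators; the function names are hypothetical, and the count of 21 free parameters assumes marker-variable identification without a mean structure (6 loadings, 9 residual variances, 3 factor variances, and 3 factor covariances).

```r
# Minimum N proposed by Joreskog and Sorbom (1996): (q + 1)(q + 2) / 2,
# where q is the number of observed variables
min_n_observed <- function(q) (q + 1) * (q + 2) / 2

# Minimum N recommended by Raykov and Marcoulides (2006): 10 cases per free parameter
min_n_parameters <- function(n_par) 10 * n_par

q     <- 9                  # observed variables in the hypothetical model
n_par <- 6 + 9 + 3 + 3      # assumed free parameters (see lead-in)

c(joreskog_sorbom = min_n_observed(q),           # 55
  ten_per_parameter = min_n_parameters(n_par))   # 210
```

The two rules can disagree considerably for the same model, which is consistent with the lack of consensus on what constitutes a large sample.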

Selecting an estimator without considering the above assumptions could affect whether parameter estimates are significant or not; that is, the standard errors may be inflated or biased. Chi-square and model fit statistics can be equally biased. However, parameter estimates like factor loadings may still be accurate. Robust alternatives for non-normally distributed data include MLM (only with complete data) or MLR (permitting incomplete data). Especially for ordinal categorical data, the bias is maximal with 2 response options, and it decreases as the number of categories increases (Byrne, 2012; Rigdon, 1998).

When a researcher selects an estimator, thereby treating a variable set either as continuous or as ordinal categorical, the variables should be treated equivalently in all other study analyses to preserve a consistent treatment of the measurement level. Treating the measurement level of variables consistently across the study means that choosing a continuous or categorical estimator can impact the whole study’s analytic strategy. A unified course of action would require that the measurement level of the variables be consistently treated across the entire study, and not only accounted for during the estimation process. That is, selecting an ordinal categorical estimator would require addressing the ordinality of the variables over the analyses of the entire study (e.g., calculating ordinal reliability coefficients, omitting means (SDs), and calculating tetrachoric/polychoric correlations between the study measures). Considerations for making a thoughtful choice when selecting an estimator for CFA/SEM or EFA are presented graphically (Figure 4).

Figure 4. Basic parameters to consider for achieving consistency, unbiasedness, and efficiency when choosing an estimator during a CFA/SEM or an EFA.

Appendix A

R Syntax for specifying:

1) Estimator when performing an EFA, CFA, and SEM Models.

2) Calculating Alpha, Omega, and Factor loadings from polychoric and Pearson correlation matrices (ordinal reliability).

#################################################################

# 1. Clear User Interface and Free Memory

#################################################################

# Clear plots

if(!is.null(dev.list())) dev.off()

# Clean workspace

rm(list=ls())

# Clear console

cat("\014")

#################################################################

# 2. EFA “psych” (Revelle, 2022)

#################################################################

library(psych)

#using the Harman 24 mental tests

minres <- fa(Harman74.cor$cov,4,fm="minres" ,rotate="oblimin") # minimum residual

pa <- fa(Harman74.cor$cov,4,fm="pa" ,rotate="oblimin") # principal factor solution,

wls <- fa(Harman74.cor$cov,4,fm="wls" ,rotate="oblimin") # weighted least squares

gls <- fa(Harman74.cor$cov,4,fm="gls" ,rotate="oblimin") # generalized weighted least squares

ml <- fa(Harman74.cor$cov,4,fm="ml" ,rotate="oblimin") # maximum likelihood

#################################################################

# 3. CFA lavaan (Rosseel, 2012)

#################################################################

library(lavaan)

# Holzinger and Swineford Dataset (9 Variables)

# This data is included in the lavaan package, so we can load it with the data() function.

data(HolzingerSwineford1939)

# CFA

mod <- 'visual =~ x1 + x2 + x3

textual =~ x4 + x5 + x6

speed =~ x7 + x8 + x9'

GLS_CFA <- cfa(mod, data = HolzingerSwineford1939,estimator = "GLS") # generalized least squares. For complete data only.

WLS_CFA <- cfa(mod, data = HolzingerSwineford1939,estimator = "WLS") # weighted least squares (ADF estimation)

DWLS_CFA <- cfa(mod, data = HolzingerSwineford1939,estimator = "DWLS")# diagonally weighted least squares

ULS_CFA <- cfa(mod, data = HolzingerSwineford1939,estimator = "ULS")# unweighted least squares

DLS_CFA <- cfa(mod, data = HolzingerSwineford1939,estimator = "DLS")# distributionally-weighted least squares

PML_CFA <- cfa(mod, data = HolzingerSwineford1939,estimator = "PML")# pairwise maximum likelihood

MLM_CFA <- cfa(mod, data = HolzingerSwineford1939,estimator = "MLM")# Satorra-Bentler scaled test statistic

MLR_CFA <- cfa(mod, data = HolzingerSwineford1939,estimator = "MLR")# Yuan-Bentler test statistic.

MLMVS_CFA <- cfa(mod, data = HolzingerSwineford1939,estimator = "MLMVS")# Satterthwaite approach

summary(GLS_CFA, fit.measures=TRUE)

summary(WLS_CFA, fit.measures=TRUE)

summary(DWLS_CFA, fit.measures=TRUE)

summary(ULS_CFA, fit.measures=TRUE)

summary(DLS_CFA, fit.measures=TRUE)

summary(PML_CFA, fit.measures=TRUE)

summary(MLM_CFA, fit.measures=TRUE)

summary(MLR_CFA, fit.measures=TRUE)

summary(MLMVS_CFA, fit.measures=TRUE)

#################################################################

# 4. SEM

#################################################################

# Political Democracy dataset

mod2 <- '

# measurement model

ind60 =~ x1 + x2 + x3

dem60 =~ y1 + y2 + y3 + y4

dem65 =~ y5 + y6 + y7 + y8

# regressions

dem60 ~ ind60

dem65 ~ ind60 + dem60

# residual correlations

y1 ~~ y5

y2 ~~ y4 + y6

y3 ~~ y7

y4 ~~ y8

y6 ~~ y8
'

GLS_Sem <- sem (mod2, data = PoliticalDemocracy,estimator = "GLS") # generalized least squares

WLS_Sem <- sem (mod2, data = PoliticalDemocracy,estimator = "WLS") # weighted least squares (ADF estimation)

DWLS_Sem <- sem (mod2, data = PoliticalDemocracy,estimator = "DWLS") # diagonally weighted least squares

ULS_Sem <- sem (mod2, data = PoliticalDemocracy,estimator = "ULS") # unweighted least squares

PML_Sem <- sem (mod2, data = PoliticalDemocracy,estimator = "PML") # pairwise maximum likelihood

MLM_Sem <- sem (mod2, data = PoliticalDemocracy,estimator = "MLM") # Satorra-Bentler scaled test statistic

MLR_Sem <- sem (mod2, data = PoliticalDemocracy,estimator = "MLR") # Yuan-Bentler test statistic.

MLMVS_Sem <- sem (mod2, data = PoliticalDemocracy,estimator = "MLMVS") # Satterthwaite approach

summary(GLS_Sem, standardized = TRUE)

summary(WLS_Sem, standardized = TRUE)

summary(DWLS_Sem, standardized = TRUE)

summary(ULS_Sem, standardized = TRUE)

summary(PML_Sem, standardized = TRUE)

summary(MLM_Sem, standardized = TRUE)

summary(MLR_Sem, standardized = TRUE)

summary(MLMVS_Sem, standardized = TRUE)

#################################################################

# 5. Calculating alpha, Omega, and Factor loadings from

# polychoric and Pearson correlation matrices

#################################################################

library(psych)

data(bfi) # 25 personality self report items taken from the International Personality Item Pool

d <- subset(bfi,select = c(1:5)) # A1:A5, Agreeableness

d$A1<-7-d$A1 # reverse-score A1 (items are rated on a 1-6 scale)

polychoric(d) # polychoric correlation matrix

pcm<-polychoric(d)

# Raw and standardized alpha from polychoric and Pearson correlation matrices

alpha(pcm$rho)

alpha(d)

# Factor loadings from polychoric and Pearson correlation matrices

fa(d)

fa(pcm$rho)

#Omega from polychoric and Pearson correlation matrices

omega(d)

omega(pcm$rho)

Appendix B

Basic Discrepancy Functions (Gana & Broc, 2019: pp. 30-31; Raykov & Marcoulides, 2006: pp. 53-54) .

1) Maximum likelihood (ML) discrepancy function (FML):

2) Generalized Least Squares (GLS) discrepancy function (FGLS):

3) Unweighted Least Squares (ULS) is based on minimizing the following discrepancy function, across the set of all possible values for γ:

4) The Asymptotically Distribution Free (ADF) (also known as WLS) minimizes the discrepancy function across the set of all possible values for γ:
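For reference, the standard textbook forms of these four discrepancy functions can be written as follows. The notation is an assumption made here for illustration and may differ from that of the cited sources: S is the sample covariance matrix, Σ(γ) the model-implied covariance matrix for the parameter vector γ, p the number of observed variables, s and σ(γ) the vectors of non-redundant elements of S and Σ(γ), and W the weight matrix (an estimate of the asymptotic covariance matrix of s).

```latex
\begin{aligned}
F_{ML}  &= \ln\lvert\Sigma(\gamma)\rvert - \ln\lvert S\rvert
           + \operatorname{tr}\!\left[S\,\Sigma(\gamma)^{-1}\right] - p \\
F_{GLS} &= \tfrac{1}{2}\operatorname{tr}\!\left\{\left[S^{-1}\bigl(S - \Sigma(\gamma)\bigr)\right]^{2}\right\} \\
F_{ULS} &= \tfrac{1}{2}\operatorname{tr}\!\left\{\bigl[S - \Sigma(\gamma)\bigr]^{2}\right\} \\
F_{ADF} &= \bigl[s - \sigma(\gamma)\bigr]^{\top} W^{-1} \bigl[s - \sigma(\gamma)\bigr]
\end{aligned}
```

All four functions equal zero when the model reproduces the sample matrix exactly and grow as the discrepancy between S and Σ(γ) increases; they differ only in how the residuals are weighted.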

NOTES

1 This work does not refer to a particular software, although familiarity with the R environment is assumed (see the R code in Appendix A, which helps readers specify an EFA/CFA/SEM estimator).

Conflicts of Interest

The authors declare no conflicts of interest.

References

[1] Asparouhov, T., & Muthén, B. (2005). Multivariate Statistical Modeling with Survey Data. In The 2005 FCSM Research Conference (pp. 14-16).
[2] Bandalos, D. L., & Gerstner, J. J. (2016). Using Factor Analysis in Test Construction. In K. Schweizer, & C. DiStefano (Eds.), Principles and Methods of Test Construction: Standards and Recent Advances (pp. 26-51). Hogrefe.
[3] Beauducel, A., & Herzberg, P. Y. (2006). On the Performance of Maximum Likelihood versus Mean and Variance Adjusted Weighted Least Squares Estimation in CFA. Structural Equation Modeling, 13, 186-203. https://doi.org/10.1207/s15328007sem1302_2
[4] Beaujean, A. A. (2014). Latent Variable Modeling Using R: A Step-by-Step Guide. Routledge. https://doi.org/10.4324/9781315869780
[5] Bentler, P. M. (2005). EQS 6 Structural Equations Program Manual. Multivariate Software.
[6] Bentler, P. M., & Chou, C.-P. (1987). Practical Issues in Structural Modeling. Sociological Methods & Research, 16, 78-117. https://doi.org/10.1177/0049124187016001004
[7] Bentler, P. M., & Yuan, K-H. (1999). Structural Equation Modeling with Small Samples: Test Statistics. Multivariate Behavioral Research, 34, 181-197. https://doi.org/10.1207/S15327906Mb340203
[8] Bollen, K. A. (1989). Introduction to Structural Equation Models with Latent Variables. Wiley. https://doi.org/10.1002/9781118619179
[9] Bollen, K. A. (2012). Instrumental Variables in Sociology and the Social Sciences. Annual Review of Sociology, 38, 37-72. https://doi.org/10.1146/annurev-soc-081309-150141
[10] Bollen, K. A., & Barb, K. H. (1981). Pearson’s R and Coarsely Categorized Measures. American Sociological Review, 46, 232-239. https://doi.org/10.2307/2094981
[11] Bollen, K. A., & Stine, R. A. (1993). Bootstrapping Goodness-of-Fit Measures in Structural Equation Models. In K. A. Bollen, & J. S. Long (Eds.), Testing Structural Equation Models (pp. 111-135). Sage.
[12] Bollen, K. A., Kirby, J. B., Curran, P. J., Paxton, P. M., & Chen, F. (2007). Latent Variable Models under Misspecification: Two-Stage Least Squares (2SLS) and Maximum Likelihood (ML) Estimators. Sociological Methods and Research, 36, 48-86. https://doi.org/10.1177/0049124107301947
[13] Briggs, N. E., & MacCallum, R. C. (2003). Recovery of Weak Common Factors by Maximum Likelihood and Ordinary Least Squares Estimation. Multivariate Behavioral Research, 38, 25-56. https://doi.org/10.1207/S15327906MBR3801_2
[14] Brown, T. A. (2015). Confirmatory Factor Analysis for Applied Research (2nd ed.). Guilford.
[15] Browne, M. W. (1974). Generalized Least Squares Estimators in the Analysis of Covariance Structures. South African Statistical Journal, 8, 1-24. https://doi.org/10.1002/j.2333-8504.1973.tb00197.x
[16] Browne, M. W. (1984). Asymptotically Distribution-Free Methods for the Analysis of Covariance Structures. British Journal of Mathematical and Statistical Psychology, 37, 62-83. https://doi.org/10.1111/j.2044-8317.1984.tb00789.x
[17] Byrne, B. M. (2012). Structural Equation Modeling with Mplus. Routledge. https://doi.org/10.4324/9780203807644
[18] Carroll, J. B. (1993). Human Cognitive Abilities: A Survey of Factor-Analytic Studies. Cambridge University Press. https://doi.org/10.1017/CBO9780511571312
[19] Chou, C. P., & Bentler, P. M. (1995). Estimates and Tests in Structural Equation Modeling.
[20] Coenders, G., Satorra, A., & Saris, W. E. (1997). Alternative Approaches to Structural Modeling of Ordinal Data: A Monte Carlo Study. Structural Equation Modeling, 4, 261-282. https://doi.org/10.1080/10705519709540077
[21] Costello, A. B., & Osborne, J. (2005). Best Practices in Exploratory Factor Analysis: Four Recommendations for Getting the Most from Your Analysis. Practical Assessment Research & Evaluation, 10, 1-9.
[22] Curran, P. J., West, S. G., & Finch, J. F. (1996). The Robustness of Test Statistics to Nonnormality and Specification Error in Confirmatory Factor Analysis. Psychological Methods, 1, 16-29. https://doi.org/10.1037/1082-989X.1.1.16
[23] Fabrigar, L. R., & Wegener, D. T. (2012). Exploratory Factor Analysis. Oxford University Press. https://doi.org/10.1093/acprof:osobl/9780199734177.001.0001
[24] Fabrigar, L. R., Wegener, D. T., MacCallum, R. C., & Strahan, E. J. (1999). Evaluating the Use of Exploratory Factor Analysis in Psychological Research. Psychological Methods, 4, 272-299. https://doi.org/10.1037/1082-989X.4.3.272
[25] Fielle, N. (2016). Basics of Matrix Algebra for Statistics with R. Journal of Statistical Software, 71, 1-3.
[26] Finch, J. F., West, S. G., & MacKinnon, D. P. (1997). Effects of Sample Size and Nonnormality on the Estimation of Mediated Effects in Latent Variable Models. Structural Equation Modeling, 4, 87-107. https://doi.org/10.1080/10705519709540063
[27] Flora, D. B. (2018). Statistical Methods for the Social and Behavioural Sciences: A Model-Based Approach. Sage.
[28] Flora, D. B., & Curran, P. J. (2004). An Empirical Evaluation of Alternative Methods of Estimation for Confirmatory Factor Analysis with Ordinal Data. Psychological Methods, 9, 466-491. https://doi.org/10.1037/1082-989X.9.4.466
[29] Forero, C. G., Maydeu-Olivares, A., & Gallardo-Pujol, D. (2009). Factor Analysis with Ordinal Indicators: A Monte Carlo Study Comparing DWLS and ULS Estimation. Structural Equation Modeling, 16, 625-641. https://doi.org/10.1080/10705510903203573
[30] Fouladi, R. T. (2000). Performance of Modified Test Statistics in Covariance and Correlation Structure Analysis under Conditions of Multivariate Nonnormality. Structural Equation Modeling: A Multidisciplinary Journal, 7, 356-410. https://doi.org/10.1207/S15328007SEM0703_2
[31] Gadermann, A. M., Guhn, M., & Zumbo, B. D. (2012). Estimating Ordinal Reliability for Likert-Type and Ordinal Item Response Data: A Conceptual, Empirical, and Practical Guide. Practical Assessment, Research & Evaluation, 17, Article No. 3.
[32] Gana, K., & Broc, G. (2019). Structural Equation Modeling with Lavaan. Wiley. https://doi.org/10.1002/9781119579038
[33] Gorsuch, R. L. (1983). Factor Analysis. Erlbaum.
[34] Gorsuch, R. L. (2015). Factor Analysis (3rd ed.). Routledge. https://doi.org/10.4324/9781315735740
[35] Hu, L., Bentler, P. M., & Kano, Y. (1992). Can Test Statistics in Covariance Structure Analysis Be Trusted? Psychological Bulletin, 112, 351-362. https://doi.org/10.1037/0033-2909.112.2.351
[36] Jöreskog, K. G. (1983). Factor Analysis as an Error-in-Variables Model. In H. Wainer, & S. Messick (Eds.), Principles of Modern Psychological Measurement (pp. 185-196). Erlbaum.
[37] Jöreskog, K. G. (1994). On the Estimation of Polychoric Correlations and Their Asymptotic Covariance Matrix. Psychometrika, 59, 381-389. https://doi.org/10.1007/BF02296131
[38] Jöreskog, K. G. (2007). Factor Analysis and Its Extensions. In R. Cudeck, & R. C. MacCallum (Eds.), Factor Analysis at 100: Historical Developments and Future Directions (pp. 47-78). Erlbaum.
[39] Jöreskog, K. G., & Sörbom, D. (1996). LISREL 8: User’s Reference Guide. Scientific Software International.
[40] Kaiser, H. F., & Caffrey, J. (1965). Alpha Factor Analysis. Psychometrika, 30, 1-14. https://doi.org/10.1007/BF02289743
[41] Kaplan, D. (2000). Structural Equation Modeling: Foundations and Extensions. Sage. https://doi.org/10.4135/9781412984256
[42] Kim, J., & Mueller, C. W. (1978). Factor Analysis: Statistical Methods & Practical Issues. Sage.
[43] Kline, R. B. (2016). Principles and Practice of Structural Equation Modeling. Guilford Press.
[44] Kline, R. B. (2020). Becoming a Behavioral Science Researcher. Guilford.
[45] Kyriazos, T. (2017). Reliability of Psychometric Instruments. In M. Galanakis, C. Pezirkianidis, & A. Stalikas (Eds.), Basic Aspects of Psychometrics (pp. 85-126). Topos.
[46] Kyriazos, T. A. (2018). Applied Psychometrics: The Application of CFA to Multitrait-Multimethod Matrices (CFA-MTMM). Psychology, 9, 2625-2648. https://doi.org/10.4236/psych.2018.912150
[47] Kyriazos, T. A. (2019). Applied Psychometrics: The Modeling Possibilities of Multilevel Confirmatory Factor Analysis (MLV CFA). Psychology, 10, 777-798. https://doi.org/10.4236/psych.2019.106051
[48] Kyriazos, T. A., & Stalikas, A. (2019a). Nicomachus-Positive Parenting (NPP): Development and Initial Validation of a Parenting Questionnaire within the Positive Psychology Framework. Psychology, 10, 2115-2165. https://doi.org/10.4236/psych.2019.1015136
[49] Kyriazos, T. A., & Stalikas, A. (2019b). Alabama Parenting Questionnaire—Short Form (APQ-9): Evidencing Construct Validity with Factor Analysis, CFA MTMM and Measurement Invariance in a Greek Sample. Psychology, 10, 1790-1817. https://doi.org/10.4236/psych.2019.1012117
[50] Lee, S. Y., Poon, W. Y., & Bentler, P. M. (1995). A Two-Stage Estimation of Structural Equation Models with Continuous and Polytomous Variables. British Journal of Mathematical and Statistical Psychology, 48, 339-358. https://doi.org/10.1111/j.2044-8317.1995.tb01067.x
[51] Lee, S.-Y., & Song, X.-Y. (2004). Evaluation of the Bayesian and Maximum Likelihood Approaches in Analyzing Structural Equation Models with Small Sample Sizes. Multivariate Behavioral Research, 39, 653-686. https://doi.org/10.1207/s15327906mbr3904_4
[52] Lei, P.-W., & Wu, Q. (2012). Estimation in Structural Equation Modeling. In R. H. Hoyle (Ed.), Handbook of Structural Equation Modeling (pp. 164-180). Guilford.
[53] Li, C. (2016). Confirmatory Factor Analysis with Ordinal Data: Comparing Robust Maximum Likelihood and Diagonally Weighted Least Squares. Behavior Research Methods, 48, 936-949. https://doi.org/10.3758/s13428-015-0619-7
[54] Little, R. J. A., & Rubin, D. B. (1987). Statistical Analysis with Missing Data. Wiley.
[55] Loehlin, J. C., & Beaujean, A. A. (2017). Latent Variable Models: An Introduction to Factor, Path, and Structural Equation Analysis (5th ed.). Routledge. https://doi.org/10.4324/9781315643199
[56] MacCallum, R. C. (2009). Factor Analysis. In R. E. Millsap, & A. Maydeu-Olivares (Eds.), The Sage Handbook of Quantitative Methods in Psychology (pp. 123-147). Sage Publications Ltd. https://doi.org/10.4135/9780857020994.n6
[57] MacCallum, R. C., Browne, M. W., & Cai, L. (2007). Factor Analysis Models as Approximations. In R. Cudeck, & R. C. MacCallum (Eds.), Factor Analysis at 100: Historical Developments and Future Directions (pp. 153-175). Erlbaum.
[58] MacCallum, R. C., Browne, M. W., & Sugawara, H. M. (1996). Power Analysis and Determination of Sample Size for Covariance Structure Modeling. Psychological Methods, 1, 130-149. https://doi.org/10.1037/1082-989X.1.2.130
[59] Marsh, H. W., Hau, K. T., & Wen, Z. (2004). In Search of Golden Rules: Comment on Hypothesis-Testing Approaches to Setting Cutoff Values for Fit Indexes and Dangers in Overgeneralizing Hu and Bentler’s (1999) Findings. Structural Equation Modeling, 11, 320-341. https://doi.org/10.1207/s15328007sem1103_2
[60] Matsunaga, M. (2010). How to Factor-Analyze Your Data Right: Do’s, Don’ts, and How-to’s. International Journal of Psychological Research, 3, 97-110. https://doi.org/10.21500/20112084.854
[61] Micceri, T. (1989). The Unicorn, the Normal Curve, and Other Improbable Creatures. Psychological Bulletin, 105, 156-166. https://doi.org/10.1037/0033-2909.105.1.156
[62] Moustaki, I. (2001). A Review of Exploratory Factor Analysis for Ordinal Categorical Data. In R. Cudeck, S. du Toit, & D. Sörbom (Eds.), Structural Equation Modeling: Present and Future (pp. 461-480). Scientific Software.
[63] Mulaik, S. A. (2009). Foundations of Factor Analysis. CRC Press.
[64] Muthén, B., & Kaplan, D. (1985). A Comparison of Some Methodologies for the Factor Analysis of Non-Normal Likert Variables. British Journal of Mathematical and Statistical Psychology, 38, 171-189.
[65] Muthén, B., & Kaplan, D. (1992). A Comparison of Some Methodologies for the Factor Analysis of Non-Normal Likert Variables: A Note on the Size of the Model. British Journal of Mathematical and Statistical Psychology, 45, 19-30. https://doi.org/10.1111/j.2044-8317.1992.tb00975.x
[66] Muthén, L. K., & Muthén, B. O. (2006). Mplus User’s Guide (4th ed.). Muthén & Muthén.
[67] Osborne, J. W., & Banjanovic, E. S. (2016). Exploratory Factor Analysis with SAS. SAS Institute.
[68] R Development Core Team (2022). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing.
[69] Rao, C. R. (1955). Estimation and Tests of Significance in Factor Analysis. Psychometrika, 20, 93-111. https://doi.org/10.1007/BF02288983
[70] Raykov, T. (2012). Scale Construction and Development Using Structural Equation Modeling. In R. H. Hoyle (Ed.), Handbook of Structural Equation Modeling (pp. 472-492). Guilford. https://doi.org/10.4324/9780203930687-1
[71] Raykov, T., & Marcoulides, G. A. (2006). A First Course in Structural Equation Modeling. Erlbaum.
[72] Raykov, T., & Widaman, K. F. (1995). Issues in Structural Equation Modeling Research. Structural Equation Modeling, 2, 289-318. https://doi.org/10.1080/10705519509540017
[73] Revelle, W. (2016). A Review of Linear Algebra: Applications in R: Notes for Courses in Psychometric Theory and Latent Variable Modeling to Accompany Psychometric Theory with Applications in R. Northwestern University.
[74] Revelle, W. (2022). Package “Psych”: Procedures for Psychological, Psychometric, and Personality Research. https://personality-project.org/r/psych
[75] Rigdon, E. E. (1998). Structural Equation Modeling. In G. A. Marcoulides (Ed.), Modern Methods for Business Research (pp. 251-294). Erlbaum.
[76] Rosseel, Y. (2012). Lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48, 1-36. http://www.jstatsoft.org/v48/i02 https://doi.org/10.18637/jss.v048.i02
[77] Russell, D. W. (2002). In Search of Underlying Dimensions: The Use (and Abuse) of Factor Analysis in Personality and Social Psychology Bulletin. Personality and Social Psychology Bulletin, 28, 1629-1646. https://doi.org/10.1177/014616702237645
[78] Satorra, A., & Bentler, P. M. (1988). Scaling Corrections for Chi-Square Statistics in Covariance Structure Analysis. In American Statistical Association 1988 Proceedings of the Business and Economics Section (pp. 308-313). American Statistical Association.
[79] Schumacker, R. E., & Beyerlein, S. T. (2000). Confirmatory Factor Analysis with Different Correlation Types and Estimation Methods. Structural Equation Modeling, 7, 629-636. https://doi.org/10.1207/S15328007SEM0704_6
[80] Schumacker, R. E., & Lomax, R. G. (2016). A Beginner’s Guide to Structural Equation Modeling (4th ed.). Routledge. https://doi.org/10.4324/9781315749105
[81] Skinner, C. J. (1989). Domain Means, Regression and Multivariate Analysis. In C. J. Skinner, D. Holt, & T. M. F. Smith (Eds.), Analysis of Complex Surveys (pp. 59-87). Wiley.
[82] Stalikas, A., & Kyriazos, T. (2019). Research Methods and Statistics Using R. Topos.
[83] Stalikas, A., Kyriazos, T. A., Yotsidi, V., & Prassa, K. (2018). Using Bifactor EFA, Bifactor CFA and Exploratory Structural Equation Modeling to Validate Factor Structure of the Meaning in Life Questionnaire, Greek Version. Psychology, 9, 348-371.
[84] Tabachnick, B. G., & Fidell, L. S. (2012). Using Multivariate Statistics (6th ed.). Pearson Education.
[85] Tabachnick, B. G., & Fidell, L. S. (2013). Using Multivariate Statistics. Allyn & Bacon.
[86] Thompson, B. (2004). Exploratory and Confirmatory Factor Analysis: Understanding Concepts and Applications. American Psychological Association. https://doi.org/10.1037/10694-000
[87] Thompson, B., & Daniel, L. G. (1996). Factor Analytic Evidence for the Construct Validity of Scores: A Historical Overview and Some Guidelines. Educational and Psychological Measurement, 56, 197-208. https://doi.org/10.1177/0013164496056002001
[88] Wang, J., & Wang, X. (2020). Structural Equation Modeling (2nd ed.). Wiley. https://doi.org/10.1002/9781119422730
[89] Watkins, M. W. (2021). A Step-by-Step Guide to Exploratory Factor Analysis with R and Rstudio. Routledge. https://doi.org/10.4324/9781003120001
[90] Watson, J. C. (2017). Establishing Evidence for Internal Structure Using Exploratory Factor Analysis. Measurement and Evaluation in Counseling and Development, 50, 232-238. https://doi.org/10.1080/07481756.2017.1336931
[91] West, S. G., Finch, J. F., & Curran, P. J. (1995). Structural Equation Models with Nonnormal Variables: Problems and Remedies. In R. H. Hoyle (Ed.), Structural Equation Modeling: Concepts, Issues, and Applications (pp. 56-75). Sage.
[92] Whittaker, T. A., & Schumacker, R. E. (2022). A Beginner’s Guide to Structural Equation Modeling (5th ed.). Routledge. https://doi.org/10.4324/9781003044017
[93] Widaman, K. F. (2012). Exploratory Factor Analysis and Confirmatory Factor Analysis. In H. Cooper (Ed.), APA Handbook of Research Methods in Psychology: Data Analysis and Research Publication (Vol. 3, pp. 361-389). American Psychological Association. https://doi.org/10.1037/13621-018
[94] Yuan, K. H., & Bentler, P. M. (2000). Three Likelihood-Based Methods for Mean and Covariance Structure Analysis with Nonnormal Missing Data. In Sociological Methodology (pp. 165-200). American Sociological Association. https://doi.org/10.1111/0081-1750.00078
[95] Zumbo, B. D., Gadermann, A. M., & Zeisser, C. (2007). Ordinal Versions of Coefficients Alpha and Theta for Likert Rating Scales. Journal of Modern Applied Statistical Methods, 6, 21-29. https://doi.org/10.22237/jmasm/1177992180

Copyright © 2024 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.