Computational Identification of Confirmatory Factor Analysis Model with Simplimax Procedures

Jingyu Cai^{1*}, Henk A. L. Kiers^{2}, Kohei Adachi^{1}

^{1}Graduate School of Human Sciences, Osaka University, Osaka, Japan.

^{2}Faculty of Behavioural and Social Sciences, University of Groningen, Groningen, The Netherlands.

**DOI:** 10.4236/ojs.2021.116062

Confirmatory factor analysis (CFA) refers to the FA procedure in which some loadings are constrained to be zero. A difficulty in CFA is that this constraint must be specified by users in a subjective manner. To deal with this difficulty, we propose a computational method in which the best CFA solution is obtained optimally without relying on users’ judgements. The method consists of procedures at a lower (L) and a higher (H) level: at the L level, for a fixed number of zero loadings, it is determined both which loadings are to be zero and what values are to be given to the remaining nonzero parameters; at the H level, the L-level procedure is performed over different numbers of zero loadings to provide the best solution. In the L-level procedure, Kiers’ (1994) simplimax rotation fulfills a key role: the CFA solution under the constraint computationally specified by that rotation is used for initializing the parameters of a new FA procedure called simplimax FA. The task at the H level can easily be performed using information criteria. The usefulness of the proposed method is demonstrated numerically.


Cai, J., Kiers, H. and Adachi, K. (2021) Computational Identification of Confirmatory Factor Analysis Model with Simplimax Procedures. *Open Journal of Statistics*, **11**, 1044-1061. doi: 10.4236/ojs.2021.116062.

1. Introduction

In factor analysis (FA), the variation of *p* observed variables is assumed to be explained by *m* common factors and *p* unique factors, with *m* < *p* and the two types of factors mutually uncorrelated. The *m* common factors serve to explain the variations of all *p* variables. On the other hand, each of the *p* unique factors has a one-to-one correspondence to a variable: a unique factor specifically explains the variation of the corresponding variable that remains unaccounted for by the common factors [1] [2] [3].

The parameters to be estimated in FA are a *p* × *m* factor loading matrix Λ = (*λ_{ij}*), an *m* × *m* inter-factor correlation matrix Φ, and a *p* × *p* diagonal matrix Ψ containing the unique variances. The FA model implies that the *p* × *p* covariance matrix Σ of the variables is structured as

$\Sigma =\Lambda \Phi {\Lambda}^{\prime}+\Psi $. (1)

(e.g., Adachi, 2019 [1]; Mulaik, 2010 [2]). Thus, a loss function
$f\left(\Lambda ,\Psi ,\Phi |S\right)$ can be defined, which stands for the discrepancy between (1) and its sample counterpart S (*p* × *p*): FA can be formulated as minimizing
$f\left(\Lambda ,\Psi ,\Phi |S\right)$ over Λ, Ψ, and Φ, for a given S.

FA can be classified into two types: confirmatory (CFA) and exploratory (EFA) [2]. In CFA, some loadings in Λ are constrained to take zero values, while no constraints are imposed on the loadings in EFA. In this paper, we focus on CFA. An example of the CFA model with a particular constraint is illustrated in Figure 1. There, a constrained loading matrix Λ is shown on the left, and the corresponding CFA model is depicted on the right as a diagram, in which only pairs of variables and factors with unconstrained loadings are linked by paths. In Figure 1, the binary matrix B is also presented, which specifies the constraints in Λ and the links in the right diagram. The elements in B = (*b_{ij}*) (*p* × *m*) are defined as

${b}_{ij}=\left\{\begin{array}{l}0,\text{ iff }{\lambda}_{ij}=0\\ 1,\text{ otherwise}\end{array}\right.$, (2)

in other words, *b_{ij}* = 1 if variable *i* is linked to factor *j* in the diagram. Using a specified B, CFA can be formulated as

${\mathrm{min}}_{\Lambda ,\Psi ,\Phi}f\left(\Lambda ,\Psi ,\Phi |S\right)$ s.t. $\Lambda =B\bullet\Lambda $ for a specified $B\in {\mathbb{S}}_{B}$ (3)

Figure 1. Illustration of constrained loadings and the corresponding CFA model.

with “s.t.” the abbreviation for “subject to”, • denoting the element-wise (Hadamard) product, and ${\mathbb{S}}_{B}$ denoting the set of matrices B considered.
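For concreteness, the Hadamard constraint in (3) can be sketched in a few lines of numpy; the B and Λ values below are hypothetical (4 variables, 2 factors), not taken from the paper:

```python
import numpy as np

# Hypothetical link matrix B: b_ij = 1 links variable i to factor j,
# while b_ij = 0 forces the corresponding loading to zero.
B = np.array([[1, 0],
              [1, 0],
              [0, 1],
              [1, 1]])
Lam = np.array([[0.8, 0.3],
                [0.7, -0.2],
                [0.4, 0.9],
                [0.2, 0.6]])

# The elementwise (Hadamard) product enforces Lambda = B . Lambda.
Lam_c = B * Lam
```

Here `Lam_c` keeps the loadings where `B` is 1 and zeroes the rest, which is exactly the constrainedness expressed by (2) and (3).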

A problem in CFA (3) is that the link matrix B = (*b_{ij}*) defined as (2) must be selected by users. That is, it must be decided in a subjective manner which elements in B are to be zeros/ones, in other words, which pairs of variables and factors are linked as in Figure 1. Whether the CFA model with a particular constraint is acceptable or not is checked afterwards by referring to the goodness-of-fit of the solution [4] [5] [6]. However, even if the model is found acceptable, better CFA models with other link matrices B may exist. This implies that all possible B in
${\mathbb{S}}_{B}$ must be considered for finding the best B, which is infeasible, as the number of possible B is enormous (see Section 2.2). In sum, CFA poses the following two problems:

[P1] The link matrix B (specifying a CFA model) must be selected subjectively by users.

[P2] An enormous number of possible matrices B in ${\mathbb{S}}_{B}$ must be considered for finding the best B.

To the best of our knowledge, the problems [P1] and [P2] in CFA have not been addressed in the existing literature. In order to deal with them, we propose an FA procedure for computationally and optimally identifying a suitable CFA model. Here, model identification includes estimating the parameter values. The outline of the proposed procedure is described in the next section. We then detail the procedure in Section 3, report its assessment in a simulation study in Section 4, give real data examples in Section 5, and conclude the paper in Section 6.

2. Outline of the Proposed Procedure

First, in Section 2.1, we outline our approach to the CFA model identification under the condition that the number of zero loadings is fixed at a particular integer. Then, in Section 2.2, the approach is extended to the case where the number of zero loadings is not fixed, and we preview the remaining sections.

2.1. Model Identification for a Specified Number of Nonzero Loadings (Cardinality)

For dealing with the difficulties [P1] and [P2] in CFA, we can consider the two procedures introduced in the next paragraphs, under the condition that
$\text{Card}\left(\Lambda \right)=\text{Card}\left(B\right)$, where Card(·) denotes the number of nonzero values in the matrix between parentheses, and this number equals a specified integer *c*; hence

$\text{Card}\left(\Lambda \right)=c$, or equivalently, $\text{Card}\left(B\right)=c$. (4)

Clearly, *pm* − *c* equals the number of zeros in B or Λ.

The first procedure considered can be formulated as

${\mathrm{min}}_{\Lambda ,\Psi ,\Phi}f\left(\Lambda ,\Psi ,\Phi |S\right)$ s.t. $\Lambda =B\bullet\Lambda $, after B is estimated optimally s.t. (4). (5)

This minimization amounts to performing CFA under the constraint indicated by the link matrix B, with B estimated by another method in advance. Thus, problem [P1] is overcome. Further, we can also consider [P2] to be dealt with, provided that the value *c* in (4) and the estimated B are suitable.

The above procedure consists of two stages: first B is estimated, then CFA is performed; see (5). In contrast, the second procedure is formulated in a single stage as follows:

${\mathrm{min}}_{B,\Lambda ,\Psi ,\Phi}f\left(B,\Lambda ,\Psi ,\Phi |S\right)$ s.t. $\Lambda =B\bullet\Lambda $ and (4). (6)

Here, B has been added to the subscripts of “min”: the link matrix B, which indicates the pairs of variables and factors to be linked, is estimated jointly with the other parameters Λ, Ψ, and Φ.

Between (5) and (6) or (6′), we can find the following difference: in minimizing the loss function $f\left(\Lambda ,\Psi ,\Phi |S\right)$, the B value is kept fixed in (5), but allowed to change in (6). This difference implies that the resulting loss function value in (6) cannot exceed that in (5):

${\mathrm{min}}_{B,\Lambda ,\Psi ,\Phi}f\left(B,\Lambda ,\Psi ,\Phi |S\right)\le {\mathrm{min}}_{\Lambda ,\Psi ,\Phi}f\left(\Lambda ,\Psi ,\Phi |S\right)$. (7)

This inequality shows that (6) can provide a better solution than (5). We will propose an iterative algorithm for (6), initialized with the optimum of (5). As empirically shown later, in almost all cases the solutions of (5) and (6) are equivalent: we could obtain the final solutions by (5) alone, without performing the iterative algorithm for (6). However, it is worthwhile to perform the latter step, as (6) *can* provide a better solution in a few cases.

In this paper, we use the maximum likelihood (ML) method for estimating the parameters in (5) and (6). This implies that the loss function to be minimized is given as the negative of the log likelihood. It is explicitly expressed as

$f\left(\Lambda ,\Psi ,\Phi |S\right)=\text{log}\left|\Sigma \right|+\text{tr}\text{\hspace{0.05em}}S{\Sigma}^{-1}=\text{log}\left|\Lambda \Phi {\Lambda}^{\prime}+\Psi \right|+\text{tr}\text{\hspace{0.05em}}S{\left(\Lambda \Phi {\Lambda}^{\prime}+\Psi \right)}^{-1}$, (8)

following from the normality assumptions for factors [7]. For minimizing (8), we use Rubin & Thayer’s [8] EM algorithm for FA, whose properties are discussed in Adachi [9]. This algorithm is detailed in Appendix A1.
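As a sketch, the loss (8) can be evaluated directly with numpy; the function name and implementation details below are ours, not part of the original procedure:

```python
import numpy as np

def ml_discrepancy(Lam, Psi, Phi, S):
    """ML loss (8): log|Sigma| + tr(S Sigma^{-1}),
    with Sigma = Lam Phi Lam' + Psi."""
    Sigma = Lam @ Phi @ Lam.T + Psi
    sign, logdet = np.linalg.slogdet(Sigma)   # numerically stable log-determinant
    return logdet + np.trace(S @ np.linalg.inv(Sigma))
```

For instance, with all loadings zero, Σ reduces to Ψ, and the loss reduces to log|Ψ| + tr(SΨ^{−1}).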

2.2. Selection of the Best Cardinality

We should notice that the above approach is conditional upon *c* in (4). Thus, it remains to select the best value for *c*. This can be attained by the following procedure:

Select the value *c *with the lowest *IC*(*c*) among
$c={c}_{\text{min}},\cdots ,{c}_{\text{max}}$. (9)

Here, *c*_{min}/*c*_{max} expresses the reasonable minimum/maximum of *c*, and *IC*(*c*) denotes the value of an information criterion statistic [10], with the statistic being a function of *c*. Information criteria are statistics defined for ML-based procedures that take smaller values for better solutions. Explicit expressions of *IC*(*c*) are introduced later. We can rewrite (9) as follows: the procedure of (6) following (5) (in the last subsection) is performed for
$c={c}_{\text{min}},\cdots ,{c}_{\text{max}}$, so that we have multiple solutions, among which the one with the smallest *IC*(*c*) is selected as the best. The largest possible interval [*c*_{min}, *c*_{max}] is obviously [1, *pm*]. It can reasonably be reduced to

${c}_{\mathrm{min}}=p$ and ${c}_{\mathrm{max}}=pm-\frac{m\left(m-1\right)}{2}$. (10)

This is because solutions with *c* smaller than *p* would surely have one or more rows of loadings that are entirely zero. On the other hand, the *c*_{max} value reflects the fact that *m*(*m* − 1)/2 elements in Λ can be set to zero without changing the value of loss function (8).
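The bounds (10) amount to a one-line computation; a minimal sketch (function name ours):

```python
def cardinality_bounds(p, m):
    """Search range (10) for the cardinality c = Card(Lambda)."""
    c_min = p                          # fewer nonzeros would leave a zero row
    c_max = p * m - m * (m - 1) // 2   # m(m-1)/2 loadings can be zeroed freely
    return c_min, c_max
```

For the p = 12, m = 3 setting used later, this gives the range [12, 33].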

Now, let us discuss how our proposed procedure is related to the set
${\mathbb{S}}_{B}$ containing all possible B, using
${\mathbb{S}}_{B}\left(c\right)$ for the subset of
${\mathbb{S}}_{B}$ that contains the matrices B with *c *nonzeros. The number of B in
${\mathbb{S}}_{B}\left(c\right)$ is given by

${N}_{B}\left(c\right)={}_{pm}\text{C}{}_{c}$, (11)

*i.e.*, the number of ways of choosing the *c* elements that equal one among all *pm* elements in B. Thus, the number of all B contained in
${\mathbb{S}}_{B}$ is given by

${N}_{B}={\displaystyle {\sum}_{c={c}_{\mathrm{min}}}^{{c}_{\mathrm{max}}}{N}_{B}\left(c\right)}$ (12)

with (10) and (11). For example, when *p* = 12 and *m* = 3, ${N}_{B}\left({c}_{\mathrm{min}}\right)={}_{36}\text{C}{}_{12}$ alone is approximately 1.25 × 10^{9}, and the total (12) is close to 2^{*pm*} ≈ 6.87 × 10^{10}. Such numbers are enormous, which illustrates problem [P2]: exhaustively considering all B in ${\mathbb{S}}_{B}$ is computationally infeasible.
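The scale of ${\mathbb{S}}_{B}$ can be checked by summing (11) over the range (10); a small sketch (function name ours):

```python
from math import comb

def n_link_matrices(p, m):
    """N_B of (12): the number of link matrices B whose cardinality
    Card(B) lies in the range [c_min, c_max] of (10)."""
    c_min, c_max = p, p * m - m * (m - 1) // 2
    return sum(comb(p * m, c) for c in range(c_min, c_max + 1))
```

With p = 12 and m = 3, the single term comb(36, 12) already exceeds a billion, and the full sum stays just below 2^36.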

The remaining parts of this paper are organized as follows: in Sections 3.1 to 3.3 and in Section 3.4, we detail (5) and (6) in turn. There, it is described that Kiers’ [11] simplimax rotation is used for the optimal estimation of B subject to (4) in (5), and that the idea of this rotation also fulfills a key role in (6). Thus, the procedures for (5) and (6) are called simplimax-based CFA (SbCFA) and simplimax FA (SimpFA), respectively. After treating SbCFA (5) and SimpFA (6), we detail the whole process of the proposed procedure, *i.e.*, SimpFA (6) following SbCFA (5) with the cardinality selection by (9). The proposed procedure is then assessed in a simulation study and illustrated with real data examples before the paper is concluded.

3. Simplimax-Based Confirmatory Factor Analysis

In this section, we detail the procedure formulated as (5) with
$f\left(\Lambda ,\Psi ,\Phi |S\right)$ defined as (8). A key point is that we will use exploratory FA (EFA) followed by Kiers’ [11] simplimax rotation for estimating B optimally subject to (4) in (5). That is, in the procedure, firstly EFA is performed for a data set, secondly simplimax is applied to the EFA solution for estimating the link matrix B, and, finally, the resulting B is used for minimizing (8) subject to
$\Lambda =B\bullet\Lambda $. These three stages are described in the following subsections. The last stage can be regarded as *confirmatory FA* (*CFA*) on the basis of the B given by simplimax. We thus call the procedure in this section simplimax-based CFA (SbCFA).

3.1. Exploratory Factor Analysis

Let T denote any *m *× *m* nonsingular matrix satisfying

$\text{diag}\left(T{T}^{\prime}\right)={I}_{m}$, (13)

where diag(TT′) is the diagonal matrix whose diagonal elements are those of TT′. EFA has rotational indeterminacy, as shown next:

$\Sigma =\Lambda {T}^{-1}T{T}^{\prime}{T}^{-1}{}^{\prime}{\Lambda}^{\prime}+\Psi ={\Lambda}_{\text{T}}\Phi {{\Lambda}^{\prime}}_{\text{T}}+\Psi $ (14)

with Φ = TT′ and Λ_{T} = ΛT^{−1}. Here, the latter matrix can also be regarded as a loading matrix. Thus, even if Φ = TT′ is fixed to I*_{m}* by choosing T that meets TT′ = I*_{m}*, the loadings remain determined only up to such a rotation. Let Λ_{E} denote the loading matrix of an EFA solution with Φ = I*_{m}*.

The indeterminacy (14) implies that Λ_{T} = Λ_{E}T^{−1} and Φ = TT′ can also be EFA solutions of the factor loading and correlation matrices, respectively, with
$f\left({\Lambda}_{\text{E}},\Psi ,\Phi ={{\rm I}}_{m}|S\right)=f\left({\Lambda}_{\text{T}},\Psi ,\Phi |S\right)$. This fact is often exploited by obtaining the T that allows Λ_{T} = Λ_{E}T^{−1} to be interpretable. The procedures for obtaining T are generally referred to as factor rotation.

3.2. Simplimax Rotation

This subsection concerns the factor rotation mediating between the first EFA stage and the final CFA stage. Here, the loading matrix is unconstrained in EFA, but constrained to meet $\Lambda =B\bullet\Lambda $ in CFA as in (3). This constraint leads to

${\mathrm{min}}_{\Lambda ,\Psi ,\Phi}f\left(B\bullet\Lambda ,\Psi ,\Phi |S\right)\ge f\left({\Lambda}_{\text{E}},\Psi ,\Phi ={\rm I}|S\right)=f\left({\Lambda}_{\text{T}},\Psi ,\Phi |S\right)$, (15)

where $f\left(B\bullet\Lambda ,\Psi ,\Phi |S\right)$ stands for the loss function in (3) into which the CFA constraint $\Lambda =B\bullet\Lambda $ has been substituted. Inequality (15) implies that the value of the EFA loss function $f\left({\Lambda}_{\text{E}},\Psi ,\Phi ={\rm I}|S\right)=f\left({\Lambda}_{\text{T}},\Psi ,\Phi |S\right)$ gives the lower limit of the $f\left(B\bullet\Lambda ,\Psi ,\Phi |S\right)$ to be minimized in CFA (3). This suggests that the best attainable CFA solution is one whose ${\mathrm{min}}_{\Lambda ,\Psi ,\Phi}f\left(B\bullet\Lambda ,\Psi ,\Phi |S\right)$ value is close to its EFA counterpart.

Now, if T can be chosen in the factor rotation such that Λ_{T} = Λ_{E}T^{−1} is similar to a matrix that can be written as
$B\bullet\Lambda $ for particular matrices B and Λ, then we can expect that such B and Λ will be good candidates for giving a CFA solution with a fit close to that of EFA. This can be attained by the simplimax rotation [11], which is formulated as minimizing the simplimax function

$spx\left(B,\Lambda ,T\right)={\Vert B\bullet\Lambda -{\Lambda}_{\text{E}}{T}^{-\text{1}}\Vert}^{2}$ (16)

over B, Λ, and T subject to (4) and (13). Thus, this rotation serves as a suitable bridge between the first and final stages.

For the constrained minimization of (16), two steps are iterated alternately. In one step, the simplimax function (16) is minimized over T under (13) for a given
$B\bullet\Lambda $, using Browne’s [12] algorithm. In the other step, (16) is minimized over B = (*b_{ij}*) and Λ under (4) for a given Λ_{E}T^{−1}. The latter problem can be generalized as

${\mathrm{min}}_{B,\Lambda}spx\left(B,\Lambda |H\right)={\Vert B\bullet\Lambda -H\Vert}^{2}$ for a given *p* × *m* matrix
$H=\left({h}_{ij}\right)$, (17)

with H = Λ_{E}T^{−1} in the present context. As explained in Appendix A3, the minimum in (17) is attained for

${b}_{ij}=\left\{\begin{array}{l}1,\text{ if }{h}_{ij}^{2}\ge {h}_{\left[c\right]}^{2}\\ 0,\text{ otherwise}\end{array}\right.$ and $\Lambda =H$, (18)

with
${h}_{\left[c\right]}^{2}$ denoting the *c*th largest value of all elements in
$H\bullet H=\left({h}_{ij}^{2}\right)$. The steps of the simplimax rotation algorithm are listed in Appendix A2.
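Update (18) can be sketched as follows; the function name is ours, and ties at the threshold are not treated specially, so they may give Card(B) > *c* in rare cases:

```python
import numpy as np

def simplimax_b_lambda(H, c):
    """Update (18): B flags the c largest squared elements of H; Lambda = H."""
    H2 = H ** 2
    thresh = np.sort(H2, axis=None)[-c]   # c-th largest squared element h^2_[c]
    B = (H2 >= thresh).astype(int)
    return B, H.copy()
```

That is, the *c* largest squared elements of H are kept as nonzero loadings and all others are flagged as zeros.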

3.3. Confirmatory Factor Analysis Following Simplimax Rotation

The final stage is simply to perform CFA using B obtained by the simplimax rotation. Thus, SbCFA is formulated by making (5) concrete as

${\mathrm{min}}_{\Lambda ,\Psi ,\Phi}f\left(\Lambda ,\Psi ,\Phi |S\right)$ s.t. $\Lambda =B\bullet\Lambda $

after B is estimated s.t. $\text{Card}\left(B\right)=c$ by simplimax, (5′)

with $f\left(\Lambda ,\Psi ,\Phi |S\right)$ defined as (8). The minimization in (5) or (5′) is attained with the EM algorithm described in Appendix A1.

3.4. Simplimax Factor Analysis

We will now describe simplimax FA (SimpFA). This is the procedure minimizing (6), using the solution from SbCFA as a starting configuration. Specifically, we minimize the negative of the log likelihood (see (8)), with
$B\bullet\Lambda $ substituted for Λ, over B, Λ, Ψ, and Φ subject to Card(Λ) = *c* (see (4)). We call this procedure simplimax FA (SimpFA), as it is a new FA procedure in which the minimization of the simplimax function (17) fulfills a key role, as will be seen in the next paragraph.

The EM algorithm described in Appendix A1 can also be used for SimpFA. The algorithm for SimpFA differs from those for the other FA procedures in that it includes a step for minimizing (8) over both B and Λ with Ψ and Φ fixed. An innovative feature of the algorithm is the use of Kiers’ [13] [14] majorization method for dealing with the difficulties in treating (8) as a function of B and Λ. In that method, an auxiliary function that majorizes (8) is minimized. For this minimization,

$spx\left(B,\Lambda |W\right)={\Vert B\bullet\Lambda -W\Vert}^{2}$, (19)

*i.e.*, the simplimax function in (17) with H = W, fulfills a key role. Here, W is a matrix computed from the current value Λ_{C} of Λ together with Q and C defined in Appendix A1; see Appendix A4 for its derivation.

3.5. Multiple Runs of SimpFA Following SbCFA

Here, we describe SimpFA following SbCFA, in which SbCFA provides the initial parameter values for SimpFA, and SimpFA provides the final solution. We take a multiple-run approach in order to reduce the possibility of selecting a local minimizer as the optimal solution: the algorithm of SbCFA followed by SimpFA is run multiple times, starting SbCFA with mutually different initial values, and the best solution is selected among the resulting SimpFA solutions. The procedure is as follows:

Stage 1. For S, perform EFA to provide Λ_{E} and set *l* = 1.

Stage 2. For each of $l=1,\cdots ,100$, perform the following sub-stages:

Stage 2.1. Initialize T at T*_l* and perform simplimax. Express the resulting B as B*_l*.

Stage 2.2. For S, perform CFA using B*_l* as B. Express the resulting Λ, Ψ, and Φ as ${\tilde{\Lambda}}_{l}$, ${\tilde{\Psi}}_{l}$, and ${\tilde{\Phi}}_{l}$, respectively, with their set ${\tilde{\Theta}}_{l}=\left\{{\tilde{\Lambda}}_{l},{\tilde{\Psi}}_{l},{\tilde{\Phi}}_{l}\right\}$.

Stage 2.3. For S, perform SimpFA with Λ, Ψ, and Φ initialized at ${\tilde{\Lambda}}_{l}$, ${\tilde{\Psi}}_{l}$, and ${\tilde{\Phi}}_{l}$, respectively. Express the resulting Λ, Ψ, and Φ as Λ*_l*, Ψ*_l*, and Φ*_l*.

Stage 3. Select $\left\{\stackrel{^}{\Lambda},\stackrel{^}{\Psi},\stackrel{^}{\Phi}\right\}$ with $f\left(\stackrel{^}{\Lambda},\stackrel{^}{\Psi},\stackrel{^}{\Phi}|S\right)={\mathrm{min}}_{l}f\left({\Lambda}_{l},{\Psi}_{l},{\Phi}_{l}|S\right)$ as the optimal solution.

Here, the initial values T*_l* for T in Stage 2.1 are chosen to be mutually different matrices satisfying (13), so that the runs in Stage 2 start from different configurations.
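Stages 1 - 3 can be sketched as a driver loop. The callables `efa`, `simplimax`, `cfa`, and `simpfa` below are hypothetical stand-ins for the procedures described above, assumed to return the EFA loadings, a link matrix, an initial solution, and a (solution, loss) pair, respectively:

```python
def multi_run(S, c, efa, simplimax, cfa, simpfa, n_runs=100):
    """Multi-start procedure: run SbCFA followed by SimpFA n_runs times
    from different simplimax starts and keep the best-fitting solution."""
    Lam_E = efa(S)                             # Stage 1: EFA loadings
    best, best_f = None, float("inf")
    for l in range(1, n_runs + 1):
        B_l = simplimax(Lam_E, c, seed=l)      # Stage 2.1: rotation from start T_l
        theta0 = cfa(S, B_l)                   # Stage 2.2: SbCFA solution
        theta, f_val = simpfa(S, B_l, theta0)  # Stage 2.3: SimpFA refinement
        if f_val < best_f:                     # Stage 3: keep the loss minimizer
            best, best_f = theta, f_val
    return best, best_f
```

The multi-start design mirrors the rationale in the text: differently seeded simplimax starts reduce the risk of accepting a local minimizer.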

3.6. Selection of Cardinality by Information Criteria

The final solution
$\left\{\stackrel{^}{\Lambda},\stackrel{^}{\Psi},\stackrel{^}{\Phi}\right\}$ resulting from Stage 3 (the last subsection) depends on the cardinality value *c* in (4). We thus write
$\left\{{\stackrel{^}{\Lambda}}_{c},{\stackrel{^}{\Psi}}_{c},{\stackrel{^}{\Phi}}_{c}\right\}$ for the solution obtained with a particular value of *c*. The best value for *c* can be selected with procedure (9), which can be rewritten using
$\left\{{\stackrel{^}{\Lambda}}_{c},{\stackrel{^}{\Psi}}_{c},{\stackrel{^}{\Phi}}_{c}\right\}$ as

Choose
$\left\{{\stackrel{^}{\Lambda}}_{{c}^{\ast}},{\stackrel{^}{\Psi}}_{{c}^{\ast}},{\stackrel{^}{\Phi}}_{{c}^{\ast}}\right\}$ for the value
${c}^{\ast}=\mathrm{arg}{\mathrm{min}}_{{c}_{\mathrm{min}}\le c\le {c}_{\mathrm{max}}}IC\left(c\right)$ (20)

with *c*_{min} and *c*_{max} defined as (10).

Figure 2. Graphical illustration of Stages 1 - 3.

As *IC*(*c*) in (20), we consider using either Akaike’s [15] information criterion (*AIC*) or Schwarz’s [16] Bayesian information criterion (*BIC*), which are particularly popular information criteria. *AIC* and *BIC* can be calculated as

$AIC\left(c\right)=\text{2}f\left({\stackrel{^}{\Lambda}}_{c},{\stackrel{^}{\Psi}}_{c},{\stackrel{^}{\Phi}}_{c}|S\right)+2\kappa \left(c\right)$, (21)

$BIC\left(c\right)=\text{2}f\left({\stackrel{^}{\Lambda}}_{c},{\stackrel{^}{\Psi}}_{c},{\stackrel{^}{\Phi}}_{c}|S\right)+\mathrm{log}n\kappa \left(c\right)$, (22)

respectively, for a particular value of *c*. Here, *n* is the number of observations, and *κ*(*c*) = *c* + *α*, with *α* the number of unique variances plus the number of inter-factor correlations. Thus, (21) or (22) is substituted for *IC*(*c*) in (20). Whether (21) or (22) should be used is assessed in the next section.
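A sketch of (21) and (22); the function name is ours, and the composition κ(*c*) = *c* + *p* + *m*(*m* − 1)/2 follows our reading of the definition of *α* above:

```python
import math

def aic_bic(f_min, c, n, p, m):
    """AIC (21) and BIC (22) for the minimized loss f_min at cardinality c.
    kappa(c) = c + alpha, with alpha assumed to be the p unique variances
    plus the m(m-1)/2 inter-factor correlations."""
    kappa = c + p + m * (m - 1) // 2
    return 2.0 * f_min + 2 * kappa, 2.0 * f_min + math.log(n) * kappa
```

Both criteria penalize larger cardinalities; BIC’s log(*n*) factor penalizes them more heavily for the sample sizes used here, which matters in the next section.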

4. Simulation Study

We performed a simulation study to assess the procedure proposed in the last section with respect to: 1) how often the *c* value in (4) is selected correctly by *AIC* and *BIC*; 2) how similar the SbCFA and SimpFA solutions are; and 3) how well the parameter values are recovered. The procedures for synthesizing and analyzing the data are described in the first subsection; the results for 1), 2), and 3) are reported in the following ones.

4.1. Data Synthesis and Analysis

With *p* = 12 and *m* = 3, we set the true {Λ, Ψ, Φ} as in Table 1 and sampled *n* = 300 rows of an *n* × *p* data matrix X from the *p*-variate normal distribution whose mean vector is 0*_p* and whose covariance matrix is defined as
$\Sigma =\Lambda \Phi {\Lambda}^{\prime}+\Psi $; see (1). The resulting X provided the sample covariance matrix S. This synthesis was repeated to yield 200 data sets.

For each S, we carried out the procedure of SimpFA following SbCFA with the selection of the best *c* by (20), using both *AIC* and *BIC*. This gives the SimpFA solutions
$\left\{{\stackrel{^}{\Lambda}}_{{c}^{\ast}},{\stackrel{^}{\Psi}}_{{c}^{\ast}},{\stackrel{^}{\Phi}}_{{c}^{\ast}}\right\}$, which are classified into two types according to whether *c* was chosen by *AIC* or *BIC*.

4.2. Correctness of Selected Cardinality

Let *c*_{true} denote the true Card(Λ). As found in Table 1, *c*_{true} = 15. On the other

Table 1. True parameters in Λ, Ψ**1***_p*, and Φ, with blank cells indicating zero elements.

hand, *c*^{*} expresses the Card(Λ) selected by the proposed procedure (20). We obtained the deviation *c*^{*} − *c*_{true} for each of the 200 data sets. The averages (standard deviations) of these deviations over all data sets are 2.43 (1.40) when using *AIC* and 0.34 (0.62) when using *BIC*. This result shows that *AIC* tends to overestimate Card(Λ). Moreover, the absolute deviation |*c*^{*} − *c*_{true}| for *BIC* was smaller than that for *AIC* in 184 of the 200 data sets. These results show that *BIC* is substantially better than *AIC*. Accordingly, we only consider the *BIC*-based solutions from here on.

4.3. Equivalence of SbCFA and SimpFA Solutions

It was found that 98 percent of the 200 SimpFA solutions were equivalent to the SbCFA ones used for initializing the SimpFA parameters: no iteration in the SimpFA algorithm was required. In the remaining two percent of the runs, the SbCFA and SimpFA solutions were almost equivalent, with the average difference between the two solutions over the 200 data sets being 0.001. Here, the difference is defined as
$\left({\Vert {\stackrel{^}{\Lambda}}_{{c}^{\ast}}-{\tilde{\Lambda}}_{{c}^{\ast}}\Vert}_{1}+{\Vert {\stackrel{^}{\Psi}}_{{c}^{\ast}}-{\tilde{\Psi}}_{{c}^{\ast}}\Vert}_{1}+{\Vert {\stackrel{^}{\Phi}}_{{c}^{\ast}}-{\tilde{\Phi}}_{{c}^{\ast}}\Vert}_{1}\right)/\left\{pm+p-m\left(m-1\right)\right\}$, with
${\Vert \Gamma \Vert}_{1}={\Sigma}_{i}{\Sigma}_{j}\left|{\gamma}_{ij}\right|$ denoting the *L*_{1} norm of a matrix Γ = (*γ_{ij}*) and
$\left\{{\tilde{\Lambda}}_{{c}^{\ast}},{\tilde{\Psi}}_{{c}^{\ast}},{\tilde{\Phi}}_{{c}^{\ast}}\right\}$ being the SbCFA counterpart of
$\left\{{\stackrel{^}{\Lambda}}_{{c}^{\ast}},{\stackrel{^}{\Psi}}_{{c}^{\ast}},{\stackrel{^}{\Phi}}_{{c}^{\ast}}\right\}$.

The above results do not show that SimpFA (following SbCFA) is useless: SimpFA serves to demonstrate that its solution is equivalent to the SbCFA one, which allows us to establish the optimality of the SbCFA solution.

4.4. Recovery of Parameters

We assess how well the true parameter values are recovered by the SimpFA solution
${\stackrel{^}{\Theta}}_{{c}^{\ast}}=\left\{{\stackrel{^}{\Lambda}}_{{c}^{\ast}},{\stackrel{^}{\Psi}}_{{c}^{\ast}},{\stackrel{^}{\Phi}}_{{c}^{\ast}}\right\}$, which is provided by (20) with *IC*(*c*) = *BIC*(*c*). As the indices standing for the badness of recovery for Λ, Ψ, and Φ, we obtained
${\Vert {\stackrel{^}{\Lambda}}_{{c}^{\ast}}-{\Lambda}_{\text{true}}\Vert}_{1}/\left(pm\right)$,
${\Vert {\stackrel{^}{\Psi}}_{{c}^{\ast}}-{\Psi}_{\text{true}}\Vert}_{1}/p$, and
${\Vert {\stackrel{^}{\Phi}}_{{c}^{\ast}}-{\Phi}_{\text{true}}\Vert}_{1}/M$, respectively, with *M* the number of elements compared in Φ. Further, for assessing incorrectness in identifying the true zero and nonzero loadings, we recorded *MIR*_{0} = (the number of true zero loadings estimated as nonzero)/(*pm* − *c*) and *MIR*_{#} = (the number of true nonzero loadings estimated as zero)/*c*.
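The two misidentification rates can be sketched as follows (function name ours; zero/nonzero status is read off the loading matrices directly, and the proportions are taken over the true zero and true nonzero positions, respectively):

```python
import numpy as np

def misidentification_rates(Lam_true, Lam_hat):
    """MIR_0: proportion of true zero loadings estimated as nonzero.
    MIR_#: proportion of true nonzero loadings estimated as zero."""
    true_zero = Lam_true == 0
    true_nonzero = ~true_zero
    mir0 = np.mean(Lam_hat[true_zero] != 0)
    mir_sharp = np.mean(Lam_hat[true_nonzero] == 0)
    return mir0, mir_sharp
```

Both rates are zero when the estimated link pattern matches the true one exactly.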

The statistics of the resulting five index values over the 200 data sets are presented in Table 2. It shows that the proposed procedure recovered
$\left\{{\stackrel{^}{\Lambda}}_{{c}^{\ast}},{\stackrel{^}{\Psi}}_{{c}^{\ast}},{\stackrel{^}{\Phi}}_{{c}^{\ast}}\right\}$, *i.e.*, the links of variables to factors and the parameter values, fairly well, except that the 95th percentile for the factor correlations can be considered large. These results allow us to conclude that the true CFA model can be recovered satisfactorily, though the estimated factor correlations might deviate from their true counterparts in a few cases.

5. Real Data Demonstration

In order to demonstrate how useful our proposed procedure is, we apply it to two data sets that have already been analyzed by CFA with zero constraints selected by users. The solutions of the latter user-based CFA are compared with those for our proposed procedure.

The first data set is Carlson and Mulaik’s [17] personality trait data matrix of *n* = 280 (participants) × *p* = 15 (personality traits). Its correlation matrix (Mulaik, 2010, p. 198) has been analyzed by Mulaik [2] using CFA with his selected constraints, and the solution is shown in Mulaik (2010, p. 437) and also in Table 3, together with the solution of the proposed procedure. Here, the latter solution is the same as the SbCFA one before SimpFA: no iteration was required in the SimpFA algorithm. Comparing the *BIC* values in Table 3, we find that the solution of the proposed procedure is better than Mulaik’s [2].

The second data set is Kojima’s housing preference data with *n* = 1120 (participants) and *p* = 13 (features). It describes to what degree the participants wish to

Table 2. Percentiles, average (Avg) and standard deviation (SD) of each index over 200 data sets.

Table 3. Solution for personality trait data, with blank cells indicating zero elements.

live in houses with the features described by the variables. The correlation matrix for this data set is presented in Table 4, as the original source is written in Japanese and not easily available. This matrix has been analyzed by Kojima using CFA with his selected constraints, and his solution is shown in Table 5 together with the solution of the proposed procedure. Here, the latter solution differs from its SbCFA counterpart: iteration was required in the SimpFA algorithm. In Table 5, the *BIC* values show that the solution of the proposed procedure is better.

In both of the above examples, the proposed procedure gave better *BIC* values, but its solutions are similar to the counterparts of CFA with user-selected constraints (Table 3 and Table 5). This similarity shows that the selected constraints were rational. However, the proposed procedure is fully computational and does not require any user effort for specifying constraints. This property is helpful as soon as there is some uncertainty as to which loadings should be constrained to zero.

6. Conclusions

A problem in confirmatory factor analysis (CFA) is that users must select

Table 4. Correlation matrix for Kojima’s (2013) housing preference data.

Table 5. Solution for housing preference data, with blank cells indicating zero elements.

what pairs of variables and factors are linked, in other words, which loadings are to be zero or nonzero, in a subjective manner. To deal with this problem, we proposed the procedure of SimpFA following SbCFA for computationally obtaining an optimally suitable CFA model and its solution without relying on any user judgment. The simulation study showed that the true CFA model and parameter values can be recovered fairly well by the proposed procedure. Real data examples demonstrated that it can outperform CFA with user-selected constraints in terms of the *BIC* statistic.

In Section 4.3, we found the SbCFA solutions to be equivalent to the SimpFA ones in almost all cases. In particular, this good performance of SbCFA was somewhat surprising. A theoretical study of the reasons for this performance is a subject for future research.

Appendix

A1. EM Algorithm for Factor Analysis

In Rubin and Thayer’s (1982) EM algorithm for FA, an auxiliary function of (8) is defined as

$\varphi \left(\Lambda ,\Psi ,\Phi |S\right)=\text{log}\left|\Psi \right|+\text{tr}\text{\hspace{0.05em}}{\Psi}^{-\text{1}}\left(S-\text{2}C{\Lambda}^{\prime}+\Lambda Q{\Lambda}^{\prime}\right)+\text{log}\left|\Phi \right|+\text{tr}\text{\hspace{0.05em}}Q{\Phi}^{-\text{1}}$. (A1)

Here, C = SA and Q = A′SA + U, with A and U computed as

$A={\left(\Lambda \Phi {\Lambda}^{\prime}+\Psi \right)}^{-\text{1}}\Lambda \Phi $ and $U={\Phi}^{1/2}{\left[{I}_{m}+\left({\Phi}^{1/2}{\Lambda}^{\prime}\right){\Psi}^{-\text{1}}\left(\Lambda {\Phi}^{1/2}\right)\right]}^{-\text{1}}{\Phi}^{1/2}$ (A2)

using the current Λ, Ψ, and Φ values. In the EM algorithm, (A1) rather than (8) is minimized iteratively. This minimization is known to allow (8) to be minimized (Rubin & Thayer, 1982). This fact also holds true when Λ is constrained as $\Lambda =B\bullet\Lambda $. Thus, the minimization of (8) in EFA, SbCFA (5) or (5′), and SimpFA (6) or (6′) is attained with the EM algorithm.

The EM algorithm for EFA, SbCFA, and SimpFA can be summarized as follows:

Step 1. Initialize Λ, Ψ, and Φ, with B also initialized in SimpFA.

Step 2. Obtain C and Q in (A1) through (A2).

Step 3. Minimize (A1) over Ψ with Λ and Φ fixed.

Step 4. Minimize (A1) over Λ with Ψ and Φ fixed, with “over Λ” replaced by “over B and Λ under (4)” only in SimpFA.

Step 5. Minimize (A1) over Φ with Ψ and Λ fixed.

Step 6. Transform Φ and Λ so that Φ is a correlation matrix; finish if convergence is reached, otherwise go back to Step 2.

Here, the minimization in Step 3 is attained when the *i*th diagonal element of Ψ is updated as ${\psi}_{ii}={s}_{ii}-\text{2}{{\lambda}^{\prime}}_{i}{c}_{i}+{{\lambda}^{\prime}}_{i}Q{\lambda}_{i}$ ( $i=\text{1},\cdots ,p$ ), with ${{\lambda}^{\prime}}_{i}$ the *i*th row of Λ, ${{c}^{\prime}}_{i}$ that of C, and *s*_{ii} the (*i*, *i*)th element of S.
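As a concrete illustration, the E-step quantities (A2) and the Ψ update of Step 3 can be sketched in NumPy as follows. This is a minimal sketch, not the authors’ reference implementation; the Cholesky factor of Φ is used in place of the symmetric square root Φ^{1/2}, which yields the same U.

```python
import numpy as np

def em_step(S, Lam, Psi, Phi):
    """One E-step plus the Psi update for FA, following (A1)-(A2).
    Psi is the p x p diagonal uniqueness matrix, Phi the m x m
    factor correlation matrix (a sketch, not the authors' code)."""
    m = Lam.shape[1]
    Ph = np.linalg.cholesky(Phi)  # any factor with Ph Ph' = Phi gives the same U
    # (A2): A = (Lam Phi Lam' + Psi)^{-1} Lam Phi
    A = np.linalg.solve(Lam @ Phi @ Lam.T + Psi, Lam @ Phi)
    # (A2): U = Phi^{1/2} [I_m + Phi^{1/2} Lam' Psi^{-1} Lam Phi^{1/2}]^{-1} Phi^{1/2}
    inner = np.eye(m) + Ph.T @ Lam.T @ np.linalg.solve(Psi, Lam) @ Ph
    U = Ph @ np.linalg.solve(inner, Ph.T)
    C = S @ A                # C = S A
    Q = A.T @ S @ A + U      # Q = A' S A + U
    # Step 3: psi_ii = s_ii - 2 lam_i' c_i + lam_i' Q lam_i
    psi = np.diag(S) - 2 * np.sum(Lam * C, axis=1) + np.sum((Lam @ Q) * Lam, axis=1)
    return C, Q, np.diag(psi)
```

The two `np.sum(..., axis=1)` terms compute the row-wise inner products $\lambda_i' c_i$ and $\lambda_i' Q \lambda_i$ for all *p* rows at once.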

The initialization in Step 1 is detailed here. In EFA, the eigenvalue decomposition of S defined as S = LD^{2}L′ is used, with D^{2} the *p* × *p* diagonal matrix whose diagonal elements are arranged in descending order, and LL′ = I_{p}. That is, the initial Λ is set to the first *m* columns of LD, with the initial Ψ set to the diagonal part of S − ΛΛ′ and the initial Φ to I_{m}.
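A standard eigenvalue-based initialization consistent with S = LD^{2}L′ can be sketched as follows; this is an illustrative sketch, and the exact variant used in the paper may differ (the clipping floor is our safeguard, not from the paper).

```python
import numpy as np

def init_efa(S, m):
    """Initialize Lam and Psi from the eigendecomposition S = L D^2 L'
    with eigenvalues in descending order (a sketch of Step 1)."""
    evals, L = np.linalg.eigh(S)            # ascending eigenvalues
    order = np.argsort(evals)[::-1]          # reorder to descending
    evals, L = evals[order], L[:, order]
    Lam = L[:, :m] * np.sqrt(evals[:m])      # first m columns of L D
    # Uniquenesses: diagonal part of S - Lam Lam', floored to stay positive
    Psi = np.diag(np.clip(np.diag(S - Lam @ Lam.T), 1e-6, None))
    return Lam, Psi
```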

The minimization in Step 4 is attained by the update of Λ with Λ = CQ^{−1} in EFA and by the row-wise update ${\lambda}_{i}={{B}^{\prime}}_{i}{\left({B}_{i}Q{{B}^{\prime}}_{i}\right)}^{-\text{1}}{B}_{i}{c}_{i}$ ( $i=\text{1},\cdots ,p$ ) in SbCFA. Here, B_{i} is the matrix whose rows are the rows of I_{m} corresponding to the unit elements in the *i*th row of B; for example, if that row is [1, 0, 1],

then ${B}_{i}=\left[\begin{array}{lll}1\hfill & 0\hfill & 0\hfill \\ 0\hfill & 0\hfill & 1\hfill \end{array}\right]$. Step 4 in SimpFA is detailed in Appendix A4.
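In code, the row-wise SbCFA update amounts to solving a small linear system on the columns that B_{i} selects, since B_{i}QB′_{i} is just the submatrix of Q on those columns. A minimal NumPy sketch (the function name and interface are ours, not from the paper):

```python
import numpy as np

def update_loadings_sbcfa(C, Q, B):
    """Row-wise constrained update lam_i = B_i'(B_i Q B_i')^{-1} B_i c_i,
    with B the p x m 0/1 loading-pattern matrix and C, Q as in (A1)."""
    p, m = B.shape
    Lam = np.zeros((p, m))
    for i in range(p):
        idx = np.flatnonzero(B[i])      # columns where b_ij = 1
        if idx.size == 0:
            continue                    # an all-zero row keeps zero loadings
        # B_i Q B_i' = Q restricted to the selected columns; B_i' scatters back
        Lam[i, idx] = np.linalg.solve(Q[np.ix_(idx, idx)], C[i, idx])
    return Lam
```

With B all ones the update reduces to the unconstrained EFA update Λ = CQ^{−1}, one row at a time.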

A2. Algorithm for Simplimax Rotation

The algorithm for the simplimax rotation can be summarized as follows:

Step 1. Initialize T

Step 2. Update B and Λ as (18) with H = Λ_{E}T^{−1}

Step 3. Update T with Browne’s (1972) algorithm

Step 4. Finish if convergence is reached; otherwise, go back to Step 2.

How T is initialized is described in the section on the whole process of the proposed procedure. Convergence is defined in this paper as the decrease in (16) from the previous round being less than 10^{−6}.

A3. Derivation of (18)

The simplimax function $spx\left(B,\Lambda |H\right)={\Vert B\bullet \Lambda -H\Vert}^{2}$ in (17) can be rewritten as

$spx\left(B,\Lambda |H\right)={\displaystyle {\sum}_{\left(i,j\right)\in \aleph}{\left({b}_{ij}{\lambda}_{ij}-{h}_{ij}\right)}^{2}}+{\displaystyle {\sum}_{\left(i,j\right)\in {\aleph}^{\perp}}{h}_{ij}^{2}}\ge {\displaystyle {\sum}_{\left(i,j\right)\in {\aleph}^{\perp}}{h}_{ij}^{2}}$. (A3)

with H = (*h_{ij}*). Here, $\aleph $ denotes the set of the index pairs (*i*, *j*) with ${b}_{ij}=1$, and ${\aleph}^{\perp}$ its complement, i.e., the set of the pairs with ${b}_{ij}=0$. The lower bound in (A3) is attained by setting ${\lambda}_{ij}={h}_{ij}$ for every $\left(i,j\right)\in \aleph $, and that bound is minimized over B by choosing ${\aleph}^{\perp}$ as the set of the index pairs giving the smallest ${h}_{ij}^{2}$ values, which yields (18).
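The minimization of (A3) for a fixed number of zero elements can be illustrated numerically: zeroing the *q* cells with the smallest $h_{ij}^2$ and setting $\lambda_{ij}=h_{ij}$ elsewhere attains the lower bound. A sketch (*q* and the function name are our notation):

```python
import numpy as np

def simplimax_update(H, q):
    """Minimize ||B*Lam - H||^2 over a 0/1 pattern B with q zeros and
    over Lam: the q cells with the smallest h_ij^2 get b_ij = 0, and
    lam_ij = h_ij wherever b_ij = 1 (a sketch of the (A3) argument)."""
    zero_idx = np.argsort((H**2).ravel())[:q]   # q smallest squared elements
    B = np.ones(H.size)
    B[zero_idx] = 0.0
    B = B.reshape(H.shape)
    Lam = B * H                       # lam_ij = h_ij on the nonzero cells
    loss = np.sum((B * Lam - H)**2)   # equals the sum of the q smallest h_ij^2
    return B, Lam, loss
```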

A4. Majorization Algorithm for SimpFA Loadings

Let us rewrite (A1) as ${\varphi}^{*}\left(\Lambda \right)+const$, where *const* is the part independent of $\Lambda =B\bullet \Lambda $ and

${\varphi}^{*}\left(\Lambda \right)=-\text{2tr}{\Psi}^{-\text{1}}C{\Lambda}^{\prime}+\text{tr}{\Psi}^{-\text{1}}\Lambda Q{\Lambda}^{\prime}=\text{tr}{\Psi}^{-\text{1}}\Lambda Q{\Lambda}^{\prime}-\text{2tr}{C}^{\prime}{\Psi}^{-\text{1}}\Lambda $. (A4)

This minimization over $\Lambda =B\bullet \Lambda $ is the task of Step 4 in Appendix A1 for SimpFA. Using $\Delta =\Lambda -{\Lambda}_{\text{C}}$, i.e., $\Lambda ={\Lambda}_{\text{C}}+\Delta $, with Λ_{C} the current Λ value, (A4) can be rewritten as

$\begin{array}{c}{\varphi}^{*}\left(\Lambda \right)=\text{tr}{\Psi}^{-\text{1}}\left({\Lambda}_{\text{C}}+\Delta \right)Q{\left({\Lambda}_{\text{C}}+\Delta \right)}^{\prime}-\text{2tr}{C}^{\prime}{\Psi}^{-\text{1}}\left({\Lambda}_{\text{C}}+\Delta \right)\\ ={\varphi}^{*}\left({\Lambda}_{\text{C}}\right)+\text{tr}{\Psi}^{-\text{1}}\Delta Q{\Delta}^{\prime}-2\beta \text{tr}{V}^{\prime}\Delta \end{array}$ (A4′)

with $V={\beta}^{-1}{\Psi}^{-\text{1}}\left(C-{\Lambda}_{\text{C}}Q\right)$, where *β* > 0 is a constant specified below. Kiers (1990, Theorem 1) shows that $\text{tr}{\Psi}^{-\text{1}}\Delta Q{\Delta}^{\prime}$ in (A4′) satisfies the inequality $\text{tr}{\Psi}^{-\text{1}}\Delta Q{\Delta}^{\prime}\le \alpha {\Vert \Delta \Vert}^{2}$, where *α* is the greatest eigenvalue of ${\Psi}^{-\text{1}}\otimes Q$, with $\otimes $ denoting the Kronecker product. Using that inequality with *β* ≥ *α*, a function majorizing (A4′) can be defined as

$\eta \left(\Lambda \right)={\varphi}^{*}\left({\Lambda}_{\text{C}}\right)+\beta {\Vert \Delta \Vert}^{\text{2}}-2\beta \text{tr}{V}^{\prime}\Delta ={\varphi}^{*}\left({\Lambda}_{\text{C}}\right)+\beta {\Vert \Delta -V\Vert}^{\text{2}}-\beta {\Vert V\Vert}^{\text{2}}$. (A5)

The majorization relation ${\varphi}^{*}\left({\Lambda}_{\text{C}}\right)=\eta \left({\Lambda}_{\text{C}}\right)\ge \eta \left(\Lambda \right)\ge {\varphi}^{*}\left(\Lambda \right)$ holds if Λ is the minimizer of $\eta \left(\Lambda \right)$. Since $\beta {\Vert \Delta -V\Vert}^{\text{2}}$ is the only term on the right side of (A5) that depends on Λ, with *β* ≥ *α* > 0, the task of Step 4 is attained by minimizing ${\eta}^{*}\left(\Lambda \right)={\Vert \Delta -V\Vert}^{\text{2}}$. From $\Delta =\Lambda -{\Lambda}_{\text{C}}$, we can rewrite ${\eta}^{*}\left(\Lambda \right)$ as ${\eta}^{*}\left(\Lambda \right)={\Vert \Lambda -{\Lambda}_{\text{C}}-V\Vert}^{\text{2}}={\Vert \Lambda -W\Vert}^{\text{2}}$ with $W={\Lambda}_{\text{C}}+V$, which is the simplimax function (19). Thus, the B and Λ to be obtained in Step 4 are given by (18), with H = (*h_{ij}*) and ${h}_{\left[c\right]}^{2}$ replaced by W = (*w_{ij}*) and ${w}_{\left[c\right]}^{2}$, respectively.
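One majorization update of Step 4 can be sketched in NumPy. In this sketch we take the smallest valid *β* = *α*, using the fact that the greatest eigenvalue of Ψ^{−1} ⊗ Q equals λmax(Ψ^{−1})·λmax(Q); V is taken as β^{−1}Ψ^{−1}(C − Λ_{C}Q) so that W = Λ_{C} + V, and the zero pattern B is placed at the *q* smallest squared elements of W. The number of zeros *q* and the function interface are our assumptions, not the paper’s notation.

```python
import numpy as np

def simpfa_loading_step(C, Q, Psi, Lam_c, q):
    """One majorization update of (B, Lam) for SimpFA Step 4
    (an illustrative sketch of Appendix A4, not the authors' code)."""
    # beta = alpha: largest eigenvalue of Psi^{-1} (x) Q, both p.s.d.
    alpha = np.linalg.eigvalsh(np.linalg.inv(Psi)).max() * np.linalg.eigvalsh(Q).max()
    V = np.linalg.solve(Psi, C - Lam_c @ Q) / alpha   # beta^{-1} Psi^{-1}(C - Lam_c Q)
    W = Lam_c + V
    # Simplimax-type update: zero the q cells with the smallest w_ij^2
    zero_idx = np.argsort((W**2).ravel())[:q]
    B = np.ones(W.size)
    B[zero_idx] = 0.0
    B = B.reshape(W.shape)
    return B, B * W
```

By the majorization argument, this update cannot increase the loss φ*(Λ) as long as the current Λ_{C} already satisfies the zero-pattern constraint.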

Conflicts of Interest

The authors declare no conflicts of interest.

[1] Adachi, K. (2019) Factor Analysis: Latent Variable, Matrix Decomposition, and Constrained Uniqueness Formulations. WIREs Computational Statistics, 11, e1458. https://doi.org/10.1002/wics.1458

[2] Mulaik, S.A. (2010) Foundations of Factor Analysis. 2nd Edition, CRC Press, Boca Raton.

[3] Yanai, H. and Ichikawa, M. (2007) Factor Analysis. In: Rao, C.R. and Sinharay, S., Eds., Handbook of Statistics, Vol. 26, Elsevier, Amsterdam, 257-296. https://doi.org/10.1016/S0169-7161(06)26009-7

[4] Arbuckle, J.L. (2008) AMOS 17 User’s Guide. SPSS Inc., Chicago.

[5] Bollen, K.A. (1989) Structural Equations with Latent Variables. Wiley, New York. https://doi.org/10.1002/9781118619179

[6] Jöreskog, K.G. (1969) A General Approach to Confirmatory Maximum Likelihood Factor Analysis. Psychometrika, 34, 183-202.

[7] Lawley, D.N. (1941) Further Investigation of Factor Estimation. Proceedings of the Royal Society of Edinburgh A, 61, 176-185. https://doi.org/10.1017/S0080454100006178

[8] Rubin, D.B. and Thayer, D.T. (1982) EM Algorithms for ML Factor Analysis. Psychometrika, 47, 69-76. https://doi.org/10.1007/BF02293851

[9] Adachi, K. (2013) Factor Analysis with EM Algorithm Never Gives Improper Solutions When Sample and Initial Parameter Matrices Are Proper. Psychometrika, 78, 380-394. https://doi.org/10.1007/s11336-012-9299-8

[10] Konishi, S. and Kitagawa, G. (2008) Information Criteria and Statistical Modeling. Springer, New York. https://doi.org/10.1007/978-0-387-71887-3

[11] Kiers, H.A.L. (1994) Simplimax: Oblique Rotation to an Optimal Target with Simple Structure. Psychometrika, 59, 567-579. https://doi.org/10.1007/BF02294392

[12] Browne, M.W. (1972) Oblique Rotation to a Partially Specified Target. British Journal of Mathematical and Statistical Psychology, 25, 207-212.

[13] Kiers, H.A.L. (1990) Majorization as a Tool for Optimizing a Class of Matrix Functions. Psychometrika, 55, 417-428. https://doi.org/10.1007/BF02294758

[14] Kiers, H.A.L. (2002) Setting Up Alternating Least Squares and Iterative Majorization Algorithms for Solving Various Matrix Optimization Problems. Computational Statistics and Data Analysis, 41, 157-170. https://doi.org/10.1016/S0167-9473(02)00142-1

[15] Akaike, H. (1974) A New Look at the Statistical Model Identification. IEEE Transactions on Automatic Control, 19, 716-723. https://doi.org/10.1109/TAC.1974.1100705

[16] Schwarz, G. (1978) Estimating the Dimension of a Model. Annals of Statistics, 6, 461-464. https://doi.org/10.1214/aos/1176344136

[17] Carlson, M. and Mulaik, S.A. (1993) Trait Ratings from Descriptions of Behavior as Mediated by Components of Meaning. Multivariate Behavioral Research, 28, 111-159. https://doi.org/10.1207/s15327906mbr2801_7


Copyright © 2023 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.