A New Algorithm for Generalized Least Squares Factor Analysis with a Majorization Technique

Kohei Adachi

Graduate School of Human Sciences, Osaka University, Osaka, Japan.

**DOI:** 10.4236/ojs.2015.53020


Factor analysis (FA) is a time-honored multivariate analysis procedure for exploring the factors underlying observed variables. In this paper, we propose a new algorithm for generalized least squares (GLS) estimation in FA. In the algorithm, a majorization step and diagonal steps are alternately iterated until convergence is reached, where Kiers and ten Berge’s (1992) majorization technique is used for the former step, and the latter steps are formulated as minimizing simple quadratic functions of diagonal matrices. This procedure is named the majorizing-diagonal (MD) algorithm. In contrast to the existing gradient approaches, differential calculus is not used and only elementary matrix computations are required in the MD algorithm. A simulation study shows that the proposed MD algorithm recovers parameters better than the existing algorithms.

Keywords

Exploratory Factor Analysis, Generalized Least Squares Estimation, Matrix Computations, Majorization

Share and Cite:

Adachi, K. (2015) A New Algorithm for Generalized Least Squares Factor Analysis with a Majorization Technique. *Open Journal of Statistics*, **5**, 165-172. doi: 10.4236/ojs.2015.53020.

1. Introduction

Using for a

(1)

with a p-

(2)

Here, is the matrix of zeros, is the identity matrix, and is the diagonal matrix whose diagonal elements are called

(3)

[1] [2] . A main purpose of FA is to estimate the parameter matrices and from the inter-variable sample covariance matrix corresponding to (3). Some authors classify FA as exploratory (EFA) or

(4)

respectively,

In all

For all of the LS, GLS, and ML

As found in the above discussion, an

The MD algorithm is not the

The remaining parts of this paper are organized as follows: the MD algorithm is detailed in the next section, and it is illustrated with a real data set in Section 3. A simulation study for assessing the algorithm is reported in Section 4, which is followed by discussions.

2. Proposed Algorithm

We propose the MD algorithm for minimizing the GLS loss function (4) over the loadings in and the unique variances in the diagonal matrix . Here, it is supposed that the sample covariance matrix is positive-definite and is of full-column rank, i.e., its rank is with. This supposition and the covariance matrix being modeled as (3) imply that, without loss of generality, we can reparameterize as

(5)

where is a matrix satisfying

(6)

and is an positive-definite diagonal matrix. By substituting (5) into the GLS loss function (4), it is rewritten as

(7)

This function is minimized over, , and subject to (6) and the latter two matrices being diagonal ones, by alternately iterating the majorizing and diagonal steps described in the next subsections.
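As a rough illustration of the quantities involved, the following Python sketch builds a model covariance matrix from a column-orthonormal matrix and two diagonal matrices, and evaluates a GLS discrepancy. Equations (4), (5), and (7) are not reproduced here, so the sketch assumes the Jöreskog-Goldberger form of the GLS loss, 0.5 tr[(S^{-1}Σ − I)^2], and a loading matrix equal to the product of a column-orthonormal matrix and a positive diagonal matrix; the scaling of the loss in (4) may differ. All names (`model_covariance`, `gls_loss`, `A`, `d`, `psi`) are hypothetical.

```python
import numpy as np

def model_covariance(A, d, psi):
    """Sigma = (A D)(A D)' + Psi with D = diag(d), Psi = diag(psi) (assumed form)."""
    Lam = A * d                      # same as A @ np.diag(d): scales column j by d[j]
    return Lam @ Lam.T + np.diag(psi)

def gls_loss(S, Sigma):
    """Assumed GLS discrepancy 0.5 * tr[(S^{-1} Sigma - I)^2]."""
    R = np.linalg.solve(S, Sigma) - np.eye(S.shape[0])
    return 0.5 * np.trace(R @ R)

rng = np.random.default_rng(0)
p, m = 6, 2                                        # illustrative sizes only
A, _ = np.linalg.qr(rng.standard_normal((p, m)))   # column-orthonormal: A'A = I
d = np.array([1.5, 0.8])                           # positive diagonal elements
psi = rng.uniform(0.2, 0.6, size=p)                # positive unique variances

Sigma = model_covariance(A, d, psi)
loss_at_truth = gls_loss(Sigma, Sigma)                     # S equal to the model covariance
loss_perturbed = gls_loss(Sigma + 0.1 * np.eye(p), Sigma)  # S differing from it
```

Under these assumptions the loss vanishes when the sample covariance matrix coincides with the model covariance matrix and is positive otherwise.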

2.1. Majorization Step

Let us consider minimizing (7) over subject to (6) while and are kept fixed. Summarizing the parts irrelevant to in (7) into, the loss function (7) is rewritten as

. (8)

Though the optimal

According to the formula, the update of by

(9)

decreases the value of (8) with. Here, stands for the matrix before the update; and are the column-orthonormal matrices that are obtained from the SVD defined as

(10)

with the diagonal matrix including the singular values of the matrix in the left-hand side, and, , and the largest eigenvalues of, , and, respectively.

2.2. Diagonal Steps

In this section, we describe updating each of diagonal matrices and. First, let us consider minimizing the loss function (7) over with keeping and fixed. Since the terms relevant to in the loss function (7) are the same as those relevant to, the expression (8) into which (7) is rewritten is to be noted again. By taking account of the fact that is a diagonal matrix, (8) can be rewritten as

(11)

Here, and with denoting the diagonal matrix whose diagonal elements are those of the parenthesized matrix. Further, we can rewrite (11) as with denoting the Frobenius norm. It shows that the function (11) is minimized for

(12)

for fixed and.

Next, we consider minimizing (7) over with and fixed. Summarizing the parts irrelevant to in (7) into and using the fact of being a diagonal matrix, the loss function (7) can be rewritten as

(13)

with and. We can find that (13) is minimized for

(14)

for fixed and.

2.3. Whole Algorithm

The results in the last two subsections show that the proposed MD algorithm can be listed as follows:

Step 1. Initialize, , and.

Step 2. Update with (9) times.

Step 3. Update with (12).

Step 4. Update with (14).

Step 5. Finish with set to (5) if convergence is reached; otherwise, return to Step 2.

It should be noted in Step 2 that the update of by (9) does not minimize (7) but only decreases its value, which implies that the update can be repeated ( times) to decrease the value of (7) further. In this paper, we set.

In Step 1, the initialization is performed using the principal component analysis of sample covariance matrix. That is, the initial and are given by and, respectively, with the diagonal matrix whose diagonal elements are the largest eigenvalues of, and the columns of being the eigenvectors corresponding to. The initial is set to.
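A minimal sketch of such a principal-component initialization is given below. The text states only that the initial values come from the largest eigenvalues of the sample covariance matrix and their eigenvectors; the specific choices here (square roots of the eigenvalues for the diagonal matrix, and the diagonal of the residual for the unique variances) are assumptions, and the function name `pca_init` is hypothetical.

```python
import numpy as np

def pca_init(S, m):
    """PCA-based starting values from the m largest eigenpairs of S (a sketch)."""
    evals, evecs = np.linalg.eigh(S)        # S symmetric; eigenvalues ascending
    top = np.argsort(evals)[::-1][:m]       # indices of the m largest eigenvalues
    delta = evals[top]
    K = evecs[:, top]                       # corresponding eigenvectors as columns
    A0 = K                                  # column-orthonormal starting matrix
    d0 = np.sqrt(delta)                     # assumption: diagonal start = sqrt(eigenvalues)
    psi0 = np.diag(S - (K * delta) @ K.T).copy()  # assumption: residual diagonal
    return A0, d0, psi0

rng = np.random.default_rng(1)
X = rng.standard_normal((200, 6))           # illustrative data only
S = np.cov(X, rowvar=False)
A0, d0, psi0 = pca_init(S, m=2)
```

Because the residual after removing the leading eigen-components is positive semi-definite, the starting unique variances in this sketch are non-negative, and strictly positive for a positive-definite sample covariance matrix.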

In Step 5, we define the convergence as the decrease in the value of (7) or (4) from the previous round being less than.

3. Illustration

In this section, we illustrate the performance of the MD algorithm with a 190-

We carried out the MD algorithm for the

Table 1. Loadings and unique variances $\Psi 1_{p}$ for personality rating data.

Figure 1. Change in the GLS loss function value as a function of the number of iterations.

As the

4. Simulation Study

A simulation study was performed to assess how well the parameter matrices are recovered by the proposed MD algorithm and to compare its recovery with that of the existing algorithms for GLS estimation. We first describe the procedure for synthesizing the data to be analyzed and then report the results.

An n-observations × p-variables data matrix was synthesized according to the matrix versions of the FA model (1) and the assumptions in (2):

(15)

(16)

Step 1. Draw from, from, and from, with denoting the discrete uniform distribution defined for the integers within the range.

Step 2. Draw each loading in from and each unique variance in from with denoting the uniform distribution over the range.

Step 3. Draw each element of in (15) from which is followed by centering and post-multiplying it by the matrix that allows the resulting to satisfy (16).

Step 4. Form with (15) and obtain the covariance matrix.
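The synthesis steps can be sketched in Python as follows. Since (15) and (16) and the sampling ranges are not reproduced here, the sketch rests on stated assumptions: the data matrix is formed as X = FΛ' + EΨ^{1/2}, the constraint (16) is taken to mean that the centered block matrix Z = [F, E] satisfies Z'Z/n = I, and the uniform ranges and sizes are hypothetical placeholders. The function name `synthesize` is also hypothetical.

```python
import numpy as np

def synthesize(n, Lam, psi, rng):
    """Generate X = F Lam' + E Psi^{1/2} with (1/n)[F,E]'[F,E] = I (assumed form)."""
    p, m = Lam.shape
    Z = rng.uniform(-1.0, 1.0, size=(n, m + p))   # hypothetical range; needs n > m + p
    Z -= Z.mean(axis=0)                           # column-centering
    C = Z.T @ Z / n
    L = np.linalg.cholesky(C)                     # C = L L'
    Z = np.linalg.solve(L, Z.T).T                 # Z L^{-T}, so that Z'Z/n = I
    F, E = Z[:, :m], Z[:, m:]                     # common and unique factor scores
    X = F @ Lam.T + E * np.sqrt(psi)              # E * sqrt(psi) == E @ diag(sqrt(psi))
    S = X.T @ X / n                               # covariance of the centered data
    return X, S

rng = np.random.default_rng(2)
n, p, m = 200, 6, 2                               # illustrative sizes only
Lam = rng.uniform(-1.0, 1.0, size=(p, m))         # hypothetical loading range
psi = rng.uniform(0.1, 0.7, size=p)               # hypothetical unique-variance range
X, S = synthesize(n, Lam, psi, rng)
```

Under the assumed form of (16), the resulting covariance matrix equals ΛΛ' + Ψ up to rounding error; if the constraint in (16) is weaker than assumed here, that exact equality would not hold.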

In Step 3 we have used a uniform distribution for, rather than the normal distribution typically used for such a matrix, as a feature of the GLS estimation is that it does not need the normality assumption required in the ML estimation. We replicated the above steps to have 2000 sets of. For them, the MD and the existing algorithms were carried out, where the latter are the two gradient algorithms [9] [10], as described in Section 1. We refer to the ones in [9] and [10] as the Newton-Raphson (NR) and Gauss-Newton (GN) algorithms, respectively. In the NR one, we obtained the gradient vector in [9], Equation (32), by pre-multiplying the vector of first derivatives by the Moore-Penrose inverse of the corresponding Hessian matrix. Also in the NR and GN algorithms, we used the same initialization and definition of convergence as in Section 2.3.

Let us express the true simply as and use for the solution given by the NR, GN, or MD algorithm. For assessing the recovery of the loading matrix, the averaged absolute difference (AAD) of the elements in to the corresponding estimates, i.e.,

, (17)

can be used with denoting the norm. Here, it should be noted that has rotational freedom and must be rotated so that the resulting is optimally matched to. Such a rotated can be obtained by the orthogonal Procrustes method [22] with a target matrix. The loading matrix in (17) thus stands for the one rotated by the Procrustes method. The recovery of unique variances can also be assessed with the AAD index, where the unique variances are uniquely determined, thus the additional procedure as for is unnecessary. Smaller values of those AAD indices stand for better recovery.
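The rotation and the AAD index can be sketched in Python as below. The orthogonal Procrustes rotation is computed from the singular value decomposition of the cross-product of the estimated and true loading matrices, which is the standard least squares solution; the AAD is taken here as the mean absolute elementwise difference, which is an assumption about the exact form of (17). The function names are hypothetical.

```python
import numpy as np

def procrustes_rotate(Lam_hat, Lam_true):
    """Rotate Lam_hat by the orthogonal T minimizing ||Lam_hat T - Lam_true||_F."""
    U, _, Vt = np.linalg.svd(Lam_hat.T @ Lam_true)
    return Lam_hat @ (U @ Vt)               # T = U V'

def aad(estimate, truth):
    """Averaged absolute difference between corresponding elements."""
    return np.mean(np.abs(estimate - truth))

rng = np.random.default_rng(3)
Lam_true = rng.uniform(-1.0, 1.0, size=(8, 3))      # illustrative loadings
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))    # arbitrary orthogonal rotation
Lam_hat = Lam_true @ Q                              # same solution, rotated

aad_before = aad(Lam_hat, Lam_true)
aad_after = aad(procrustes_rotate(Lam_hat, Lam_true), Lam_true)
```

In this example the estimate differs from the true loadings only by a rotation, so the AAD after Procrustes matching is zero up to rounding error, whereas the unrotated estimate yields a positive AAD.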

The

Table 2. Statistics for the differences between the true parameter values and their estimated counterparts.


5. Discussion

We proposed the majorizing-diagonal (MD) algorithm for the GLS estimation in FA. In the algorithm, the loading matrix is reparameterized as the product of a column-orthonormal matrix and a diagonal one, and the former one is updated with Kiers and ten Berge’s [18]

One of the tasks remaining for the MD algorithm is to study its mathematical properties, as has been done for the algorithms in the other estimation procedures. For example, it has been found that the EM algorithm for the ML estimation [12] can never give an improper solution under a certain condition [23], where an improper solution refers to one including a negative unique variance. Whether the MD algorithm possesses such special features may be determined by studying the properties of the matrix update formulas in the algorithm.

Conflicts of Interest

The authors declare no conflicts of interest.

References

[1] Harman, H.H. (1976) Modern Factor Analysis. 3rd Edition, The University of Chicago Press, Chicago.

[2] Mulaik, S.A. (2010) Foundations of Factor Analysis. 2nd Edition, CRC Press, Boca Raton.

[3] Yanai, H. and Ichikawa, M. (2007) Factor Analysis. In: Rao, C.R. and Sinharay, S., Eds., Handbook of Statistics, Vol. 26: Psychometrics, Elsevier, Amsterdam, 257-296.

[4] Anderson, T.W. and Rubin, H. (1956) Statistical Inference in Factor Analysis. In: Neyman, J., Ed., Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability, Vol. 5, University of California Press, Berkeley, 111-150.

[5] Lange, K. (2010) Numerical Analysis for Statisticians. 2nd Edition, Springer, New York.

[6] ten Berge, J.M.F. (1993) Least Squares Optimization in Multivariate Analysis. DSWO Press, Leiden.

[7] Jöreskog, K.G. (1967) Some Contributions to Maximum Likelihood Factor Analysis. Psychometrika, 32, 443-482. http://dx.doi.org/10.1007/BF02289658

[8] Jennrich, R.I. and Robinson, S.M. (1969) A Newton-Raphson Algorithm for Maximum Likelihood Factor Analysis. Psychometrika, 34, 111-123. http://dx.doi.org/10.1007/BF02290176

[9] Jöreskog, K.G. and Goldberger, A.S. (1972) Factor Analysis by Generalized Least Squares. Psychometrika, 37, 243-250. http://dx.doi.org/10.1007/BF02306782

[10] Lee, S.Y. (1978) The Gauss-Newton Algorithm for the Weighted Least Squares Factor Analysis. Journal of the Royal Statistical Society: Series D (The Statistician), 27, 103-114. http://dx.doi.org/10.2307/2987906

[11] Harman, H.H. and Jones, W.H. (1966) Factor Analysis by Minimizing Residuals (Minres). Psychometrika, 31, 351-369. http://dx.doi.org/10.1007/BF02289468

[12] Rubin, D.B. and Thayer, D.T. (1982) EM Algorithms for ML Factor Analysis. Psychometrika, 47, 69-76. http://dx.doi.org/10.1007/BF02293851

[13] Dempster, A.P., Laird, N.M. and Rubin, D.B. (1977) Maximum Likelihood from Incomplete Data via the EM Algorithm. Journal of the Royal Statistical Society, Series B, 39, 1-38.

[14] Groenen, P.J.F. (1993) The Majorization Approach to Multidimensional Scaling: Some Problems and Extensions. DSWO Press, Leiden.

[15] Unkel, S. and Trendafilov, N.T. (2010) A Majorization Algorithm for Simultaneous Parameter Estimation in Robust Exploratory Factor Analysis. Computational Statistics and Data Analysis, 54, 3348-3358. http://dx.doi.org/10.1016/j.csda.2010.02.003

[16] Unkel, S. and Trendafilov, N.T. (2010) Simultaneous Parameter Estimation in Exploratory Factor Analysis: An Expository Review. International Statistical Review, 78, 363-382. http://dx.doi.org/10.1111/j.1751-5823.2010.00120.x

[17] Adachi, K. (2012) Some Contributions to Data-Fitting Factor Analysis with Empirical Comparisons to Covariance-Fitting Factor Analysis. Journal of the Japanese Society of Computational Statistics, 25, 25-38. http://dx.doi.org/10.5183/jjscs.1106001_197

[18] Kiers, H.A.L. and ten Berge, J.M.F. (1992) Minimization of a Class of Matrix Trace Functions by Means of Refined Majorization. Psychometrika, 57, 371-382. http://dx.doi.org/10.1007/BF02295425

[19] Kiers, H.A.L. (1990) Majorization as a Tool for Optimizing a Class of Matrix Functions. Psychometrika, 55, 417-428. http://dx.doi.org/10.1007/BF02294758

[20] Costa, P.T. and McCrae, R.R. (1992) NEO PI-R Professional Manual: Revised NEO Personality Inventory (NEO PI-R) and NEO Five-Factor Inventory (NEO-FFI). Psychological Assessment Resources, Odessa, FL.

[21] Kaiser, H.F. (1958) The Varimax Criterion for Analytic Rotation in Factor Analysis. Psychometrika, 23, 187-200. http://dx.doi.org/10.1007/BF02289233

[22] Gower, J.C. and Dijksterhuis, G.B. (2004) Procrustes Problems. Oxford University Press, Oxford. http://dx.doi.org/10.1093/acprof:oso/9780198510581.001.0001

[23] Adachi, K. (2013) Factor Analysis with EM Algorithm Never Gives Improper Solutions When Sample Covariance and Initial Parameter Matrices Are Proper. Psychometrika, 78, 380-394. http://dx.doi.org/10.1007/s11336-012-9299-8


Copyright © 2024 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.