Greetings,
(I posted this on the newsgroup, but was advised that the list has better traffic these days, so I'm taking the liberty of reposting my question here.)

I recently had need to get some factor analysis results (loadings and eigenvalues) to match between SPSS and Stata. A particular estimation process I'm interested in stipulates that a factor analysis should be used for part of the process, and that SPSS's principal axes extraction (PAF or the old PA2) should be used.

I'm trying to reproduce this analysis in Stata (as well as in SPSS), and I'm now a bit wary about what both programs are doing. I know that terminology can be a bit loose here, which may account for some of the issues.

Here's what I found that puzzles me:

1. From the algorithm descriptions, SPSS's principal axes and Stata's iterated principal factors extractions both seem to iteratively re-estimate the communalities. However, I can't get them to produce the same eigenvalues. Close, but consistently different.

2. SPSS gives the same eigenvalues regardless of what extraction I use:

* For example, Principal Axes vs. Principal Components.
GET
  FILE='C:\Program Files\IBM\SPSS\Statistics\19\Samples\English\car_sales.sav'.

FACTOR
  /VARIABLES price engine_s horsepow wheelbas width length curb_wgt
  /MISSING LISTWISE
  /ANALYSIS price engine_s horsepow wheelbas width length curb_wgt
  /PRINT INITIAL EXTRACTION
  /CRITERIA MINEIGEN(1) ITERATE(25)
  /EXTRACTION PAF
  /ROTATION NOROTATE .
*.
FACTOR
  /VARIABLES price engine_s horsepow wheelbas width length curb_wgt
  /MISSING LISTWISE
  /ANALYSIS price engine_s horsepow wheelbas width length curb_wgt
  /PRINT INITIAL EXTRACTION
  /CRITERIA MINEIGEN(1) ITERATE(25)
  /EXTRACTION PC
  /ROTATION NOROTATE
  /METHOD=CORRELATION.

3. Stata's "principal component factor" approach (communalities fixed at 1.0) gives the same results as SPSS, which would suggest that SPSS is *not* doing principal axes, which supposedly iteratively re-estimates communalities.

Can anyone offer clarification on what's going on here, in particular what is happening in SPSS such that changing the method of extraction does not affect the eigenvalues? (I don't expect anyone here to have any particular interest in or knowledge about Stata, but I thought some FA knowledgeable folks might nevertheless be able to shed light on what's happening.)

Regards,

Mike Lacy
Dept. of Sociology
Colorado State Univ.
Fort Collins CO

Consider the following points:
(1) I think there may be some confusion about what Stata and SPSS are doing. Let me suggest that you go to the UCLA Stat Computing center site and look at the SPSS and Stata factor analysis write-ups, which seem to perform the same analysis (principal axis factor analysis) on the same dataset (13 items from a survey conducted by John Sidanius; on the SPSS page there's a link to the SPSS data file being used).

For the SPSS Factor Analysis, see:
http://www.ats.ucla.edu/stat/spss/output/factor1.htm
For the Stata Factor Analysis, see:
http://www.ats.ucla.edu/stat/stata/output/fa_output.htm

Note that though both analyses extract the same number of factors, the eigenvalues appear to be different.

(3) In your point #2 you say: "SPSS gives the same eigenvalues regardless of what extraction I use". You leave out a key word; you should say "same INITIAL eigenvalues". Again, if you go to the UCLA stat center and look at their SPSS output for principal component analysis (which uses the same dataset referred to above), you will see that the initial eigenvalues are the same as those for SPSS PFA; see:
http://www.ats.ucla.edu/stat/SPSS/output/principal_components.htm

The "Extraction Sums of Squared Loadings" are different for SPSS PCA vs. PFA, but the extracted "Total" values (the new eigenvalues) for SPSS PFA are the same as those produced in the Stata output. Look at column 4 of the "Total Variance Explained" table in the SPSS PFA output and compare it to column 2 in the Stata output -- the first three values are the same. Stata provides all of the eigenvalues after iteration, while SPSS provides the eigenvalues only for the number of factors extracted.

(4) I'd wager that the initial eigenvalues are the same because both analyses start off using 1.00 on the diagonal. For principal factor analysis, the diagonal is then changed iteratively, as described on page 322 of the version 18 SPSS Algorithms manual.

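For readers following the thread, the iteration referred to in point (4) can be written out in generic textbook form. This is a sketch under stated assumptions, not a transcription of the SPSS Algorithms manual: the symbols $R$ (the $p \times p$ correlation matrix), $h_i^2$ (communality of variable $i$), $m$ (number of retained factors), $\lambda_j$ and $v_j$ (eigenvalues and unit eigenvectors) are introduced here for illustration, and the starting communalities and stopping rule differ between programs.

```latex
\begin{align*}
R^{(k)} &= R - I + \operatorname{diag}\bigl(h_1^{2\,(k)}, \dots, h_p^{2\,(k)}\bigr)
  && \text{(communalities replace the unit diagonal)} \\
R^{(k)} &= \sum_{j=1}^{p} \lambda_j^{(k)}\, v_j^{(k)} v_j^{(k)\top}
  && \text{(eigendecomposition of the reduced matrix)} \\
a_{ij}^{(k)} &= \sqrt{\lambda_j^{(k)}}\; v_{ij}^{(k)}, \qquad j = 1, \dots, m
  && \text{(loadings on the retained factors)} \\
h_i^{2\,(k+1)} &= \sum_{j=1}^{m} \bigl(a_{ij}^{(k)}\bigr)^2
  && \text{(updated communalities)}
\end{align*}
```

Because each $v_j$ has unit length, $\sum_i a_{ij}^2 = \lambda_j$, so the sum of squared loadings for a retained factor is the corresponding eigenvalue of the reduced matrix at the last iteration. Setting every $h_i^2 = 1$ and skipping the iteration gives the eigenvalues of $R$ itself.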
I think that the eigenvalues you're looking for are in the TOTAL column under the heading "Extraction Sums of Squared Loadings". It's been some time since I've gone through these types of comparisons and I'll leave to the more knowledgeable folks to point out where I am wrong.

-Mike Palij
New York University
[hidden email]

----- Original Message -----
From: "mglacy" <[hidden email]>
To: <[hidden email]>
Sent: Friday, July 15, 2011 6:53 PM
Subject: Factor analysis extraction methods

> View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Factor-analysis-extraction-methods-tp4592615p4592615.html
> Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

<quote author="Mike Palij">
>(3) In your point #2 you say: "SPSS gives the same
>eigenvalues regardless of what extraction I use". You leave out
>a key word, you should say "same INITIAL eigenvalues".
</quote>

This seems exactly correct. I was simply confused because FACTOR reports the initial eigenvalues, before whatever iterative extraction (PAF, ML, etc.) happens. What some other programs report as the eigenvalues (after extraction) are what SPSS labels as "Extraction Sums of Squared Loadings." So, for example, the appropriate option in Stata will produce eigenvalues that exactly match up with FACTOR using PAF.

Mystery is solved.

Thanks,

Mike Lacy
Dept. of Sociology
Colorado State University
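
The distinction that resolved the thread (initial eigenvalues versus sums of squared loadings after extraction) can be illustrated with a small, self-contained Python sketch. Everything here is an assumption made for illustration: the function names (`jacobi_eigh`, `initial_eigenvalues`, `principal_axis`) are hypothetical, the correlation matrix is made up from an exact one-factor model rather than taken from car_sales.sav, and the starting communalities are the largest absolute off-diagonal correlation in each row rather than the squared multiple correlations a real package would likely use. It is not the SPSS or Stata implementation.

```python
import math


def jacobi_eigh(matrix, sweeps=100, tol=1e-22):
    """Eigenvalues (descending) and unit eigenvectors of a symmetric matrix,
    computed with cyclic Jacobi rotations."""
    n = len(matrix)
    a = [row[:] for row in matrix]
    v = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                diff = a[q][q] - a[p][p]
                theta = math.pi / 4 if diff == 0 else 0.5 * math.atan(2 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
                for k in range(n):  # accumulate eigenvectors as columns of v
                    vkp, vkq = v[k][p], v[k][q]
                    v[k][p], v[k][q] = c * vkp - s * vkq, s * vkp + c * vkq
    order = sorted(range(n), key=lambda i: -a[i][i])
    return [a[i][i] for i in order], [[v[k][i] for k in range(n)] for i in order]


def initial_eigenvalues(r):
    """Eigenvalues of the correlation matrix with 1.0 on the diagonal.
    These do not depend on the extraction method."""
    return jacobi_eigh(r)[0]


def principal_axis(r, n_factors, max_iter=25, tol=1e-3):
    """Iterated principal axis factoring (a sketch).
    Returns (sums of squared loadings per factor, final communalities)."""
    n = len(r)
    # Assumption: start from the largest absolute off-diagonal correlation per row.
    h2 = [max(abs(r[i][j]) for j in range(n) if j != i) for i in range(n)]
    for _ in range(max_iter):
        # Put the current communalities on the diagonal of the correlation matrix.
        reduced = [[h2[i] if i == j else r[i][j] for j in range(n)] for i in range(n)]
        vals, vecs = jacobi_eigh(reduced)
        loadings = [[vecs[f][i] * math.sqrt(max(vals[f], 0.0)) for f in range(n_factors)]
                    for i in range(n)]
        new_h2 = [sum(x * x for x in row) for row in loadings]
        converged = max(abs(x - y) for x, y in zip(new_h2, h2)) < tol
        h2 = new_h2
        if converged:
            break
    ssl = [sum(loadings[i][f] ** 2 for i in range(n)) for f in range(n_factors)]
    return ssl, h2


# Made-up example: correlations implied by one factor with these loadings.
TRUE_LOADINGS = [0.8, 0.7, 0.6, 0.5]
R = [[1.0 if i == j else TRUE_LOADINGS[i] * TRUE_LOADINGS[j] for j in range(4)]
     for i in range(4)]

if __name__ == "__main__":
    print("Initial eigenvalues:", initial_eigenvalues(R))
    ssl, h2 = principal_axis(R, n_factors=1)
    print("Extraction sum of squared loadings (PAF):", ssl)
    print("Final communalities:", h2)
```

The initial eigenvalues come from the matrix with ones on the diagonal, so they are the same whatever extraction is requested, while the principal axis value is smaller because the diagonal has been replaced by communalities below 1. For this made-up matrix the principal axis value should settle near the sum of the squared true loadings as the iteration limit is raised.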