

I am running a factor analysis on scaled survey responses. There are 3
course sections that I am combining for the analysis; the determinant of the
correlation matrix is 0. I checked the correlations, none of them are > 0.9.
The data are highly significant, satisfy the KMO criterion and Bartlett's test,
and look well suited to factor analysis. However, the determinant is 0 and that's
really unusual, especially because if I don't combine the sections and run
them individually, the determinant is not 0. It's only when I combine the 3
sections that I end up with a determinant of 0.
I have more cases than variables; I really don't see anything that
perplexing about the data itself, but the 0 determinant is cause for concern.
Any reasons / pointers / advice on why this might be happening and what can
be done to resolve it would be much appreciated.
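[Editor's note: for intuition, a minimal sketch in Python with numpy — outside the SPSS workflow discussed in this thread — showing how a determinant of exactly 0 arises when one variable is an exact linear combination of the others, even though no single pairwise correlation comes anywhere near 0.9.]

```python
import numpy as np

rng = np.random.default_rng(0)
n = 300                                # more cases than variables, as in the post
base = rng.normal(size=(n, 5))         # five independent items
x6 = base.sum(axis=1, keepdims=True)   # a sixth item: the exact sum of the five
data = np.hstack([base, x6])

R = np.corrcoef(data, rowvar=False)
off_diag = np.abs(R - np.eye(6)).max() # largest |r|; stays well below 0.9
det = np.linalg.det(R)                 # numerically 0: R is singular
print(round(off_diag, 3), round(det, 10))
```

Each pairwise r with the sum is only about 1/sqrt(5) ≈ 0.45, yet the matrix is singular — which is why inspecting pairwise correlations alone cannot rule out a zero determinant.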
=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD


It could be many things.
One possibility is that one or more items has a squared multiple
correlation of 1.00 with some of the other items.
Treat all your items as items in a single scale in RELIABILITY.
Double-check the summary statistics for the correlations (max, min).
Look at the squared multiple correlation of each item with the
other items.
What do you see?
What is the meaning of the sections?
Are sets of items written to measure specific constructs? Or are
you just exploring the data?
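[Editor's note: the SMC check Art describes can also be sketched outside SPSS (Python/numpy, purely for illustration). When the correlation matrix R is invertible, the squared multiple correlation of item i on the rest is 1 − 1/(R⁻¹)ᵢᵢ, so the whole set can be read off the inverse at once.]

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(500, 6))
# Make item 6 partly redundant with items 1 and 2:
x[:, 5] = 0.6 * x[:, 0] + 0.6 * x[:, 1] + rng.normal(scale=0.4, size=500)

R = np.corrcoef(x, rowvar=False)
smc = 1 - 1 / np.diag(np.linalg.inv(R))  # SMC of each item on the other five
print(np.round(smc, 3))                  # the redundant item stands out
```

Values approaching 1.00 flag the items involved in a near-dependency; at exactly 1.00 the inverse no longer exists and R is singular.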
Art Kendall
Social Research Consultants


The data are essentially scores obtained with a semantic differential instrument (scores from 1 to 7), and the sections represent groups of students (cases). Each student has a score on each of 20 items (variables). The raw data are transferred to SPSS, and all I'm trying to do is explore the number of factors emerging from the 20 variables. From the scree plot and the eigenvalue > 1 criterion, 3 factors are extracted. However, that still does not explain why the determinant of the correlation matrix is 0, and I'm not sure whether the data become inadmissible on those grounds, despite satisfying the minimum criteria for factor analysis.
Another strange aspect is that the residual matrix does not have values close to 0 at all. In fact, some of the values are > 0.1 and many are negative. So has the factor extraction been inefficient, or should some of the variables be removed?
The reliability statistics summary:
Summary Item Statistics
                          Mean    Min.    Max.   Range   Max/Min   Variance   N of Items
Item Means               3.630   2.390   5.949   3.559     2.489       .813           20
Item Variances           1.743   1.240   2.338   1.098     1.886       .091           20
Inter-Item Correlations   .015   -.567    .615   1.182    -1.085       .115           20
The highest squared multiple correlation is .624. None of the items has an SMC of 1.00 with the other items.
Thanks for your help!


How are missing data treated? RELIABILITY requires LISTWISE deletion, which is the *DEFAULT* for FACTOR.
If you are using PAIRWISE deletion, the R matrix can be singular.
One thing to try:
Create a junk variable and use it as the DEPendent variable in REGRESSION.
Look at the VIF and Tolerance statistics by requesting TOL on the STATISTICS subcommand.
COMPUTE junk=NORMAL(1).
REGRESSION /STATISTICS TOL /DEPENDENT junk /METHOD=ENTER x1 TO x20.
HTH, David
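[Editor's note: David's pairwise-deletion point can be seen with a tiny numeric illustration (Python/numpy, not SPSS). Under pairwise deletion each correlation is computed on a different subset of cases, and the assembled values need not fit together into a true correlation matrix — the result can be singular or even indefinite.]

```python
import numpy as np

# Each pairwise correlation below is legal on its own, but no single
# dataset can produce all three at once. A matrix assembled this way
# (as pairwise deletion can do) is not positive semi-definite.
R = np.array([[1.0,  0.9,  0.9],
              [0.9,  1.0, -0.9],
              [0.9, -0.9,  1.0]])
det = np.linalg.det(R)
eigs = np.linalg.eigvalsh(R)
print(round(det, 3), np.round(eigs, 3))  # negative determinant, negative eigenvalue
```

A legitimate correlation matrix computed on one common set of cases can be singular but never indefinite; seeing a negative eigenvalue is a strong hint that pairwise deletion was in play.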

Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.

"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"


In addition to what David suggested, I would look at the negative correlations you have. In the bit of RELIABILITY output you provide below, it appears that your minimum inter-item correlation is -.567; you should not have any negative correlations (unless they are close to zero). First, how many negative correlations do you have? Second, why do you have negative correlations? If you eliminate the variables with negative correlations, does that improve things?
Mike Palij
New York University
[hidden email]


What are the items?
Also, the Kaiser criterion only tells the software not to do the
work of extracting further dimensions, because any additional factor
would account for no more variance than an average single variable.
I suggest you ballpark the number of factors using parallel
analysis.
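[Editor's note: parallel analysis can be sketched as follows — a hypothetical Python/numpy implementation, not part of SPSS FACTOR. Retain the factors whose observed eigenvalues exceed the average eigenvalues obtained from random-normal data of the same size.]

```python
import numpy as np

def parallel_analysis(data, n_sims=100, seed=0):
    """Horn's parallel analysis: compare observed eigenvalues of the
    correlation matrix with mean eigenvalues from random-normal data
    of the same n and k; retain factors that beat the random benchmark."""
    rng = np.random.default_rng(seed)
    n, k = data.shape
    obs = np.sort(np.linalg.eigvalsh(np.corrcoef(data, rowvar=False)))[::-1]
    ref = np.zeros(k)
    for _ in range(n_sims):
        sim = np.corrcoef(rng.normal(size=(n, k)), rowvar=False)
        ref += np.sort(np.linalg.eigvalsh(sim))[::-1]
    ref /= n_sims
    return int((obs > ref).sum()), obs, ref

# Demo: 12 items driven by 2 latent factors plus noise.
rng = np.random.default_rng(1)
scores = rng.normal(size=(300, 2))
loadings = rng.normal(size=(2, 12))
items = scores @ loadings + 0.5 * rng.normal(size=(300, 12))
n_keep, obs, ref = parallel_analysis(items)
print(n_keep)
```

The random benchmark replaces the fixed eigenvalue-greater-than-1 cutoff, which tends to over-extract with many items and modest samples.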
Also, are you using listwise deletion in both the factor analysis
and the reliability analysis? If you are using pairwise deletion,
that can cause the matrix to be singular.
If you did use listwise deletion, are you getting a message that
says the determinant is zero, or is it zero only to the number of
digits displayed in the output?
If you are familiar with regression diagnostics, some relevant
techniques have been posted on this list within the last year or
so.
Although the direction of scoring is not a problem when obtaining
the SMCs, or the magnitudes of the inter-item correlations, be sure
that you reflect items appropriately when you do the reliability
analysis on your final scale(s).
Is this a homework assignment? If not, and the list does not end
up helping, send me a system file with respondent IDs removed and
I'll take a look at it. If you send a file, create random IDs for
your cases, and keep a copy for yourself that maps your original
IDs to the random IDs.
Art Kendall
Social Research Consultants


"If you eliminate the variables with negative correlations, does that
improve things?"
Rather than remove them simply reverse the scoring.
OP: It would be useful to have access to the correlation matrix if you wish to receive much more than speculative blind stabbing at the problem.

CORRELATIONS VARIABLES=ALL /MATRIX=OUT(*).
SAVE OUTFILE='Corr.sav'.
Then attach the saved file to your posting!
* Substitute your 20 variables for ALL in the CORRELATIONS command.


I will recheck the entire analysis for the parameters mentioned in your response and if I'm still not getting anywhere with it, I will send you the file.
Thanks for the prompt response!


I will redo the analysis with elimination of negatively correlated variables and see how that turns out. If I'm still not getting anywhere, I will send the correlation matrix.
Thanks for the prompt response!


I will try the analysis with negatively correlated variables removed and see how that turns out. If I'm not getting anywhere, I will send the file and post the associated correlation matrix on the forum.
Thanks much!


"Look at the squared multiple correlations of each item with the
other items."
That might not always be terribly revealing if, say, some variable is a near-simple sum of the remaining ones. In such a case ALL of the SMCs will be very high in any regression of one variable on the other K-1. They will all be 1 if the relationship is a perfect linear combination.
See sample below.

new file.
* Generate 1,000 cases on 10 independent standard-normal variables.
input program.
loop id=1 to 1000.
do repeat x=x1 to x10.
compute x=normal(1).
end repeat.
end case.
end loop.
end file.
end input program.
* Macro: regress each variable on the remaining K-1 in turn;
* the R-square from each run is that variable's SMC.
DEFINE SMC (VARS = !CMDEND)
!LET !CPY=!VARS
!DO !I !IN (!VARS)
REGRESSION /STATISTICS R TOL /DEPENDENT !HEAD(!CPY) /METHOD=ENTER !TAIL(!CPY).
!LET !CPY=!CONCAT(!TAIL(!CPY)," ",!HEAD(!CPY))
!DOEND
!ENDDEFINE.
* "problem" is a near-perfect linear combination of x1 to x10.
compute problem=MEAN(x1 TO x10)*.999 + normal(.01).
COMPUTE gbg=NORMAL(1).
REGRESSION /STATISTICS TOL /DEPENDENT gbg /METHOD=ENTER x1 TO x10 problem.
SMC VARS = x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 problem.
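[Editor's note: the same demonstration outside SPSS (Python/numpy, for illustration). With "problem" a near-perfect combination of the ten items, every SMC is driven toward 1 — not just problem's — so no single item is obviously "the" culprit, which is exactly David's point.]

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(size=(1000, 10))
# Near-perfect linear combination of the ten items, as in the syntax above.
problem = x.mean(axis=1) * 0.999 + rng.normal(scale=0.01, size=1000)
data = np.column_stack([x, problem])

# SMC of each variable on all the others, via least-squares regression.
z = (data - data.mean(axis=0)) / data.std(axis=0)
smcs = []
for i in range(z.shape[1]):
    y = z[:, i]
    rest = np.delete(z, i, axis=1)
    beta, *_ = np.linalg.lstsq(rest, y, rcond=None)
    resid = y - rest @ beta
    smcs.append(1 - resid @ resid / (y @ y))
print(np.round(smcs, 3))  # all eleven SMCs are near 1
```

Because the dependency implicates every item, the practical remedy is to identify and drop (or reconstruct) the composite variable rather than hunt for one high pairwise correlation.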


Throwing out items with strong negative
inter-item correlations means losing valuable information. That would
not be a productive way to go.
Think of a typical attitude scale. Items are balanced so that for
some items strong agreement indicates one end of the dimension,
while for others strong agreement indicates the opposite end of the
dimension. The customary way of handling this is to "reflect" the
second set of items before summing.
"Chocolate is one of the best flavors there is."   SD  D  N  A  SA
"The flavor of chocolate is disgusting."           SD  D  N  A  SA
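[Editor's note: reflecting can be sketched like this (Python/numpy, illustrative; `reflect` is a made-up helper, and the 1-7 range matches the semantic-differential items in this thread).]

```python
import numpy as np

def reflect(item, lo=1, hi=7):
    """Reverse-score an item on a lo..hi scale: lo maps to hi and vice versa."""
    return lo + hi - item

best = np.array([7, 6, 2, 5])        # "chocolate is one of the best flavors"
disgusting = np.array([1, 2, 6, 3])  # "the flavor of chocolate is disgusting"
r_raw = np.corrcoef(best, disgusting)[0, 1]                  # strongly negative
r_reflected = np.corrcoef(best, reflect(disgusting))[0, 1]   # strongly positive
print(round(r_raw, 2), round(r_reflected, 2))
```

After reflecting, the two items correlate positively and can be summed into one scale — which is why reversing the scoring beats deleting the items.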
How did you handle missing data in the two runs?
IFF the squared multiple correlations from RELIABILITY are based
on the same cases as the factor analysis, then something is going
on that will need more detail to diagnose.
- Did you retype the variable list, or did you cut-and-paste it?
- Are you sure that the variable list includes each variable only
once?
- Is the meaning of your items such that answering a couple (or a
few) of the questions one way makes another question very
predictable?
- If you send the correlation matrix David asked about, be sure to
create a DV for the regression:
compute noise = rv.normal(0,1).
or some other purely random variable
Art Kendall
Social Research Consultants

