...and if so how do you go about it?
I'm trying to look for clusters within supermarket shopping behaviour, based around the categories that people buy, for example Crisps and Frozen Chips, or Organic Eggs and Organic Veg.

My data is arranged so that each customer is an individual case, and I have 1000-ish dichotomous variables for the categories they bought.

I've run an initial factoring and it comes out with some nice factor groups, but there are c. 150 factors and I need to reduce these further. Also, while some of my factor groups "feel nice", less than 50% of the variation is explained. I had thought the sheer number of variables might make it hard to get a high value. Should this tell me to look for another approach altogether, or can I carry on and re-factor the factors? At this point I'm a little unsure how to do that.

Do I create dichotomous variables using the rotated factor matrix? I.e. if factor 1 was categories 2, 56 and 102, then I compute a new dichotomous variable for customers who had bought one of these categories (repeating this for all factors), and then run something on these new "factored" variables. My issue with

Or do I select the option to create factor variables and use these variables to re-factor?

Thanks,

Stuart
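The first option in the question (one new yes/no variable per factor, set when a customer bought any of that factor's categories) might look roughly like the sketch below. The purchase matrix is random made-up data, and the name `factor_categories` is a hypothetical mapping read off the rotated factor matrix; only the category numbers 2, 56 and 102 come from the post.

```python
import numpy as np

rng = np.random.default_rng(1)

# Made-up purchase matrix: 8 customers x 120 categories, 1 = bought.
bought = rng.integers(0, 2, size=(8, 120))

# Hypothetical mapping read off the rotated factor matrix.
# Category numbers are 1-based, as in the post.
factor_categories = {"factor_1": [2, 56, 102]}

# One dichotomous variable per factor: 1 if the customer bought
# at least one of the categories assigned to that factor.
flags = {
    name: bought[:, np.array(cats) - 1].any(axis=1).astype(int)
    for name, cats in factor_categories.items()
}

print(flags["factor_1"])
```

A flag like this discards how many of the grouped categories each customer bought, which is worth bearing in mind before analysing the derived variables.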
Stuart Kirkup wrote:
[snip]
>
> Or do I select the option to create factor variables and use these
> variables to re-factor?
>

Yes. This is called a second-order factor analysis. You occasionally see people repeat the process and do a third-order factor analysis.

Of course, you can only do this if you have used an oblique rotation. If you use an orthogonal rotation (e.g. varimax), the factors are uncorrelated and you cannot refactor them.

jeremy

--
Jeremy Miles
mailto:[hidden email]
http://www-users.york.ac.uk/~jnvm1/
Dept of Health Sciences (Area 4), University of York, York, YO10 5DD
Phone: 01904 321375 Mobile: 07941 228018 Fax 01904 321320

NOTE: New address from September 2006: RAND Corporation, 1776 Main St, Santa Monica, CA, USA. (New email and stuff too, but I don't know it yet).
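To illustrate the point about rotations: a second-order analysis works on the correlations among the first-order factors. The sketch below compares the eigenvalues of a made-up oblique factor correlation matrix with those of the identity matrix that an orthogonal solution implies. The loadings in `gamma` are assumed values chosen for illustration, not results from any real data.

```python
import numpy as np

# Assumed loadings of 4 first-order factors on one second-order factor.
gamma = np.array([0.7, 0.6, 0.5, 0.4])

# Factor correlation matrix such an oblique solution would imply.
phi_oblique = np.outer(gamma, gamma)
np.fill_diagonal(phi_oblique, 1.0)

# Orthogonal rotation (e.g. varimax): factors are uncorrelated by construction.
phi_orthogonal = np.eye(4)

# Eigenvalues, largest first.
eig_oblique = np.linalg.eigvalsh(phi_oblique)[::-1]
eig_orthogonal = np.linalg.eigvalsh(phi_orthogonal)[::-1]

print("oblique:   ", np.round(eig_oblique, 2))
print("orthogonal:", np.round(eig_orthogonal, 2))
```

With the identity matrix every eigenvalue equals one, so there is no shared variance for a higher-order factor to pick up. The oblique matrix has one eigenvalue well above one, which is what a second-order factor would summarise.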
In reply to this post by Stuart Kirkup
Keith Starborn
www.statisticsdoc.com

Stuart,

You can re-factor a factor analysis by saving the factor scores and then carrying out a factor analysis on the saved scores. This can be done if you used an oblique rotation (i.e., had correlated factors).

However, it sounds like the results of the initial factor analysis might benefit from some additional analysis. You should probably decide on the number of factors using the scree criterion, not the number of eigenvalues greater than or equal to one. With 1000+ items, the eigenvalue >= 1 criterion will give you a massive number of factors.

You might want to consider summing related dichotomous items into something more like a continuous variable. Are your variables from a checklist, or do you have any information about the quantity of items shoppers purchased?

HTH,

KS
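A rough sketch of the two suggestions in this reply, run on simulated data rather than anything from the thread. It compares the eigenvalue >= 1 count with a crude numeric stand-in for reading a scree plot (the largest drop between consecutive eigenvalues, which is an assumption of this sketch and not a standard rule), and then sums related yes/no items into counts. All sizes, thresholds and variable names are made up.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated stand-in for the real data: 2000 customers, 30 yes/no categories
# driven by 3 underlying shopping habits (all numbers here are made up).
n_customers, n_groups, items_per_group = 2000, 3, 10
habit = rng.normal(size=(n_customers, n_groups))
noise = rng.normal(size=(n_customers, n_groups * items_per_group))
propensity = np.repeat(habit, items_per_group, axis=1) + 1.5 * noise
bought = (propensity > 1.0).astype(int)

# Eigenvalues of the item correlation matrix, largest first.
eigvals = np.linalg.eigvalsh(np.corrcoef(bought, rowvar=False))[::-1]

# Eigenvalue >= 1 rule: counts every eigenvalue at or above 1.
kaiser_count = int((eigvals >= 1.0).sum())

# Crude stand-in for eyeballing a scree plot: keep the factors that
# come before the largest drop between consecutive eigenvalues.
scree_count = int(np.argmax(-np.diff(eigvals))) + 1

# Sum related yes/no items into one count per customer and group (0 to 10).
group_counts = bought.reshape(n_customers, n_groups, items_per_group).sum(axis=2)

print(kaiser_count, scree_count, group_counts.shape)
```

Here the item groups are known because the data were simulated that way. With real data the grouping would have to come from the factor loadings or from subject knowledge, and the scree plot itself is still worth inspecting by eye.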