

1. Given a set of variables, how do I create a tetrachoric correlation matrix in SPSS version 16?
2. How do I run Principal Axis Factoring with Varimax rotation in SPSS 16 using a correlation matrix as input data?
Thanks.
Eins

Eins,
I can't comment on how to compute a tetrachoric matrix in SPSS. I don't know
the formula and would have to look it up. Possibly the computation could be
done using the MATRIX-END MATRIX command set (look this up in the syntax
reference). I'd bet there are several people on the list who know exactly
how to do it. I don't.
As far as reading in a matrix goes, that's fairly easy. Look at the MATRIX
DATA command, also in the syntax reference. Note that, as far as I know, the
matrix has to be written so that it is as many columns wide as the number of
variables. That might seem obvious, but the point is that some programs output
(or can output) a matrix in different arrangements on the page (and by this,
I do not mean lower triangular vs. rectangular). The very fact that you are
asking about reading a matrix in suggests that you are computing that matrix in
another program.
I think the PAF part is kind of trivial. Again, look at the syntax ref. The
key part is getting the matrix file in and selecting the extraction method
(choose the PAF option). A varimax rotation is the default.
Repost to the list if you have more questions.
Gene Maguin
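A minimal sketch of the matrix-input route described above, assuming six hypothetical variables (item1 to item6), a made-up sample size of 200, and made-up correlations. A tetrachoric matrix computed elsewhere would be pasted in place of the CORR rows. The lower-triangular layout with the diagonal is the MATRIX DATA default.

```spss
* Hypothetical variable names and made-up values, for illustration only.
MATRIX DATA VARIABLES=ROWTYPE_ item1 item2 item3 item4 item5 item6.
BEGIN DATA
N 200 200 200 200 200 200
CORR 1.00
CORR .60 1.00
CORR .60 .60 1.00
CORR .20 .20 .20 1.00
CORR .20 .20 .20 .60 1.00
CORR .20 .20 .20 .60 .60 1.00
END DATA.

* Read the active matrix file, extract by principal axis factoring, rotate.
FACTOR
  /MATRIX=IN(COR=*)
  /PRINT=INITIAL EXTRACTION ROTATION
  /EXTRACTION=PAF
  /ROTATION=VARIMAX.
```

VARIABLES is left off the FACTOR command because the variable names come from the matrix file. ROTATION is spelled out here so the sketch does not depend on the default.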
All,
I'm working with someone whose DV is a proportion. Specifically, a count of
tasks completed by a worker divided by total tasks undertaken in a unit time.
There are about 10 workers per unit, three units per condition, and two
conditions. Disregarding the DV type issue, I'm regarding this as a nested
design, units within condition. I really never work with proportions and
have hardly any experience with them. I'm thinking that one problem with
proportions is that the standard deviation of a set of proportions depends
on the mean proportion, because the standard deviation of a proportion based
on n trials is sqrt(p*q/n). To fix up this problem, one solution has been to
transform the raw proportions. So I'd like to hear advice on two lines of questions.
1) Is there a newer and more preferred way to analyze proportions within a
GLM framework than transformations?
2) What are the recommended types of transformations to use with
proportions?
Any excellent refs are appreciated.
Thanks, Gene Maguin
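For reference, the two classical transformations of a proportion can be sketched as below. This assumes the counts are stored in variables named tasksdone and totaltasks (the names used later in the thread). The new variable names p, p_asin, and p_logit are hypothetical.

```spss
* Assumes tasksdone and totaltasks hold each worker's counts.
COMPUTE p = tasksdone / totaltasks.
* Arcsine square-root (angular) transformation, result in radians.
COMPUTE p_asin = ARSIN(SQRT(p)).
* Logit transformation, left missing when p is exactly 0 or 1.
IF (p > 0 AND p < 1) p_logit = LN(p / (1 - p)).
EXECUTE.
```

The arcsine square-root transform is the usual variance-stabilizing choice for binomial proportions: its variance is roughly 1/(4n) and so no longer depends on p. It still depends on the number of trials n, which matters when workers attempt different numbers of tasks.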
Tetrachoric correlation is just polychoric correlation with dichotomous variables, I think. There is an SPSS Statistics 17 extension command, SPSSINC HETCOR, that computes polychoric correlations. You can download it from SPSS Developer Central (www.spss.com/devcentral). Besides Version 17, it requires R 2.7.0. Full requirements are in the readme file.
HTH,
Jon Peck
There is also a macro, r_tetra, available for computing polychoric
(tetrachoric) correlations.
http://www2.jura.unihamburg.de/instkrim/kriminologie/Mitarbeiter/Enzmann/Software/Enzmann_Software.html
Justin
Assuming that you have both the count of tasks completed and the total tasks undertaken available, have a look at using Generalized Linear Models (GENLIN) with a binomial distribution and the response specified as the number of events occurring in a set of trials.
Alex
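A minimal sketch of the events/trials setup suggested above, assuming one row per worker and the variable names used later in the thread (tasksdone, totaltasks). The data rows are made up for illustration, and only the condition effect is modelled, to keep the example short.

```spss
* Made-up rows for illustration only: one row per worker.
DATA LIST FREE / id condition unit tasksdone totaltasks.
BEGIN DATA
101 1 1 15 30
102 1 1 22 30
103 1 2 18 25
104 1 2 12 25
201 2 3 20 30
202 2 3 27 30
203 2 4 9 20
204 2 4 14 20
END DATA.

* Events/trials form: tasksdone successes out of totaltasks trials.
GENLIN tasksdone OF totaltasks BY condition
  /MODEL condition DISTRIBUTION=BINOMIAL LINK=LOGIT
  /PRINT FIT SUMMARY SOLUTION.
```

Because the response is given as events out of trials, the binomial variance is handled by the model itself and no transformation of the proportion is needed.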
Sent: Monday, October 13, 2008 11:58 AM
To: [hidden email]
Subject: Working proportions and GLM or equivalent
Given the nested design, I think you are looking at a general nonlinear
mixed model. This can be done in HLM, MLWin, SAS, and a few others, but
not as far as I know in SPSS.
Paul R. Swank, Ph.D
Professor and Director of Research
Children's Learning Institute
University of Texas Health Science Center
Houston, TX 77038
Pablo, thank you for your reply. I have Raudenbush's HLM book, 1st edition,
and I'll check that. Or are you thinking specifically of the second
edition? However, we won't have the HLM program. Mplus, yes, but not HLM.
Paul, thanks also. To be a bit lazy, how would this analysis be classified in
Mplus? Right now, it seems to be trials within worker within unit within
condition. But I don't think that is right.
Alex, I'd like to follow up on your reply since you've replied in the context
of SPSS. First, I'd like to make sure I understand the data setup and
command setup.
Would the (minimal) command setup be this?
GENLIN TasksDone OF TotalTasks BY unit condition
  /MODEL unit(condition) DISTRIBUTION=BINOMIAL LINK=LOGIT.
OK, data setup. Would it look like this?
Id   condition  unit  tasksdone  totaltasks
101  1          1     15         30
Thanks, Gene Maguin
I thought I said HLM, MLWin, or SAS. Mplus will handle one level of
nesting but I'm not sure about the proportional data. I was thinking
about nested logistic regression with a number of events/number of
opportunities. I am pretty sure HLM 2 will do this, but not HLM 1. I
have all those programs, but if I were going to do it, I would use SAS
PROC GLIMMIX or PROC NLMIXED.
Paul R. Swank, Ph.D
I am quite intrigued that SPSS can require R. Does that mean I would have to install R and SPSS on the same computer, and then write the syntax in the SPSS syntax editor?
Eins
You can install the optional R Plug-in from SPSS Developer Central, which requires the appropriate version of R as well (2.5.0 for SPSS 16 and 2.7.0 for version 17). Then you can write R programs in SPSS syntax between BEGIN PROGRAM R. and END PROGRAM. These can fetch the active SPSS dataset or portions thereof, run R code, and have the results appear as text, pivot tables and/or charts (version 17) in the SPSS Viewer.
Using the Version 17 Custom Dialog Builder, you can create SPSS dialog boxes for your R programs or extension commands (or for regular SPSS syntax, too).
Of course, the standard SPSS commands do not use R, but you now have the option of extending SPSS with R packages.
Check out the documentation that comes with the R plugin and the downloadable Data Management book, which is linked on Developer Central (www.spss.com/devcentral). You can also try out the many extension commands in Version 17 for running R packages.
HTH,
Jon Peck
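A minimal sketch of what such a block looks like, assuming the R Plug-in and a matching R version are installed and a dataset is open. The object name dta is arbitrary.

```spss
BEGIN PROGRAM R.
# Fetch the active SPSS dataset as an R data frame.
dta <- spssdata.GetDataFromSPSS()
# Anything printed here appears as text output in the SPSS Viewer.
print(summary(dta))
END PROGRAM.
```

The block is run from the syntax editor like any other command. The lines between BEGIN PROGRAM R. and END PROGRAM. are ordinary R code.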
That looks right. Note that this assumes that the probability a given worker completes a given task is independent of the probability that she completes the other tasks. If you want to treat the tasks as repeated measurements, then have a look at Generalized Estimating Equations (also GENLIN). Your data setup would then be like:
Id condition unit task taskcomplete
101 1 1 1 0
...
101 1 1 30 1
And you'd add a REPEATED subcommand with id as your SUBJECT variable and task as your WITHINSUBJECT variable.
Note that GENLIN doesn't handle random effects.
Alex
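A minimal sketch of the repeated-measures (GEE) setup described above, assuming the long data layout shown (one row per task, taskcomplete coded 0/1). The condition main effect and the exchangeable working correlation are assumptions made for illustration, not choices stated in the thread.

```spss
* Long format: one row per worker per task, taskcomplete coded 0/1.
* REFERENCE=FIRST makes 0 the reference, so the model is for P(taskcomplete = 1).
GENLIN taskcomplete (REFERENCE=FIRST) BY condition unit
  /MODEL condition unit(condition) DISTRIBUTION=BINOMIAL LINK=LOGIT
  /REPEATED SUBJECT=id WITHINSUBJECT=task CORRTYPE=EXCHANGEABLE.
```

Other working correlation structures (for example INDEPENDENT or UNSTRUCTURED) can be requested through CORRTYPE. Unit still enters only as a fixed effect here.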
I'm running the following GENLIN procedure.
GENLIN TasksDone OF TotalTasks BY unit condition
  /MODEL unit(condition) DISTRIBUTION=BINOMIAL LINK=LOGIT.
Condition has two levels and unit has six levels. N per level of unit ranges
between 4 and 9.
I got this warning.
Warnings
The maximum number of step-halvings was reached but the log-likelihood value
cannot be further improved. Output for the last iteration is displayed.
The GENLIN procedure continues despite the above warning(s). Subsequent
results shown are based on the last iteration. Validity of the model fit is
uncertain.
I am completely willing to accept the criticism that my sample is too small.
However, within that limitation, is there an estimation parameter that I can
vary to try to get a valid solution?
Thanks, Gene Maguin
What does your iteration history look like? Use /PRINT HISTORY(1). In particular, what are the last few values of the parameter estimates, and which ones appear to not be converging?
Alex
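A sketch of how the iteration history can be requested, together with the estimation limits that can be raised. The values 200 and 20 are arbitrary choices for illustration, not recommendations.

```spss
GENLIN TasksDone OF TotalTasks BY unit condition
  /MODEL unit(condition) DISTRIBUTION=BINOMIAL LINK=LOGIT
  /CRITERIA MAXITERATIONS=200 MAXSTEPHALVING=20
  /PRINT FIT SUMMARY SOLUTION HISTORY(1).
```

If the history shows one or more estimates growing steadily in magnitude, raising these limits is unlikely to help. That pattern usually means some unit has a completion proportion at or very near 0 or 1, so its logit is not estimable.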
