Using GLM for Repeated Measures Logistic Regression

We have a dichotomous dependent variable DRUG.  36 individuals, in random order, received drug A at one timepoint and drug B at a later timepoint.  We have a number of independent variables (e.g. blood pressure) measured at both timepoints.  We're interested in determining how well all of the independent variables together predict which drug the subject received at a given timepoint.

We were initially using the Binary Logistic Regression procedure in SPSS 17 to do this, but realized we needed to account for the fact that the measures of an independent variable for a single subject wouldn't be independent across timepoints (e.g. if blood pressure is high under drug A, blood pressure may be more likely to be high under drug B). So we moved to Generalized Linear Models, selecting the Binomial distribution and the Logit link function.  However, I can't figure out how to add the within-subject piece through the user interface.  I think I can do it through the syntax window using the "/Repeated Subject=name" line of code, but then the omnibus table disappears.  (This happens in both SPSS 17 and PASW 18.)  The code I'm using (simplified to one independent variable) is copied at the end of this post.

Is there a way to avoid losing the omnibus table?  Or a way to run it through the user interface?

And one follow-up question - the nice part of using Binary Logistic Regression rather than Generalized Linear Models is that the former gives you that nice classification table showing how many cases the model correctly predicted.  Is there any way to get something similar in Generalized Linear Models?

Thanks,
mdb

------------------------------------------------------------------
GENLIN flag (REFERENCE=LAST) WITH BloodPressure
  /MODEL BloodPressure INTERCEPT=YES
    DISTRIBUTION=BINOMIAL LINK=LOGIT
  /Repeated Subject=name
  /CRITERIA METHOD=FISHER(1) SCALE=1 COVB=MODEL MAXITERATIONS=100
    MAXSTEPHALVING=5
    PCONVERGE=1E-006(ABSOLUTE) SINGULAR=1E-012 ANALYSISTYPE=3(WALD)
    CILEVEL=95 CITYPE=WALD
    LIKELIHOOD=FULL
  /MISSING CLASSMISSING=EXCLUDE
  /PRINT CPS DESCRIPTIVES MODELINFO FIT SUMMARY SOLUTION.

Re: Using GLM for Repeated Measures Logistic Regression

The repeated measures piece can be done in the GUI through the Analyze > Generalized Linear Models > Generalized Estimating Equations dialog.

Good question about the omnibus tests; the answer is in the GENLIN algorithms, in the section on Generalized Estimating Equations:  "Since GEE is not a likelihood-based method of estimation, the inferences based on likelihoods are not possible for GEEs. Most notably, the Lagrange multiplier test, goodness-of-fit tests, and omnibus tests are invalid and will not be offered."  The algorithms are in PDF format on the installation disks, or available as part of the help system (Help > Algorithms).

I'm afraid GENLIN doesn't have a classification table as part of the output, so you would need to /SAVE PREDVAL in GENLIN and then run CROSSTABS.
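
As a rough sketch of that workaround, here is the GENLIN call from the original post with a /SAVE subcommand added, followed by CROSSTABS. The variable names flag, BloodPressure, and name come from the original syntax; the saved-variable name PredictedFlag and the EXCHANGEABLE working correlation are assumptions chosen only for illustration, and the syntax has not been run against the poster's data.

```
* Sketch only: PredictedFlag is a hypothetical name for the saved variable.
* CORRTYPE=EXCHANGEABLE is an assumed working correlation structure.
GENLIN flag (REFERENCE=LAST) WITH BloodPressure
  /MODEL BloodPressure INTERCEPT=YES DISTRIBUTION=BINOMIAL LINK=LOGIT
  /REPEATED SUBJECT=name CORRTYPE=EXCHANGEABLE COVB=ROBUST
  /MISSING CLASSMISSING=EXCLUDE
  /PRINT CPS MODELINFO FIT SUMMARY SOLUTION
  /SAVE PREDVAL(PredictedFlag).

* Cross-tabulate the observed category against the predicted category.
CROSSTABS
  /TABLES=flag BY PredictedFlag
  /CELLS=COUNT ROW.
```

The row percentages on the diagonal of the resulting table show how often each observed category was classified correctly. These are in-sample figures, so they will tend to flatter the model.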

Also note that in v19, Generalized Linear Mixed Models provides an alternative to GEE for fitting repeated measures.

Alex

 From: mdb <[hidden email]> To: [hidden email] Date: 05/20/2011 12:04 PM Subject: Using GLM for Repeated Measures Logistic Regression Sent by: "SPSSX(r) Discussion" <[hidden email]>

Re: Using GLM for Repeated Measures Logistic Regression

Alex - Your pointer to saving PREDVAL and then running CROSSTABS was excellent. Works perfectly.  But with respect to the lack of an omnibus table, how does one then determine the significance of the overall model (as opposed to the individual coefficients)?

Gene - We're actually interested in checking how good our measures (blood pressure was a simple example, but we have hundreds obtained using different tools), taken together, are at predicting whether A or B was administered; the timepoint is largely irrelevant to us.  Maybe my initial phrasing contributed to the confusion.

Bruce - Thanks for the suggested syntax - I am still playing with it.  But I'm not sure I understand the logic of including the time variable in the withinSubject.  We are actually not concerned with the timepoint at which someone received A or B. Each person got both A and B, and we don't expect order to matter.  All we're trying to do is account for the fact that they were repeated measures - as above, maybe my initial description was fuzzy on this.

Given all that, is it possible to just ignore the fact that these are repeated measures and run a standard binomial logistic regression (not through GLM), perhaps by first running another test to get comfortable that there is some independence between the repeated measures for a subject?

Thanks, all, for your help.
mdb

Re: Using GLM for Repeated Measures Logistic Regression

There isn't a significance test for this, but you can use the information criteria reported in the goodness-of-fit table to compare models.  From Help > Case Studies, then Advanced Statistics > Generalized Linear Models > Generalized Estimating Equations:

* The Quasi-likelihood under Independence Model Criterion (QIC) can be used to help you choose between two correlation structures, given a set of model terms. The structure that obtains the smaller QIC is "better" according to this criterion.
* The Corrected Quasi-likelihood under Independence Model Criterion (QICC) can be used to help you choose between two sets of model terms, given a correlation structure. The model that obtains the smaller QICC is "better" according to this criterion. The computation of the QICC assumes that the distribution, link function, and working correlation matrix specifications are all "correct" for the dataset.

You could compare the QICC for your model with one for a "null" model (one with no predictors) and the same correlation structure.  If the QICC for your model is lower, then at least you know you're doing better than guessing.
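
A minimal sketch of that comparison, assuming an intercept-only GENLIN model is specified by listing no effects on /MODEL, and assuming an EXCHANGEABLE working correlation (substitute whatever structure the full model actually uses). The variable names are taken from the original post; only the FIT output is requested, since the QICC appears in the Goodness of Fit table.

```
* Sketch only: intercept-only ("null") model with the same repeated structure.
* CORRTYPE=EXCHANGEABLE is an assumption and must match the full model.
GENLIN flag (REFERENCE=LAST)
  /MODEL INTERCEPT=YES DISTRIBUTION=BINOMIAL LINK=LOGIT
  /REPEATED SUBJECT=name CORRTYPE=EXCHANGEABLE COVB=ROBUST
  /PRINT FIT.

* Full model with the same distribution, link, and correlation structure.
GENLIN flag (REFERENCE=LAST) WITH BloodPressure
  /MODEL BloodPressure INTERCEPT=YES DISTRIBUTION=BINOMIAL LINK=LOGIT
  /REPEATED SUBJECT=name CORRTYPE=EXCHANGEABLE COVB=ROBUST
  /PRINT FIT.
```

Compare the QICC values in the two Goodness of Fit tables. Both runs need to use the same cases, so it is worth checking that missing predictor values are not dropping subjects from the full model only.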

Alex

 From: mdb <[hidden email]> To: [hidden email] Date: 05/20/2011 01:56 PM Subject: Re: Using GLM for Repeated Measures Logistic Regression Sent by: "SPSSX(r) Discussion" <[hidden email]>
