Hello,
I am struggling with fitting the model to overdispersed (positively skewed) data in SPSS and want to ask for your opinion.

I measured for how long a specific behavior (B in [s]) lasted in tested subjects during a fixed time of observation. There are two independent variables/predictors, i.e., subject's sex (S: male or female) and genotype (G: 1 or 2).

My research question is whether the subject's sex or genotype affects the duration of behavior B and whether the genotype modulates sex's effect.

EXP: B ~ S + G + S*G

My data do not follow the assumptions of the general linear model, so I decided to go with generalized linear models (as far as I know, regular, nonparametric tests cannot estimate the factors' interaction, in which I am interested).

I cannot use GLMs with gamma distribution since behavior B did not appear for many subjects (B = 0 s), yet these cases are relevant for my experiment.

I decided to try GLMs Poisson and then ZIP, but they do not fit data appropriately. The best fit had GLMs negative binomial regression, and here is my question:

My data for B is a continuous variable (time measured in [s]). For the sake of my experiment, I can use the integer values (i.e., I can substitute 30,35 s > 31 s) but is this the only available approach for me to use NBR, and is it legitimate in your opinion?

Have you any other ideas on how I can handle this design and data to estimate S*G interaction?

I will genuinely appreciate your feedback.

Best regards,
Natalia

=====================
To manage your subscription to SPSSXL, send a message to [hidden email] (not to SPSSXL), with no body text except the command. To leave the list, send the command
SIGNOFF SPSSXL
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Natalia, to me your plan sounds wacky. Why not do an event history, restricting your analysis to those cases that exhibited the phenomenon whose duration you want to study? It makes no sense to include cases that did not exhibit the phenomenon in a study of how long the phenomenon lasted. I do not know what SPSS offers for doing event history; I would expect as a minimum that it would have Cox regression. Stata also has a number of different parametric event history models.

David Greenberg, Sociology Dept., NYU
Administrator

Hi David. Your suggestion of Cox regression makes me wonder if you read Natalia's post the same way I did. I understood that she was measuring the ~duration~ of some behaviour, not time to onset of the behaviour (i.e., time to event). Cox regression would be appropriate for the latter, but I don't know how one would use it for the former.

Your comments did make me think that one approach would be something similar to a hurdle model, as follows:

1. Binary logistic regression model with Y = occurrence of the behaviour/phenomenon.

2. Some kind of model with Y = duration in seconds using only those observations for which the phenomenon occurred.

When the stage 2 model includes only the cases with the phenomenon of interest, it may be that an OLS model is fine. But some other type of model could be used if OLS is not reasonable and defensible. Personally, I might be inclined to use quantile regression.

Another possibility, given that the length of the period of observation is fixed, would be to treat Y as a proportion of the total time. In Stata, one could use betareg or fracreg for that type of outcome. I think one could achieve something similar with GENLIN in SPSS, perhaps using the events-of-trials specification for the outcome, but with "events" = the duration of the behaviour, and "trials" being the total time of observation.

By the way, Natalia, what would the sample sizes be for the two stages I described above?

Cheers,
Bruce

Bruce Weaver bweaver@lakeheadu.ca http://sites.google.com/a/lakeheadu.ca/bweaver/ "When all else fails, RTFM." NOTE: My Hotmail account is not monitored regularly. To send me an email, please use the address shown above. 
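For readers who want to see the two-stage (hurdle-like) idea from the post above in concrete form, here is a minimal Python sketch. It rests on several assumptions that are not in the thread: the toy records are invented, stage 2 is done on log-seconds (the post leaves the stage 2 model open), and the design is the saturated 2 x 2 (S, G, S*G). In that saturated case the logistic fit reproduces the observed cell proportions and the OLS fit reproduces the cell means, so both interaction estimates reduce to a difference of differences.

```python
import math
from collections import defaultdict

# Hypothetical toy data, invented for illustration only:
# (sex, genotype, duration of behaviour B in seconds).
records = [
    ("F", 1, 0.0), ("F", 1, 12.5), ("F", 1, 30.35), ("F", 1, 0.0),
    ("F", 2, 4.2), ("F", 2, 0.0), ("F", 2, 18.0), ("F", 2, 7.5),
    ("M", 1, 0.0), ("M", 1, 0.0), ("M", 1, 9.1), ("M", 1, 0.0),
    ("M", 2, 22.0), ("M", 2, 41.3), ("M", 2, 0.0), ("M", 2, 15.7),
]


def logit(p):
    """Log-odds; undefined when a cell shows the behaviour never or always."""
    if not 0.0 < p < 1.0:
        raise ValueError("logit needs 0 < p < 1")
    return math.log(p / (1.0 - p))


def summarise_cells(rows):
    """Per (sex, genotype) cell: P(B > 0) and mean log-duration among B > 0."""
    cells = defaultdict(list)
    for sex, genotype, seconds in rows:
        cells[(sex, genotype)].append(seconds)
    summary = {}
    for key, durations in cells.items():
        positive = [d for d in durations if d > 0]
        summary[key] = {
            "n": len(durations),
            "n_positive": len(positive),
            "p_occurs": len(positive) / len(durations),
            # Assumes every cell has at least one positive duration.
            "mean_log_seconds": sum(math.log(d) for d in positive) / len(positive),
        }
    return summary


def interaction(cell_values):
    """S*G contrast as a difference of differences: (M2 - M1) - (F2 - F1)."""
    return (cell_values[("M", 2)] - cell_values[("M", 1)]) - (
        cell_values[("F", 2)] - cell_values[("F", 1)]
    )


cells = summarise_cells(records)
# Stage 1: did the behaviour occur at all? (logit scale)
stage1 = interaction({k: logit(v["p_occurs"]) for k, v in cells.items()})
# Stage 2: how long did it last, among subjects who showed it? (log-seconds)
stage2 = interaction({k: v["mean_log_seconds"] for k, v in cells.items()})

print(f"Stage 1 interaction (logit of occurrence): {stage1:.3f}")
print(f"Stage 2 interaction (log-duration, B > 0): {stage2:.3f}")
```

This gives point estimates only. Standard errors and tests would have to come from fitting the actual models, and the stage 2 sample sizes per cell (the question asked at the end of the post) determine whether that stage is estimable at all.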
Bruce, I understand Natalia's post the same way you do, but would stand by my suggestion. If she has a start time and an end time, she can use an event history approach to study the factors that influence the duration of the phenomenon.

David Greenberg
In reply to this post by Bruce Weaver
Another approach to consider might be Heckman regression, which is available in Statistics as an extension command.
In reply to this post by Natalia
The notion of a GLM is that there is /some/ equation with linear terms
that describe the phenomenon ... with the existence of equal intervals
in the predictor equation having an "equal effect" when accounted for
by some transformed metric and specific computations of error of fit.
One straightforward approach to seeing what is there, at all, is to make it
two questions: 0 vs. other, and predicting quantity among "other." This
is a good thing to look at because, what you get from any other approach
that has a single answer will be some weighted composite of those two
answers. You have only two predictors, so it is not so obvious as it would be
with five or ten, that you may have solutions that are silly to combine into
one equation.
Where duration is not zero, what does its distribution look like? This is,
you say, an event duration in a fixed interval. What jumps out at me is the
prospect that your "event length" might deserve treatment as logistic, bounded
at zero and max. What can you say about event length, for or against that?
DO YOU expect that the same things (in general) that predict length should
predict the event?
Are you looking at something that does have some strong and obvious effects
which you are trying to fit to a model, or are you just scrambling?

Rich Ulrich
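The "bounded at zero and max" idea, like the proportion-of-total-time suggestion earlier in the thread, can be sketched as follows. This is only an illustration under stated assumptions: the 60-second observation window and the toy records are hypothetical, and the model is the saturated 2 x 2 (S, G, S*G). For a saturated design, a fractional-logit fit (Bernoulli quasi-likelihood) has fitted means equal to the cell means of the proportions, so the interaction on the logit scale is again a difference of differences. Zeros stay in the data without special handling.

```python
import math
from collections import defaultdict

OBSERVATION_SECONDS = 60.0  # hypothetical fixed observation window

# Hypothetical toy data, invented for illustration only:
# (sex, genotype, duration of behaviour B in seconds).
records = [
    ("F", 1, 0.0), ("F", 1, 12.5), ("F", 1, 30.35), ("F", 1, 0.0),
    ("F", 2, 4.2), ("F", 2, 0.0), ("F", 2, 18.0), ("F", 2, 7.5),
    ("M", 1, 0.0), ("M", 1, 0.0), ("M", 1, 9.1), ("M", 1, 0.0),
    ("M", 2, 22.0), ("M", 2, 41.3), ("M", 2, 0.0), ("M", 2, 15.7),
]


def mean_proportion_by_cell(rows, window):
    """Mean fraction of the window spent in behaviour B, per (sex, genotype)."""
    cells = defaultdict(list)
    for sex, genotype, seconds in rows:
        cells[(sex, genotype)].append(seconds / window)
    return {key: sum(props) / len(props) for key, props in cells.items()}


def logit(p):
    """Log-odds of a mean proportion; needs 0 < p < 1 at the cell level."""
    if not 0.0 < p < 1.0:
        raise ValueError("logit needs 0 < p < 1")
    return math.log(p / (1.0 - p))


mean_props = mean_proportion_by_cell(records, OBSERVATION_SECONDS)
logits = {key: logit(p) for key, p in mean_props.items()}

# S*G contrast on the logit scale: (M2 - M1) - (F2 - F1).
interaction_logit = (logits[("M", 2)] - logits[("M", 1)]) - (
    logits[("F", 2)] - logits[("F", 1)]
)

for key in sorted(mean_props):
    print(key, f"mean proportion = {mean_props[key]:.3f}")
print(f"Interaction on the logit scale: {interaction_logit:.3f}")
```

Individual zeros are fine here because only the cell means pass through the logit. Whether a single logistic-type equation is sensible still depends on the question raised above: do the same things predict whether the event occurs and how long it lasts?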
From: SPSSX(r) Discussion <[hidden email]> on behalf of Natalia <[hidden email]>
Sent: Friday, March 12, 2021 2:17 AM
To: [hidden email] <[hidden email]>
Subject: Fitting negative binomial regression to continuous data
Heckman censored regression is a generalization of tobit regression. Censoring is modeled using probit analysis, and the observed outcomes are modeled with regression.
