Multiple Imputation

A.C. van der Burgh

Dear all,

 

I am currently working on a research project about serum sodium levels and falling, and I am doing my analysis in SPSS. My dataset contains a continuous variable with the sodium levels; let’s call it X. Later in the analysis I want to categorize this variable so that I can look separately at hypo-, hyper- and normonatremia patients, so I created a new variable, Z, that categorizes the serum sodium levels into those three groups. However, a problem occurred after multiple imputation. Because X has some missing values, Z has missing values too (I used X to create Z), and both variables are imputed when I apply multiple imputation. The imputed values of X and Z are then no longer consistent with each other. For example, the imputed value for X is 140 while the imputed value for Z is 1 (where 1 is coded as X between 130 and 134). My question is thus: how can I solve this? Can I correct this in all datasets after imputation? I know I could do it by hand, but I have approximately 1900 subjects and 20 imputed datasets, so I hope there is a faster way.

 

Thank you in advance. I hope that my explanation is clear enough.

 

Best,

Lisa Burgh

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

Re: Multiple Imputation

Maguin, Eugene

I don’t have the MI module in my SPSS version, so I can’t make a specific comment. That said, I’d think of Z as a variable computed from an observed variable, which I guess is what X is. I’d impute X but not Z, and then compute Z from X in each imputed dataset. Yes, it is a pain to do, but easily done with a small macro. Just to speculate a bit: I think the agreement between imputed X and imputed Z tells you how predicting X in a given iteration and then recoding it to Z compares with predicting Z directly in that iteration.
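To make the "compute Z from X in each imputed dataset" step concrete, here is a minimal sketch in Python rather than SPSS syntax. It assumes the imputed datasets are stacked in one table with an imputation index column (called Imputation_ here, which I believe is what SPSS uses, with 0 for the original data). The cut-offs are hypothetical placeholders, not the ones from the study, and the function names are made up for illustration.

```python
from typing import Optional

# Hypothetical cut-offs in mmol/L; replace them with the study's own definitions.
HYPO_BELOW = 135.0
HYPER_ABOVE = 145.0


def categorize_sodium(x: Optional[float]) -> Optional[int]:
    """Return 1 (hypo), 2 (normo), 3 (hyper), or None when X is missing."""
    if x is None:
        return None
    if x < HYPO_BELOW:
        return 1
    if x > HYPER_ABOVE:
        return 3
    return 2


def recompute_z(rows):
    """Overwrite Z in every row with the recode of that row's X."""
    for row in rows:
        row["Z"] = categorize_sodium(row["X"])
    return rows


# Toy stacked data: the original record plus two imputed copies of it.
stacked = [
    {"Imputation_": 0, "id": 1, "X": None, "Z": None},
    {"Imputation_": 1, "id": 1, "X": 140.0, "Z": 1},  # inconsistent imputed pair
    {"Imputation_": 2, "id": 1, "X": 132.0, "Z": 2},  # inconsistent imputed pair
]
recompute_z(stacked)
```

Because the recode is applied row by row, it does not matter how many subjects or imputed datasets there are; the original data (X still missing) simply keep a missing Z.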

Gene Maguin

 

 

From: SPSSX(r) Discussion <[hidden email]> On Behalf Of A.C. van der Burgh
Sent: Tuesday, May 1, 2018 9:50 AM
To: [hidden email]
Subject: Multiple Imputation

 


Re: Multiple Imputation

Joost van Ginkel

Correct. Whenever you compute a composite variable from other variables, you should impute the raw data only and compute the composite variable AFTER imputation; at least, that is how to do it in SPSS. However, I would recommend the mice package in R instead. In mice you can handle composite variables as well, by specifying how they are composed of other variables; mice will then fill in the missing values on these composite variables accordingly. This has several advantages: for example, composite variables can then also be used as predictors for the missing data on other variables, while their specific relationship with the variables they are composed of stays intact.
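The idea of keeping a composite variable tied to its source inside the imputation itself can be illustrated with a toy sketch in Python. This is not mice and does not use its API: the imputation step below is a deliberately crude placeholder (a random draw from the observed X values), the cut-offs are hypothetical, and all names are invented for the example. The only point is that Z is never imputed on its own; it is derived from X each time X is filled in.

```python
import random


def recode(x, hypo_below=135.0, hyper_above=145.0):
    """Hypothetical recode: 1 = hypo, 2 = normo, 3 = hyper."""
    if x < hypo_below:
        return 1
    if x > hyper_above:
        return 3
    return 2


def impute_once(x_values, rng):
    """Fill missing X with a placeholder draw, then derive Z from the filled X."""
    observed = [x for x in x_values if x is not None]
    filled = [x if x is not None else rng.choice(observed) for x in x_values]
    z = [recode(x) for x in filled]  # Z follows X by construction
    return filled, z


def multiply_impute(x_values, m=20, seed=1):
    """Return m completed (X, Z) datasets; the seed is fixed for reproducibility."""
    rng = random.Random(seed)
    return [impute_once(x_values, rng) for _ in range(m)]


x = [128.0, 140.0, None, 150.0, None]
datasets = multiply_impute(x, m=20)
```

A real imputation model would predict X from the other variables and would propagate uncertainty properly; the sketch only shows why X and Z cannot drift apart when Z is derived rather than imputed.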

 

From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Maguin, Eugene
Sent: Tuesday, May 01, 2018 4:13 PM
To: [hidden email]
Subject: Re: Multiple Imputation

 
