How to deal with Multiple imputation for big data in SPSS?


keren.agay.shay
I am trying to run multiple imputation in SPSS for a variable with 35% missing
values in a data set of 2,000,000 cases.
I received the error: "The model cannot be built because a
computational error has occurred during the estimation. No output will be
displayed." None of the other variables are imputed.
Thanks for your help



--
Sent from: http://spssx-discussion.1045642.n5.nabble.com/

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Re: How to deal with Multiple imputation for big data in SPSS?

Art Kendall
Why is the data missing?  I.e., what value labels are attached to the values
that are missing?

If you create a variable 0 "not missing" 1 "missing reason 1" 2 "missing
reason 2" etc. are there associations/correlations with other variables?
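
A minimal sketch of that indicator idea, in Python rather than SPSS syntax. The missing-value codes (-8, -9), their reasons, and the toy `income` and `region` variables are all made up for illustration; substitute whatever user-missing codes the real file uses.

```python
from collections import Counter

# Hypothetical user-missing codes mapped to reason codes (assumed, not from the thread).
MISSING_REASONS = {-8: 1, -9: 2}   # 1 = "refused", 2 = "not asked"

def missing_indicator(value):
    """Return 0 for an observed value, or the reason code for a missing one."""
    return MISSING_REASONS.get(value, 0)

def crosstab(indicator, other):
    """Count cases for each (indicator, other-variable) combination."""
    return Counter(zip(indicator, other))

# Toy data: a variable with coded missing values and a grouping variable.
income = [52, -8, 61, -9, -9, 47, -8, 55]
region = ["N", "N", "S", "S", "S", "N", "N", "S"]

indicator = [missing_indicator(v) for v in income]
table = crosstab(indicator, region)

for (reason, group), count in sorted(table.items()):
    print(f"reason={reason} region={group} n={count}")
```

In this toy data every "refused" case is in region N and every "not asked" case is in region S, which is the kind of pattern that would suggest the missingness is related to other variables.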



-----
Art Kendall
Social Research Consultants
Re: How to deal with Multiple imputation for big data in SPSS?

keren.agay.shay
Thanks for your attention.
I did not explain that the data are MAR (missing at random) and not
MCAR (missing completely at random).
I have previously conducted multiple imputation in other studies, but this
is the first time I am dealing with big data.
Thanks



Re: How to deal with Multiple imputation for big data in SPSS?

Andy W
In reply to this post by keren.agay.shay
I would first try generating only 1 imputed dataset. If that does not
work, try with fewer variables. Finally, take a sample and make sure the
code works on the smaller subset. It is not obvious from the error message
that the size of the dataset is the problem.

I'd note that missing data imputation mostly improves efficiency in
estimates. With a large dataset you are unlikely to observe very different
results than just conducting complete case analysis. Even with 35% missing
you still have over 1 million cases.
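
As a rough worked example of the arithmetic (a sketch only): it assumes the missingness is unrelated to the analysis variables and that the estimate's standard error scales with 1/sqrt(n), and it compares complete cases against a hypothetical fully observed dataset, not against a multiply imputed one.

```python
import math

n_total = 2_000_000          # cases in the file, from the original post
missing_fraction = 0.35      # share of cases missing the variable

# Cases left after listwise (complete-case) deletion.
n_complete = round(n_total * (1 - missing_fraction))

# Factor by which a standard error grows when n drops from n_total to n_complete,
# assuming SE is proportional to 1 / sqrt(n).
se_inflation = math.sqrt(n_total / n_complete)

print(n_complete, round(se_inflation, 3))
```

Under those assumptions the standard errors are roughly a quarter larger than they would be with no missing data, which is small in absolute terms at this sample size.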



-----
Andy W
[hidden email]
http://andrewpwheeler.wordpress.com/
Re: How to deal with Multiple imputation for big data in SPSS?

Rich Ulrich
In reply to this post by keren.agay.shay

35% missing and it is "at random"? - seems unlikely to me, though what matters
is that it is at-random with respect to the other measures in use.

One way to confirm at-random is to see how much "Missing vs. non"  associates with
any of the other variables. With N of 2 million, look at the effect size rather than
the p-levels.
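
To illustrate why effect size is the thing to look at, here is a small Python sketch. The `age` values are toy numbers, the group sizes (700,000 missing, 1,300,000 observed) follow from 35% of 2 million, and d = 0.02 is a hypothetical "negligible" difference chosen for illustration.

```python
import math
import statistics

def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd

def z_statistic(d, n_a, n_b):
    """Approximate two-sample z statistic implied by effect size d."""
    return d / math.sqrt(1 / n_a + 1 / n_b)

# Toy data: another variable split by whether the target variable is missing.
age_when_missing = [41, 45, 39, 44, 46]
age_when_observed = [40, 44, 38, 43, 45]
d = cohens_d(age_when_missing, age_when_observed)

# A negligible difference (d = 0.02) at the sample sizes in this thread.
z_big_n = z_statistic(0.02, 700_000, 1_300_000)

print(round(d, 3), round(z_big_n, 1))
```

With groups this large, even d = 0.02 gives a z statistic above 13, so the p-value says almost nothing about whether the association is big enough to matter.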


--

Rich Ulrich


Re: How to deal with Multiple imputation for big data in SPSS?

Bruce Weaver
Administrator
In reply to this post by keren.agay.shay
Just in case anyone is unclear on the distinction between MAR and MCAR, I
think the Key Messages box in this article summarizes it quite nicely.  

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4121561/








-----
--
Bruce Weaver
[hidden email]
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

NOTE: My Hotmail account is not monitored regularly.
To send me an e-mail, please use the address shown above.

Re: How to deal with Multiple imputation for big data in SPSS?

Jon Peck
What it doesn't discuss is what to do if data are likely not MAR. Is MI still better than nothing?

--
Jon K Peck
[hidden email]

Re: How to deal with Multiple imputation for big data in SPSS?

bdates
There's a good article on MI and EM at the link below. The article has a decision tree based on the mechanism of missingness. What's important is the introduction of power as a basis for the decision: the author found that if the dataset remaining after casewise deletion is large enough to give good power for the analysis, then casewise deletion might be the best option. Andy Wheeler suggested earlier in this thread that analysis after casewise deletion might be appropriate.

To Jon's question, the decision to impute if data are NMAR is based on whether the missingness can be modeled as part of the estimation process. Generally, it's a dead end. Unfortunately, if data aren't MCAR, it's pretty much a crap shoot whether they're MAR or NMAR.

digitalcommons.wayne.edu/cgi/viewcontent.cgi?article=1964&context=jmasm
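
For a sense of what "good power" looks like after casewise deletion, here is a rough Python sketch using a normal approximation for a two-sided, two-sample comparison. This is not the article's procedure. The effect size d = 0.01 and the equal split of the roughly 1.3 million complete cases into two groups are assumptions made only for illustration.

```python
from statistics import NormalDist

def two_sample_power(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z test for effect size d."""
    normal = NormalDist()
    z_crit = normal.inv_cdf(1 - alpha / 2)
    noncentrality = d * (n_per_group / 2) ** 0.5
    # Probability of rejecting in either tail.
    return normal.cdf(noncentrality - z_crit) + normal.cdf(-noncentrality - z_crit)

# About 1.3 million complete cases, split evenly into two groups (assumed).
power_full_sample = two_sample_power(d=0.01, n_per_group=650_000)

# The same tiny effect with a sample one hundredth the size, for contrast.
power_small_sample = two_sample_power(d=0.01, n_per_group=6_500)

print(round(power_full_sample, 4), round(power_small_sample, 3))
```

Under these assumptions the complete cases alone give power above 0.999 for an effect as small as d = 0.01, so power is unlikely to be the reason to impute here; bias from the missingness mechanism is a separate question.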

Brian
