How to fill in the missing values?

albert_sun
Hi

I have data in the format below:

X  Y
1  6
2  8
3  ?
4  ?
5  12

The data has 5 rows. Every row has a value of X, but only some have a value
of Y. In this data, Y is missing for X=3 and X=4.

I want to fill in the missing values using the linear regression of the
values at X=2 and X=5.

How do I do this in SPSS?

Thanks



--
Sent from: http://spssx-discussion.1045642.n5.nabble.com/

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: How to fill in the missing values?

Jon Peck
You can do this simply by saving the predicted values from the regression.
data list list/x y.
begin data
1 6
2 8
3 .
4 .
5 12
end data.
dataset name xy.
REGRESSION
  /DEPENDENT y
  /METHOD=ENTER x
  /SAVE PRED.




--
Jon K Peck
[hidden email]


Re: How to fill in the missing values?

Jon Peck
I should have finished by adding the command to copy over missing values:
if missing(y) y = PRE_1.

If PRE_1 already existed before the regression was run, the predicted-values variable would get a different name (the numeric suffix is incremented).
Note that treating imputed values as if they were real values might give misleading statistical results.
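
A minimal sketch of the naming point, assuming the same x and y variables as in the example data: giving the saved variable an explicit name means the later step does not depend on whether PRE_1 already exists. The names y_hat and y_filled are made up for illustration, and the fill goes into a new variable rather than overwriting y.

```spss
* Save the predicted values under an explicit, hypothetical name.
REGRESSION
  /DEPENDENT y
  /METHOD=ENTER x
  /SAVE PRED(y_hat).
* Fill only the missing cases and leave the original y untouched.
COMPUTE y_filled = y.
IF MISSING(y) y_filled = y_hat.
EXECUTE.
```

Keep in mind that this regression is fitted to every case with a valid y, so for the five-row example it uses X=1, 2 and 5. The fitted line is then y = 4.77 + 1.46x, which gives about 9.15 and 10.62 at X=3 and X=4, not the 9.33 and 10.67 that a straight line through X=2 and X=5 alone would give.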



Re: How to fill in the missing values?

Kirill Orlov
In reply to this post by albert_sun
Did you mean linear interpolation?
In the Transform menu, choose Replace Missing Values.
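
In syntax, the menu route might look like the sketch below. It assumes the cases are sorted by x and that the x values are equally spaced, because RMV interpolates over case order rather than over x itself. The new variable name y_lint is an assumption for illustration.

```spss
* Linear interpolation between the nearest valid neighbours in case order.
SORT CASES BY x.
RMV /y_lint=LINT(y).
```

For the five-row example this gives 8 + (12 - 8)/3 = 9.33 at X=3 and 10.67 at X=4. A missing value in the first or last case of the series is not replaced by LINT, since there is nothing to interpolate between.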



Re: How to fill in the missing values?

albert_sun
In reply to this post by Jon Peck
Thanks for your reply. I should explain my issue further.

In the example data below, the plot of x against y looks roughly like a
logistic curve. As I am not sure how to get the regression coefficients
from SPSS, I thought that linear interpolation around the missing values
might give a good approximation (a smoother curve).

I tried all the methods under Replace Missing Values (RMV), and I think the
closest one is "mean of nearby points". The problem with this method is
that when there are two consecutive missing values, RMV gives both of them
the same estimate.


data list list/x y.
begin data
0      0.09
1      0.24
2      0.63
3      1.04
4      1.64
5      2.38
6      3.92
7      6.37
8      .
9      .
10     19.46
11     .
12     31.2
13     37.28
14     42.91
15     .
16     52.79
17     56.47
18     .
19     64.5
20     67.38
21     70.5
22     .
23     75.35
24     77.64
25     .
26     82.05
27     84.04
28     .
29     88.94
30     91.11
31     92.94
32     .
33     96.5
34     98.13
35     .
36     99.7
37     99.86
38     .
39     100
40     .
end data.

plot of x and Y

<http://spssx-discussion.1045642.n5.nabble.com/file/t339934/1.jpg>




Re: How to fill in the missing values?

bdates
In reply to this post by Kirill Orlov

I thought linear interpolation was for time series data. I didn't get a solution when I tried it. I think Jon's simple syntax approach is the correct one. Maybe Joost Van Ginkel has a solution.


Brian


Re: How to fill in the missing values?

Bruce Weaver
Administrator
In reply to this post by Jon Peck
I'll jump in here before Art K does and suggest that the OP make a new
variable to hold the imputed values so that the original variable is
preserved.

COMPUTE Y2 = Y.
IF MISSING(Y2) Y2 = PRE_1.

;-)  





-----
--
Bruce Weaver
[hidden email]
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

NOTE: My Hotmail account is not monitored regularly.
To send me an e-mail, please use the address shown above.


Re: How to fill in the missing values?

Jon Peck
Art must be lying on the beach.

The same tactic (not lying on the beach) could be used with the CURVEFIT command if you want a more flexible fit.
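
As a sketch of what that might look like for an S-shaped series such as the one posted earlier in the thread, CURVEFIT offers a logistic model (keyword LGSTIC). The upper bound of 101 is an assumption, chosen only because it has to exceed the largest observed y (100); a different bound will change the fit. The name Y2 is hypothetical, and FIT_1 is the name CURVEFIT uses for its saved fit when that name is free. This also assumes fitted values are saved for cases with missing y under the logistic model.

```spss
* Fit a logistic curve of y on x and save the fitted values.
CURVEFIT
  /VARIABLES=y WITH x
  /CONSTANT
  /MODEL=LGSTIC
  /UPPERBOUND=101
  /PLOT FIT
  /SAVE=PRED.
* Copy the fitted values into a new variable only where y is missing.
COMPUTE Y2 = y.
IF MISSING(Y2) Y2 = FIT_1.
EXECUTE.
```

It is worth checking the fit plot before using the filled values, since a poor choice of model or bound will show up there first.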


Re: How to fill in the missing values?

Bruce Weaver
Administrator
Good point re CURVEFIT.  It had also occurred to me that one could use
UNIANOVA to save the unstandardized fitted values.  But when I tried it (see
below), I found that it did not save the fitted values for cases where Y was
missing.  I did not expect that!  (I'm using 64-bit SPSS 25.0.0.2 for
Windows, by the way.)  

Any thoughts on why UNIANOVA behaves differently than CURVEFIT and
REGRESSION, Jon?  


SHOW MXWARNS.
PRESERVE.
SET MXWARNS=0.
DATA LIST FREE / X Y.
begin data
0      0.09    
1      0.24    
2      0.63    
3      1.04
4      1.64    
5      2.38    
6      3.92    
7      6.37
8      .          
9       .
10     19.46  
11      .
12      31.2
13      37.28
14      42.91
15      .
16      52.79
17      56.47
18      .
19      64.5
20      67.38
21      70.5
22      .
23      75.35
24      77.64
25      .
26      82.05
27      84.04
28      .
29      88.94
30      91.11
31      92.94
32      .
33      96.5
34      98.13
35      .
36      99.7
37      99.86
38      .
39      100
40      .
END DATA.
RESTORE.
SHOW MXWARNS.

* Mean-center X to prevent REGRESSION from excluding X from the model later.
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=
  /Xmean=MEAN(x) .
COMPUTE X = x-Xmean.
* COMPUTE X^2 and X^3 for use with REGRESSION later.
COMPUTE Xsq = X**2.
COMPUTE Xcu = X**3.

GRAPH
  /SCATTERPLOT(BIVAR)=X WITH Y
  /MISSING=LISTWISE.

* At first glance, it looks like a cubic fit might not be too bad.
* It may not fit as well at the extremes of the X-axis, but let's go with it for now.

* [1] Use CURVEFIT to estimate the model and save the fitted values of Y.

* Curve Estimation.
TSET MXNEWVAR=1.
CURVEFIT
  /VARIABLES=Y WITH X
  /CONSTANT
  /MODEL=CUBIC
  /PLOT FIT
  /PRINT ANOVA
  /SAVE=PRED .

* [2] Now use REGRESSION.

REGRESSION
  /DEPENDENT Y
  /METHOD=ENTER X Xsq Xcu
  /SAVE PRED.

* [3] Finally, use UNIANOVA.

* Note that UNIANOVA does not require computation of the polynomial terms,
* as they can be specified on the DESIGN sub-command as x*x and x*x*x.

UNIANOVA Y WITH X
  /SAVE=PRED
  /CRITERIA=ALPHA(0.05)
  /DESIGN=X X*X X*X*X.

VARIABLE LABELS
 X "X (mean-centered)"
 Y "Y (observed)"
 FIT_1 "Y-hat from CURVEFIT"
 PRE_1 "Y-hat from REGRESSION"
 PRE_2 "Y-hat from UNIANOVA"
.

DESCRIPTIVES x y FIT_1 PRE_1 PRE_2.

TEMPORARY.
SELECT IF MISSING(y).
LIST x y FIT_1 PRE_1 PRE_2.

OUTPUT from that final LIST command:

       X        Y       FIT_1       PRE_1    PRE_2
 
  -12.00      .      16.63361    16.63361      .
  -11.00      .      20.23010    20.23010      .
   -9.00      .      27.78512    27.78512      .
   -5.00      .      43.77652    43.77652      .
   -2.00      .      55.90363    55.90363      .
    2.00      .      71.21205    71.21205      .
    5.00      .      81.33909    81.33909      .
    8.00      .      89.69805    89.69805      .
   12.00      .      97.14831    97.14831      .
   15.00      .      99.25626    99.25626      .
   18.00      .      97.77430    97.77430      .
   20.00      .      94.52202    94.52202      .

* How about that--UNIANOVA does not generate fitted values for
* cases where Y is missing.  I did not know that.  




Re: How to fill in the missing values?

Art Kendall
In reply to this post by Jon Peck
It is still a little cool here in 33772. Only 71.  So I am sitting at my desk
looking across the lake at the park on the other side.

Actually, I'm working on a Statistics Without Borders project.



-----
Art Kendall
Social Research Consultants

Re: How to fill in the missing values?

Jon Peck
In reply to this post by Bruce Weaver
"Any thoughts on why UNIANOVA behaves differently than CURVEFIT and
REGRESSION, Jon?  "

I don't know, but I suspect that it was convenient (and sometimes useful) to save predicted values for missing value cases in REGRESSION, because it has its own filtering/selection process while UNIANOVA does not.  CURVEFIT code was probably based on REGRESSION.  I don't see any documentation for any of these procedures that specifies the intended behavior.


