Quantcast

Reshaping from Wide to Long with Differing Dates and Variable Labels

classic Classic list List threaded Threaded
13 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Reshaping from Wide to Long with Differing Dates and Variable Labels

bdates
Hi, all.

I have a file in wide format with the following variables.

ID  Date1 VarA1 VarB1 VarC1 VarD1...Date2 VarA2 VarB2 VarC2 VarD2...

The numerals identify the time of the assessment. The letters identify the repeating variables for each time of assessment. I have a macro that will identify the first and last assessment dates and variables. However, for most of my work I'm looking for the dates and variables which occurred within the time a patient was in a particular program. This last task is easy given that I can match the id's to a separate file with dates of program enrollment. However, in order to do that I think that I need to move this into long format with the following look.

ID  Date1 VarA1 VarB1 VarC1 VarD1
ID  Date2 VarA2 VarB2 VarC2 VarD2

Once in long format, I know how to vector the first and last occurrences back into a wide format for analysis. My challenge is how to get the data into long format given that the dates and variables are all labeled differently and there are frequently 50 or more administrations of the assessment. Thanks.

Brian

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Reshaping from Wide to Long with Differing Dates and Variable Labels

Maguin, Eugene
So, since the current data set is wide format, every record has the same number of variables but some records might have valid data for only the DATE1 variable set while others have valid data for all 50 variable sets and everything in between.

One way is to do a massive reordering of the current dataset to this ordering
Date1 date2 ... date50, vara1 ... vara50, varb1 ... varb50, etc.
Then you can do a (massive) casestovars. Lots of "make" statements on this one. There's no limit on the number of make statements.

Another way is to do a (massive) varstocases to create a three variable dataset (personid, indexvar, datavar). Perhaps some people missed some time points and then came back in or missed some variables within a timepoint; you need to keep all variables in the dataset so the NULL subcommand has to be Keep! Probably the variableset date variables have a format different from the data variables. I think you should change format for the date variables to match that of the data variables (vara1, etc) because there will be a format mismatch between the variables being restructured. (At this point you'll the matching file operation and select the variables to be retained.) Then construct a variableset id variable and do a little sorting magic so that the date variable and the associated data variables for that time point all have the same variable set id.  Then do a casestovars using personid and variableset id as the id variables.

Gene Maguin





-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Brian Dates
Sent: Monday, May 15, 2017 4:02 PM
To: [hidden email]
Subject: Reshaping from Wide to Long with Differing Dates and Variable Labels

Hi, all.

I have a file in wide format with the following variables.

ID  Date1 VarA1 VarB1 VarC1 VarD1...Date2 VarA2 VarB2 VarC2 VarD2...

The numerals identify the time of the assessment. The letters identify the repeating variables for each time of assessment. I have a macro that will identify the first and last assessment dates and variables. However, for most of my work I'm looking for the dates and variables which occurred within the time a patient was in a particular program. This last task is easy given that I can match the id's to a separate file with dates of program enrollment. However, in order to do that I think that I need to move this into long format with the following look.

ID  Date1 VarA1 VarB1 VarC1 VarD1
ID  Date2 VarA2 VarB2 VarC2 VarD2

Once in long format, I know how to vector the first and last occurrences back into a wide format for analysis. My challenge is how to get the data into long format given that the dates and variables are all labeled differently and there are frequently 50 or more administrations of the assessment. Thanks.

Brian

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Reshaping from Wide to Long with Differing Dates and Variable Labels

David Marso
Administrator
I can't test this because my SPSS install has tanked ;-(

Do a massive VARSTOCASES

COMPUTE @=1.
VARSTOCASES /MAKE var FROM date1 TO @/INDEX=strindex(var).
The use CHAR.INDEX to extract the numeric part from strindex.
Then use CASESTOVARS to build the desired chopped sectioned data.

Maguin, Eugene wrote
So, since the current data set is wide format, every record has the same number of variables but some records might have valid data for only the DATE1 variable set while others have valid data for all 50 variable sets and everything in between.

One way is to do a massive reordering of the current dataset to this ordering
Date1 date2 ... date50, vara1 ... vara50, varb1 ... varb50, etc.
Then you can do a (massive) casestovars. Lots of "make" statements on this one. There's no limit on the number of make statements.

Another way is to do a (massive) varstocases to create a three variable dataset (personid, indexvar, datavar). Perhaps some people missed some time points and then came back in or missed some variables within a timepoint; you need to keep all variables in the dataset so the NULL subcommand has to be Keep! Probably the variableset date variables have a format different from the data variables. I think you should change format for the date variables to match that of the data variables (vara1, etc) because there will be a format mismatch between the variables being restructured. (At this point you'll the matching file operation and select the variables to be retained.) Then construct a variableset id variable and do a little sorting magic so that the date variable and the associated data variables for that time point all have the same variable set id.  Then do a casestovars using personid and variableset id as the id variables.

Gene Maguin





-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Brian Dates
Sent: Monday, May 15, 2017 4:02 PM
To: [hidden email]
Subject: Reshaping from Wide to Long with Differing Dates and Variable Labels

Hi, all.

I have a file in wide format with the following variables.

ID  Date1 VarA1 VarB1 VarC1 VarD1...Date2 VarA2 VarB2 VarC2 VarD2...

The numerals identify the time of the assessment. The letters identify the repeating variables for each time of assessment. I have a macro that will identify the first and last assessment dates and variables. However, for most of my work I'm looking for the dates and variables which occurred within the time a patient was in a particular program. This last task is easy given that I can match the id's to a separate file with dates of program enrollment. However, in order to do that I think that I need to move this into long format with the following look.

ID  Date1 VarA1 VarB1 VarC1 VarD1
ID  Date2 VarA2 VarB2 VarC2 VarD2

Once in long format, I know how to vector the first and last occurrences back into a wide format for analysis. My challenge is how to get the data into long format given that the dates and variables are all labeled differently and there are frequently 50 or more administrations of the assessment. Thanks.

Brian

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Reshaping from Wide to Long with Differing Dates and Variable Labels

bdates
Gene/David,

First...David, I'm sorry about the SPSS Install. Good luck. Second, thanks for the input; it seems to be going in the same direction...a massive VARSTOCASES. I'll try it and then get back re: it's success.

Brian
________________________________________
From: SPSSX(r) Discussion [[hidden email]] on behalf of David Marso [[hidden email]]
Sent: Monday, May 15, 2017 6:28 PM
To: [hidden email]
Subject: Re: Reshaping from Wide to Long with Differing Dates and Variable Labels

I can't test this because my SPSS install has tanked ;-(

Do a massive VARSTOCASES

COMPUTE @=1.
VARSTOCASES /MAKE var FROM date1 TO @/INDEX=strindex(var).
The use CHAR.INDEX to extract the numeric part from strindex.
Then use CASESTOVARS to build the desired chopped sectioned data.


Maguin, Eugene wrote

> So, since the current data set is wide format, every record has the same
> number of variables but some records might have valid data for only the
> DATE1 variable set while others have valid data for all 50 variable sets
> and everything in between.
>
> One way is to do a massive reordering of the current dataset to this
> ordering
> Date1 date2 ... date50, vara1 ... vara50, varb1 ... varb50, etc.
> Then you can do a (massive) casestovars. Lots of "make" statements on this
> one. There's no limit on the number of make statements.
>
> Another way is to do a (massive) varstocases to create a three variable
> dataset (personid, indexvar, datavar). Perhaps some people missed some
> time points and then came back in or missed some variables within a
> timepoint; you need to keep all variables in the dataset so the NULL
> subcommand has to be Keep! Probably the variableset date variables have a
> format different from the data variables. I think you should change format
> for the date variables to match that of the data variables (vara1, etc)
> because there will be a format mismatch between the variables being
> restructured. (At this point you'll the matching file operation and select
> the variables to be retained.) Then construct a variableset id variable
> and do a little sorting magic so that the date variable and the associated
> data variables for that time point all have the same variable set id.
> Then do a casestovars using personid and variableset id as the id
> variables.
>
> Gene Maguin
>
>
>
>
>
> -----Original Message-----
> From: SPSSX(r) Discussion [mailto:

> SPSSX-L@.UGA

> ] On Behalf Of Brian Dates
> Sent: Monday, May 15, 2017 4:02 PM
> To:

> SPSSX-L@.UGA

> Subject: Reshaping from Wide to Long with Differing Dates and Variable
> Labels
>
> Hi, all.
>
> I have a file in wide format with the following variables.
>
> ID  Date1 VarA1 VarB1 VarC1 VarD1...Date2 VarA2 VarB2 VarC2 VarD2...
>
> The numerals identify the time of the assessment. The letters identify the
> repeating variables for each time of assessment. I have a macro that will
> identify the first and last assessment dates and variables. However, for
> most of my work I'm looking for the dates and variables which occurred
> within the time a patient was in a particular program. This last task is
> easy given that I can match the id's to a separate file with dates of
> program enrollment. However, in order to do that I think that I need to
> move this into long format with the following look.
>
> ID  Date1 VarA1 VarB1 VarC1 VarD1
> ID  Date2 VarA2 VarB2 VarC2 VarD2
>
> Once in long format, I know how to vector the first and last occurrences
> back into a wide format for analysis. My challenge is how to get the data
> into long format given that the dates and variables are all labeled
> differently and there are frequently 50 or more administrations of the
> assessment. Thanks.
>
> Brian
>
> =====================
> To manage your subscription to SPSSX-L, send a message to

> LISTSERV@.UGA

>  (not to SPSSX-L), with no body text except the command. To leave the
> list, send the command SIGNOFF SPSSX-L For a list of commands to manage
> subscriptions, send the command INFO REFCARD
>
> =====================
> To manage your subscription to SPSSX-L, send a message to

> LISTSERV@.UGA

>  (not to SPSSX-L), with no body text except the
> command. To leave the list, send the command
> SIGNOFF SPSSX-L
> For a list of commands to manage subscriptions, send the command
> INFO REFCARD





-----
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
--
View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Reshaping-from-Wide-to-Long-with-Differing-Dates-and-Variable-Labels-tp5734156p5734158.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Reshaping from Wide to Long with Differing Dates and Variable Labels

Rich Ulrich
In reply to this post by bdates

The problem seems to be the long set of MAKE specs in CasesToVars.

I would avoid them by using the technique that I used before CasesToVars existed,

setting up a loop and assigning variables to be written with XSAVE.


For these data, you have something like VECTOR items= Date1 to VarD55  (say)

which will be 55*5 in length.


The you do a loop # = 1 to 55, and

Compute Row= 5*# -4.

COMPUTE  Date= items(Row).

COMPUTE  VarA= iems(Row+1).

COMPUTE  VarB= items(Row+2).

COMPUTE  VarC= items(Row+3).

COMPUTE  VarD= items(Row+4).


XSAVE  outfile= ....  /vars= Id Row Date VarA VarB VarC VarD .

end the loop and Execute.


--

Rich Ulrich


From: SPSSX(r) Discussion <[hidden email]> on behalf of Brian Dates <[hidden email]>
Sent: Monday, May 15, 2017 4:02:27 PM
To: [hidden email]
Subject: Reshaping from Wide to Long with Differing Dates and Variable Labels
 
Hi, all.

I have a file in wide format with the following variables.

ID  Date1 VarA1 VarB1 VarC1 VarD1...Date2 VarA2 VarB2 VarC2 VarD2...

The numerals identify the time of the assessment. The letters identify the repeating variables for each time of assessment. I have a macro that will identify the first and last assessment dates and variables. However, for most of my work I'm looking for the dates and variables which occurred within the time a patient was in a particular program. This last task is easy given that I can match the id's to a separate file with dates of program enrollment. However, in order to do that I think that I need to move this into long format with the following look.

ID  Date1 VarA1 VarB1 VarC1 VarD1
ID  Date2 VarA2 VarB2 VarC2 VarD2

Once in long format, I know how to vector the first and last occurrences back into a wide format for analysis. My challenge is how to get the data into long format given that the dates and variables are all labeled differently and there are frequently 50 or more administrations of the assessment. Thanks.

Brian

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Reshaping from Wide to Long with Differing Dates and Variable Labels

David Marso
Administrator
You can use a single MAKE subcommand with an index and parse the index variable as I previously suggested.
If you take Rich's path note that LOOP has a BY clause which is useful.
LOOP #=1 TO 55 BY 5.
etc...

Another approach would be to use RESHAPE in MATRIX language.
Untested due to FUBARed SPSS .

MATRIX.
GET data / FILE */VARIABLES date1 TO varD50.
GET id /FILE * /VARIABLE ID.
COMPUTE n=NROW(data).
COMPUTE p=NCOL(data)/5.
SAVE ({KRONECKER(ID,MAKE(p,1,1),RESHAPE(data,n*p,5) } )
   /OUTFILE *
  /VARIABLES date varA varB varC varD.
END MATRIX.



Rich Ulrich wrote
The problem seems to be the long set of MAKE specs in CasesToVars.

I would avoid them by using the technique that I used before CasesToVars existed,

setting up a loop and assigning variables to be written with XSAVE.


For these data, you have something like VECTOR items= Date1 to VarD55  (say)

which will be 55*5 in length.


The you do a loop # = 1 to 55, and

Compute Row= 5*# -4.

COMPUTE  Date= items(Row).

COMPUTE  VarA= iems(Row+1).

COMPUTE  VarB= items(Row+2).

COMPUTE  VarC= items(Row+3).

COMPUTE  VarD= items(Row+4).


XSAVE  outfile= ....  /vars= Id Row Date VarA VarB VarC VarD .

end the loop and Execute.


--

Rich Ulrich

________________________________
From: SPSSX(r) Discussion <[hidden email]> on behalf of Brian Dates <[hidden email]>
Sent: Monday, May 15, 2017 4:02:27 PM
To: [hidden email]
Subject: Reshaping from Wide to Long with Differing Dates and Variable Labels

Hi, all.

I have a file in wide format with the following variables.

ID  Date1 VarA1 VarB1 VarC1 VarD1...Date2 VarA2 VarB2 VarC2 VarD2...

The numerals identify the time of the assessment. The letters identify the repeating variables for each time of assessment. I have a macro that will identify the first and last assessment dates and variables. However, for most of my work I'm looking for the dates and variables which occurred within the time a patient was in a particular program. This last task is easy given that I can match the id's to a separate file with dates of program enrollment. However, in order to do that I think that I need to move this into long format with the following look.

ID  Date1 VarA1 VarB1 VarC1 VarD1
ID  Date2 VarA2 VarB2 VarC2 VarD2

Once in long format, I know how to vector the first and last occurrences back into a wide format for analysis. My challenge is how to get the data into long format given that the dates and variables are all labeled differently and there are frequently 50 or more administrations of the assessment. Thanks.

Brian

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Reshaping from Wide to Long with Differing Dates and Variable Labels

David Marso
Administrator
Test solution for El Grande VarsToCases.  
I opted for AGGREGATE rather than CASESTOVARS as I could not get the names of Root to appear as new variable names.  This method also avoids any SORT which would typically be required.


/* Simulate Brian's data */.

DEFINE !renameAll (n !TOKENS(1) / list !CMDEND).
!LET !vlist=!Null !DO !I=1 !TO !n !DO !Var !IN ( !List)
!LET !vlist=!CONCAT(!vlist,!Var,!I," ")
!DOEND !DOEND
RENAME VARIABLES (X2 TO X251 = !vlist ).
!ENDDEFINE .

MATRIX.
SAVE MAKE(10000,251,1)/OUTFILE * / VARIABLES X1 TO X251 .
END MATRIX.
DATASET NAME raw.
RENAME VARIABLES x1=ID.

SET PRINTBACK ON MPRINT ON.
!renameAll n=50 list=date VarA VarB VarC VarD.

/* Simulation complete */.


/* Actual code */.
COMPUTE ID=$CASENUM.
COMPUTE @=1.
VARSTOCASES / MAKE values FROM date1 TO @ /INDEX =vnames(values).
SELECT IF vnames NE '@'.
COMPUTE firstNumber=CHAR.INDEX(vnames,'0123456789',1).
STRING Root (A8).
COMPUTE NewIndex=NUMBER(CHAR.SUBSTR(vnames,firstNumber),F2).
COMPUTE Root=CHAR.SUBSTR(vnames,1,firstNumber-1).
DO REPEAT v=date VarA VarB VarC VarD     /s="date" "VarA" "VarB" "VarC" "VarD".
IF (Root EQ s) v= values.
END REPEAT.
AGGREGATE OUTFILE * / BREAK ID NewIndex  / date VarA VarB VarC VarD=MAX(date VarA VarB VarC VarD).



David Marso wrote
You can use a single MAKE subcommand with an index and parse the index variable as I previously suggested.
If you take Rich's path note that LOOP has a BY clause which is useful.
LOOP #=1 TO 55 BY 5.
etc...

Another approach would be to use RESHAPE in MATRIX language.
Untested due to FUBARed SPSS .

MATRIX.
GET data / FILE */VARIABLES date1 TO varD50.
GET id /FILE * /VARIABLE ID.
COMPUTE n=NROW(data).
COMPUTE p=NCOL(data)/5.
SAVE ({KRONECKER(ID,MAKE(p,1,1),RESHAPE(data,n*p,5) } )
   /OUTFILE *
  /VARIABLES date varA varB varC varD.
END MATRIX.



Rich Ulrich wrote
The problem seems to be the long set of MAKE specs in CasesToVars.

I would avoid them by using the technique that I used before CasesToVars existed,

setting up a loop and assigning variables to be written with XSAVE.


For these data, you have something like VECTOR items= Date1 to VarD55  (say)

which will be 55*5 in length.


The you do a loop # = 1 to 55, and

Compute Row= 5*# -4.

COMPUTE  Date= items(Row).

COMPUTE  VarA= iems(Row+1).

COMPUTE  VarB= items(Row+2).

COMPUTE  VarC= items(Row+3).

COMPUTE  VarD= items(Row+4).


XSAVE  outfile= ....  /vars= Id Row Date VarA VarB VarC VarD .

end the loop and Execute.


--

Rich Ulrich

________________________________
From: SPSSX(r) Discussion <[hidden email]> on behalf of Brian Dates <[hidden email]>
Sent: Monday, May 15, 2017 4:02:27 PM
To: [hidden email]
Subject: Reshaping from Wide to Long with Differing Dates and Variable Labels

Hi, all.

I have a file in wide format with the following variables.

ID  Date1 VarA1 VarB1 VarC1 VarD1...Date2 VarA2 VarB2 VarC2 VarD2...

The numerals identify the time of the assessment. The letters identify the repeating variables for each time of assessment. I have a macro that will identify the first and last assessment dates and variables. However, for most of my work I'm looking for the dates and variables which occurred within the time a patient was in a particular program. This last task is easy given that I can match the id's to a separate file with dates of program enrollment. However, in order to do that I think that I need to move this into long format with the following look.

ID  Date1 VarA1 VarB1 VarC1 VarD1
ID  Date2 VarA2 VarB2 VarC2 VarD2

Once in long format, I know how to vector the first and last occurrences back into a wide format for analysis. My challenge is how to get the data into long format given that the dates and variables are all labeled differently and there are frequently 50 or more administrations of the assessment. Thanks.

Brian

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Reshaping from Wide to Long with Differing Dates and Variable Labels

David Marso
Administrator
A tested solution using MATRIX.

SET MXLOOPS=50.
MATRIX.
GET data / FILE */VARIABLES date1 TO varD50.
GET id /FILE * /VARIABLE ID.
COMPUTE IndX=MAKE(NCOL(data)/5,1,1).
LOOP #=1 TO NCOL(data)/5.
COMPUTE IndX(#)=#.
END LOOP.
SAVE  ({KRONEKER(ID,MAKE(NCOL(data)/5,1,1) ), KRONEKER(MAKE(NROW(data),1,1), IndX),RESHAPE(data,NROW(data)*NCOL(data)/5,5) })
        /OUTFILE =*/VARIABLES =ID index date varA varB varC varD.
END MATRIX.
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Reshaping from Wide to Long with Differing Dates and Variable Labels

Bruce Weaver
Administrator
In reply to this post by David Marso
Nice one, David.  But try it with actual date variables.  When I tried that (for Date1, Date2, and Date3 only), the date values were expressed as # of seconds since October 14, 1582, and I had to use ALTER TYPE to get a real date variable.  I expect this will also be an issue for the MATRIX solution you posted.


David Marso wrote
Test solution for El Grande VarsToCases.  
I opted for AGGREGATE rather than CASESTOVARS as I could not get the names of Root to appear as new variable names.  This method also avoids any SORT which would typically be required.


/* Simulate Brian's data */.

DEFINE !renameAll (n !TOKENS(1) / list !CMDEND).
!LET !vlist=!Null !DO !I=1 !TO !n !DO !Var !IN ( !List)
!LET !vlist=!CONCAT(!vlist,!Var,!I," ")
!DOEND !DOEND
RENAME VARIABLES (X2 TO X251 = !vlist ).
!ENDDEFINE .

MATRIX.
SAVE MAKE(10000,251,1)/OUTFILE * / VARIABLES X1 TO X251 .
END MATRIX.
DATASET NAME raw.
RENAME VARIABLES x1=ID.

SET PRINTBACK ON MPRINT ON.
!renameAll n=50 list=date VarA VarB VarC VarD.

/* Simulation complete */.


/* Actual code */.
COMPUTE ID=$CASENUM.
COMPUTE @=1.
VARSTOCASES / MAKE values FROM date1 TO @ /INDEX =vnames(values).
SELECT IF vnames NE '@'.
COMPUTE firstNumber=CHAR.INDEX(vnames,'0123456789',1).
STRING Root (A8).
COMPUTE NewIndex=NUMBER(CHAR.SUBSTR(vnames,firstNumber),F2).
COMPUTE Root=CHAR.SUBSTR(vnames,1,firstNumber-1).
DO REPEAT v=date VarA VarB VarC VarD     /s="date" "VarA" "VarB" "VarC" "VarD".
IF (Root EQ s) v= values.
END REPEAT.
AGGREGATE OUTFILE * / BREAK ID NewIndex  / date VarA VarB VarC VarD=MAX(date VarA VarB VarC VarD).



David Marso wrote
You can use a single MAKE subcommand with an index and parse the index variable as I previously suggested.
If you take Rich's path note that LOOP has a BY clause which is useful.
LOOP #=1 TO 55 BY 5.
etc...

Another approach would be to use RESHAPE in MATRIX language.
Untested due to FUBARed SPSS .

MATRIX.
GET data / FILE */VARIABLES date1 TO varD50.
GET id /FILE * /VARIABLE ID.
COMPUTE n=NROW(data).
COMPUTE p=NCOL(data)/5.
SAVE ({KRONECKER(ID,MAKE(p,1,1),RESHAPE(data,n*p,5) } )
   /OUTFILE *
  /VARIABLES date varA varB varC varD.
END MATRIX.



Rich Ulrich wrote
The problem seems to be the long set of MAKE specs in CasesToVars.

I would avoid them by using the technique that I used before CasesToVars existed,

setting up a loop and assigning variables to be written with XSAVE.


For these data, you have something like VECTOR items= Date1 to VarD55  (say)

which will be 55*5 in length.


The you do a loop # = 1 to 55, and

Compute Row= 5*# -4.

COMPUTE  Date= items(Row).

COMPUTE  VarA= iems(Row+1).

COMPUTE  VarB= items(Row+2).

COMPUTE  VarC= items(Row+3).

COMPUTE  VarD= items(Row+4).


XSAVE  outfile= ....  /vars= Id Row Date VarA VarB VarC VarD .

end the loop and Execute.


--

Rich Ulrich

________________________________
From: SPSSX(r) Discussion <[hidden email]> on behalf of Brian Dates <[hidden email]>
Sent: Monday, May 15, 2017 4:02:27 PM
To: [hidden email]
Subject: Reshaping from Wide to Long with Differing Dates and Variable Labels

Hi, all.

I have a file in wide format with the following variables.

ID  Date1 VarA1 VarB1 VarC1 VarD1...Date2 VarA2 VarB2 VarC2 VarD2...

The numerals identify the time of the assessment. The letters identify the repeating variables for each time of assessment. I have a macro that will identify the first and last assessment dates and variables. However, for most of my work I'm looking for the dates and variables which occurred within the time a patient was in a particular program. This last task is easy given that I can match the id's to a separate file with dates of program enrollment. However, in order to do that I think that I need to move this into long format with the following look.

ID  Date1 VarA1 VarB1 VarC1 VarD1
ID  Date2 VarA2 VarB2 VarC2 VarD2

Once in long format, I know how to vector the first and last occurrences back into a wide format for analysis. My challenge is how to get the data into long format given that the dates and variables are all labeled differently and there are frequently 50 or more administrations of the assessment. Thanks.

Brian

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

NOTE: My Hotmail account is not monitored regularly.
To send me an e-mail, please use the address shown above.
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Reshaping from Wide to Long with Differing Dates and Variable Labels

David Marso
Administrator
Just after the MATRIX code use

FORMATS date (ADATE).
Or whatever date format rocks your world.

On Tue, May 16, 2017 at 2:05 PM, Bruce Weaver [via SPSSX Discussion] <[hidden email]> wrote:
Nice one, David.  But try it with actual date variables.  When I tried that (for Date1, Date2, and Date3 only), the date values were expressed as # of seconds since October 14, 1582, and I had to use ALTER TYPE to get a real date variable.  I expect this will also be an issue for the MATRIX solution you posted.


David Marso wrote
Test solution for El Grande VarsToCases.  
I opted for AGGREGATE rather than CASESTOVARS as I could not get the names of Root to appear as new variable names.  This method also avoids any SORT which would typically be required.


/* Simulate Brian's data */.

DEFINE !renameAll (n !TOKENS(1) / list !CMDEND).
!LET !vlist=!Null !DO !I=1 !TO !n !DO !Var !IN ( !List)
!LET !vlist=!CONCAT(!vlist,!Var,!I," ")
!DOEND !DOEND
RENAME VARIABLES (X2 TO X251 = !vlist ).
!ENDDEFINE .

MATRIX.
SAVE MAKE(10000,251,1)/OUTFILE * / VARIABLES X1 TO X251 .
END MATRIX.
DATASET NAME raw.
RENAME VARIABLES x1=ID.

SET PRINTBACK ON MPRINT ON.
!renameAll n=50 list=date VarA VarB VarC VarD.

/* Simulation complete */.


/* Actual code */.
COMPUTE ID=$CASENUM.
COMPUTE @=1.
VARSTOCASES / MAKE values FROM date1 TO @ /INDEX =vnames(values).
SELECT IF vnames NE '@'.
COMPUTE firstNumber=CHAR.INDEX(vnames,'0123456789',1).
STRING Root (A8).
COMPUTE NewIndex=NUMBER(CHAR.SUBSTR(vnames,firstNumber),F2).
COMPUTE Root=CHAR.SUBSTR(vnames,1,firstNumber-1).
DO REPEAT v=date VarA VarB VarC VarD     /s="date" "VarA" "VarB" "VarC" "VarD".
IF (Root EQ s) v= values.
END REPEAT.
AGGREGATE OUTFILE * / BREAK ID NewIndex  / date VarA VarB VarC VarD=MAX(date VarA VarB VarC VarD).



David Marso wrote
You can use a single MAKE subcommand with an index and parse the index variable as I previously suggested.
If you take Rich's path note that LOOP has a BY clause which is useful.
LOOP #=1 TO 55 BY 5.
etc...

Another approach would be to use RESHAPE in MATRIX language.
Untested due to FUBARed SPSS .

MATRIX.
GET data / FILE */VARIABLES date1 TO varD50.
GET id /FILE * /VARIABLE ID.
COMPUTE n=NROW(data).
COMPUTE p=NCOL(data)/5.
SAVE ({KRONECKER(ID,MAKE(p,1,1),RESHAPE(data,n*p,5) } )
   /OUTFILE *
  /VARIABLES date varA varB varC varD.
END MATRIX.



Rich Ulrich wrote
The problem seems to be the long set of MAKE specs in CasesToVars.

I would avoid them by using the technique that I used before CasesToVars existed,

setting up a loop and assigning variables to be written with XSAVE.


For these data, you have something like VECTOR items= Date1 to VarD55  (say)

which will be 55*5 in length.


The you do a loop # = 1 to 55, and

Compute Row= 5*# -4.

COMPUTE  Date= items(Row).

COMPUTE  VarA= iems(Row+1).

COMPUTE  VarB= items(Row+2).

COMPUTE  VarC= items(Row+3).

COMPUTE  VarD= items(Row+4).


XSAVE  outfile= ....  /vars= Id Row Date VarA VarB VarC VarD .

end the loop and Execute.


--

Rich Ulrich

________________________________
From: SPSSX(r) Discussion <[hidden email]> on behalf of Brian Dates <[hidden email]>
Sent: Monday, May 15, 2017 4:02:27 PM
To: [hidden email]
Subject: Reshaping from Wide to Long with Differing Dates and Variable Labels

Hi, all.

I have a file in wide format with the following variables.

ID  Date1 VarA1 VarB1 VarC1 VarD1...Date2 VarA2 VarB2 VarC2 VarD2...

The numerals identify the time of the assessment. The letters identify the repeating variables for each time of assessment. I have a macro that will identify the first and last assessment dates and variables. However, for most of my work I'm looking for the dates and variables which occurred within the time a patient was in a particular program. This last task is easy given that I can match the id's to a separate file with dates of program enrollment. However, in order to do that I think that I need to move this into long format with the following look.

ID  Date1 VarA1 VarB1 VarC1 VarD1
ID  Date2 VarA2 VarB2 VarC2 VarD2

Once in long format, I know how to vector the first and last occurrences back into a wide format for analysis. My challenge is how to get the data into long format given that the dates and variables are all labeled differently and there are frequently 50 or more administrations of the assessment. Thanks.

Brian

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
--
Bruce Weaver
[hidden email]
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

NOTE: My Hotmail account is not monitored regularly.
To send me an e-mail, please use the address shown above.



To unsubscribe from Reshaping from Wide to Long with Differing Dates and Variable Labels, click here.
NAML

Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Reshaping from Wide to Long with Differing Dates and Variable Labels

bdates

Already solved that one. I’m working on the CASESTOVARS syntax. Thanks everyone.

 

B

 

Brian Dates, M.A.
Director of Evaluation and Research | Evaluation & Research | Southwest Counseling Solutions
Southwest Solutions
1906 25th Street, Detroit, MI 48216
313-297-1391 office | 313-849-2702 fax
[hidden email] | www.swsol.org

 

From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of David Marso
Sent: Tuesday, May 16, 2017 2:44 PM
To: [hidden email]
Subject: Re: Reshaping from Wide to Long with Differing Dates and Variable Labels

 

Just after the MATRIX code use

 

FORMATS date (ADATE).

Or whatever date format rocks your world.

 

On Tue, May 16, 2017 at 2:05 PM, Bruce Weaver [via SPSSX Discussion] <[hidden email]> wrote:

Nice one, David.  But try it with actual date variables.  When I tried that (for Date1, Date2, and Date3 only), the date values were expressed as # of seconds since October 14, 1582, and I had to use ALTER TYPE to get a real date variable.  I expect this will also be an issue for the MATRIX solution you posted.

David Marso wrote

Test solution for El Grande VarsToCases.  
I opted for AGGREGATE rather than CASESTOVARS as I could not get the names of Root to appear as new variable names.  This method also avoids any SORT which would typically be required.


/* Simulate Brian's data */.

DEFINE !renameAll (n !TOKENS(1) / list !CMDEND).
!LET !vlist=!Null !DO !I=1 !TO !n !DO !Var !IN ( !List)
!LET !vlist=!CONCAT(!vlist,!Var,!I," ")
!DOEND !DOEND
RENAME VARIABLES (X2 TO X251 = !vlist ).
!ENDDEFINE .

MATRIX.
SAVE MAKE(10000,251,1)/OUTFILE * / VARIABLES X1 TO X251 .
END MATRIX.
DATASET NAME raw.
RENAME VARIABLES x1=ID.

SET PRINTBACK ON MPRINT ON.
!renameAll n=50 list=date VarA VarB VarC VarD.

/* Simulation complete */.


/* Actual code */.
COMPUTE ID=$CASENUM.
COMPUTE @=1.
VARSTOCASES / MAKE values FROM date1 TO @ /INDEX =vnames(values).
SELECT IF vnames NE '@'.
COMPUTE firstNumber=CHAR.INDEX(vnames,'0123456789',1).
STRING Root (A8).
COMPUTE NewIndex=NUMBER(CHAR.SUBSTR(vnames,firstNumber),F2).
COMPUTE Root=CHAR.SUBSTR(vnames,1,firstNumber-1).
DO REPEAT v=date VarA VarB VarC VarD     /s="date" "VarA" "VarB" "VarC" "VarD".
IF (Root EQ s) v= values.
END REPEAT.
AGGREGATE OUTFILE * / BREAK ID NewIndex  / date VarA VarB VarC VarD=MAX(date VarA VarB VarC VarD).


David Marso wrote

You can use a single MAKE subcommand with an index and parse the index variable as I previously suggested.
If you take Rich's path note that LOOP has a BY clause which is useful.
LOOP #=1 TO 55 BY 5.
etc...

Another approach would be to use RESHAPE in MATRIX language.
Untested due to FUBARed SPSS .

MATRIX.
GET data / FILE */VARIABLES date1 TO varD50.
GET id /FILE * /VARIABLE ID.
COMPUTE n=NROW(data).
COMPUTE p=NCOL(data)/5.
SAVE ({KRONECKER(ID,MAKE(p,1,1),RESHAPE(data,n*p,5) } )
   /OUTFILE *
  /VARIABLES date varA varB varC varD.
END MATRIX.


Rich Ulrich wrote

The problem seems to be the long set of MAKE specs in CasesToVars.

I would avoid them by using the technique that I used before CasesToVars existed,

setting up a loop and assigning variables to be written with XSAVE.


For these data, you have something like VECTOR items= Date1 to VarD55  (say)

which will be 55*5 in length.


The you do a loop # = 1 to 55, and

Compute Row= 5*# -4.

COMPUTE  Date= items(Row).

COMPUTE  VarA= iems(Row+1).

COMPUTE  VarB= items(Row+2).

COMPUTE  VarC= items(Row+3).

COMPUTE  VarD= items(Row+4).


XSAVE  outfile= ....  /vars= Id Row Date VarA VarB VarC VarD .

end the loop and Execute.


--

Rich Ulrich

________________________________
From: SPSSX(r) Discussion <[hidden email]> on behalf of Brian Dates <[hidden email]>
Sent: Monday, May 15, 2017 4:02:27 PM
To: [hidden email]
Subject: Reshaping from Wide to Long with Differing Dates and Variable Labels

Hi, all.

I have a file in wide format with the following variables.

ID  Date1 VarA1 VarB1 VarC1 VarD1...Date2 VarA2 VarB2 VarC2 VarD2...

The numerals identify the time of the assessment. The letters identify the repeating variables for each time of assessment. I have a macro that will identify the first and last assessment dates and variables. However, for most of my work I'm looking for the dates and variables which occurred within the time a patient was in a particular program. This last task is easy given that I can match the id's to a separate file with dates of program enrollment. However, in order to do that I think that I need to move this into long format with the following look.

ID  Date1 VarA1 VarB1 VarC1 VarD1
ID  Date2 VarA2 VarB2 VarC2 VarD2

Once in long format, I know how to vector the first and last occurrences back into a wide format for analysis. My challenge is how to get the data into long format given that the dates and variables are all labeled differently and there are frequently 50 or more administrations of the assessment. Thanks.

Brian

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

--
Bruce Weaver
[hidden email]
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

NOTE: My Hotmail account is not monitored regularly.
To send me an e-mail, please use the address shown above.

 


To unsubscribe from Reshaping from Wide to Long with Differing Dates and Variable Labels, click here.
NAML

 

Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"

 


View this message in context: Re: Reshaping from Wide to Long with Differing Dates and Variable Labels
Sent from the SPSSX Discussion mailing list archive at Nabble.com.
===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Reshaping from Wide to Long with Differing Dates and Variable Labels

David Marso
Administrator
Brian,
  I got very close earlier but all my values were a constant.
This is cleaner than having to know all of the variable names ;-)
The Key here is /AUTOFIX=NO.

COMPUTE @=1.
VARSTOCASES / MAKE values FROM date1 TO @ /INDEX =vnames(values).
SELECT IF vnames NE '@'.
COMPUTE #firstNumber=CHAR.INDEX(vnames,'0123456789',1).
STRING Root (A8).
COMPUTE NewIndex=NUMBER(CHAR.SUBSTR(vnames,#firstNumber),F2).
COMPUTE Root=CHAR.SUBSTR(vnames,1,#firstNumber-1).
EXECUTE.
DELETE VARIABLES vnames .
CASESTOVARS /ID=ID NewIndex /INDEX=Root /AUTOFIX=NO.

On Tue, May 16, 2017 at 2:46 PM, bdates [via SPSSX Discussion] <[hidden email]> wrote:

Already solved that one. I’m working on the CASESTOVARS syntax. Thanks everyone.

 

B

 

Brian Dates, M.A.
Director of Evaluation and Research | Evaluation & Research | Southwest Counseling Solutions
Southwest Solutions
1906 25th Street, Detroit, MI 48216
<a href="tel:(313)%20297-1391" target="_blank" value="+13132971391">313-297-1391 office | <a href="tel:(313)%20849-2702" target="_blank" value="+13138492702">313-849-2702 fax
[hidden email] | www.swsol.org

 

From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of David Marso
Sent: Tuesday, May 16, 2017 2:44 PM
To: [hidden email]
Subject: Re: Reshaping from Wide to Long with Differing Dates and Variable Labels

 

Just after the MATRIX code use

 

FORMATS date (ADATE).

Or whatever date format rocks your world.

 

On Tue, May 16, 2017 at 2:05 PM, Bruce Weaver [via SPSSX Discussion] <[hidden email]> wrote:

Nice one, David.  But try it with actual date variables.  When I tried that (for Date1, Date2, and Date3 only), the date values were expressed as # of seconds since October 14, 1582, and I had to use ALTER TYPE to get a real date variable.  I expect this will also be an issue for the MATRIX solution you posted.

David Marso wrote

Test solution for El Grande VarsToCases.  
I opted for AGGREGATE rather than CASESTOVARS as I could not get the names of Root to appear as new variable names.  This method also avoids any SORT which would typically be required.


/* Simulate Brian's data */.

DEFINE !renameAll (n !TOKENS(1) / list !CMDEND).
!LET !vlist=!Null !DO !I=1 !TO !n !DO !Var !IN ( !List)
!LET !vlist=!CONCAT(!vlist,!Var,!I," ")
!DOEND !DOEND
RENAME VARIABLES (X2 TO X251 = !vlist ).
!ENDDEFINE .

MATRIX.
SAVE MAKE(10000,251,1)/OUTFILE * / VARIABLES X1 TO X251 .
END MATRIX.
DATASET NAME raw.
RENAME VARIABLES x1=ID.

SET PRINTBACK ON MPRINT ON.
!renameAll n=50 list=date VarA VarB VarC VarD.

/* Simulation complete */.


/* Actual code */.
COMPUTE ID=$CASENUM.
COMPUTE @=1.
VARSTOCASES / MAKE values FROM date1 TO @ /INDEX =vnames(values).
SELECT IF vnames NE '@'.
COMPUTE firstNumber=CHAR.INDEX(vnames,'0123456789',1).
STRING Root (A8).
COMPUTE NewIndex=NUMBER(CHAR.SUBSTR(vnames,firstNumber),F2).
COMPUTE Root=CHAR.SUBSTR(vnames,1,firstNumber-1).
DO REPEAT v=date VarA VarB VarC VarD     /s="date" "VarA" "VarB" "VarC" "VarD".
IF (Root EQ s) v= values.
END REPEAT.
AGGREGATE OUTFILE * / BREAK ID NewIndex  / date VarA VarB VarC VarD=MAX(date VarA VarB VarC VarD).


David Marso wrote

You can use a single MAKE subcommand with an index and parse the index variable as I previously suggested.
If you take Rich's path note that LOOP has a BY clause which is useful.
LOOP #=1 TO 55 BY 5.
etc...

Another approach would be to use RESHAPE in MATRIX language.
Untested due to FUBARed SPSS .

MATRIX.
GET data / FILE */VARIABLES date1 TO varD50.
GET id /FILE * /VARIABLE ID.
COMPUTE n=NROW(data).
COMPUTE p=NCOL(data)/5.
SAVE ({KRONECKER(ID,MAKE(p,1,1),RESHAPE(data,n*p,5) } )
   /OUTFILE *
  /VARIABLES date varA varB varC varD.
END MATRIX.


Rich Ulrich wrote

The problem seems to be the long set of MAKE specs in CasesToVars.

I would avoid them by using the technique that I used before CasesToVars existed,

setting up a loop and assigning variables to be written with XSAVE.


For these data, you have something like VECTOR items= Date1 to VarD55  (say)

which will be 55*5 in length.


The you do a loop # = 1 to 55, and

Compute Row= 5*# -4.

COMPUTE  Date= items(Row).

COMPUTE  VarA= iems(Row+1).

COMPUTE  VarB= items(Row+2).

COMPUTE  VarC= items(Row+3).

COMPUTE  VarD= items(Row+4).


XSAVE  outfile= ....  /vars= Id Row Date VarA VarB VarC VarD .

end the loop and Execute.


--

Rich Ulrich

________________________________
From: SPSSX(r) Discussion <[hidden email]> on behalf of Brian Dates <[hidden email]>
Sent: Monday, May 15, 2017 4:02:27 PM
To: [hidden email]
Subject: Reshaping from Wide to Long with Differing Dates and Variable Labels

Hi, all.

I have a file in wide format with the following variables.

ID  Date1 VarA1 VarB1 VarC1 VarD1...Date2 VarA2 VarB2 VarC2 VarD2...

The numerals identify the time of the assessment. The letters identify the repeating variables for each time of assessment. I have a macro that will identify the first and last assessment dates and variables. However, for most of my work I'm looking for the dates and variables which occurred within the time a patient was in a particular program. This last task is easy given that I can match the id's to a separate file with dates of program enrollment. However, in order to do that I think that I need to move this into long format with the following look.

ID  Date1 VarA1 VarB1 VarC1 VarD1
ID  Date2 VarA2 VarB2 VarC2 VarD2

Once in long format, I know how to vector the first and last occurrences back into a wide format for analysis. My challenge is how to get the data into long format given that the dates and variables are all labeled differently and there are frequently 50 or more administrations of the assessment. Thanks.

Brian

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

--
Bruce Weaver
[hidden email]
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

NOTE: My Hotmail account is not monitored regularly.
To send me an e-mail, please use the address shown above.

 


To unsubscribe from Reshaping from Wide to Long with Differing Dates and Variable Labels, click here.
NAML

 

Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"

 


View this message in context: Re: Reshaping from Wide to Long with Differing Dates and Variable Labels
Sent from the SPSSX Discussion mailing list archive at Nabble.com.
===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD


To unsubscribe from Reshaping from Wide to Long with Differing Dates and Variable Labels, click here.
NAML

Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Reshaping from Wide to Long with Differing Dates and Variable Labels

bdates
David,

Thank-you very much. I'm separated from my SPSS right now, so I'll try this in the morning. Thanks to Bruce, Gene, and Rich as well. I really appreciate this.

B

From: SPSSX(r) Discussion [[hidden email]] on behalf of David Marso [[hidden email]]
Sent: Tuesday, May 16, 2017 4:29 PM
To: [hidden email]
Subject: Re: Reshaping from Wide to Long with Differing Dates and Variable Labels

Brian,
  I got very close earlier but all my values were a constant.
This is cleaner than having to know all of the variable names ;-)
The Key here is /AUTOFIX=NO.

COMPUTE @=1.
VARSTOCASES / MAKE values FROM date1 TO @ /INDEX =vnames(values).
SELECT IF vnames NE '@'.
COMPUTE #firstNumber=CHAR.INDEX(vnames,'0123456789',1).
STRING Root (A8).
COMPUTE NewIndex=NUMBER(CHAR.SUBSTR(vnames,#firstNumber),F2).
COMPUTE Root=CHAR.SUBSTR(vnames,1,#firstNumber-1).
EXECUTE.
DELETE VARIABLES vnames .
CASESTOVARS /ID=ID NewIndex /INDEX=Root /AUTOFIX=NO.

On Tue, May 16, 2017 at 2:46 PM, bdates [via SPSSX Discussion] <[hidden email]> wrote:

Already solved that one. I’m working on the CASESTOVARS syntax. Thanks everyone.

 

B

 

Brian Dates, M.A.
Director of Evaluation and Research | Evaluation & Research | Southwest Counseling Solutions
Southwest Solutions
1906 25th Street, Detroit, MI 48216
<a href="tel:(313)%20297-1391" target="_blank" value="+13132971391">313-297-1391 office | <a href="tel:(313)%20849-2702" target="_blank" value="+13138492702">313-849-2702 fax
[hidden email] | www.swsol.org

 

From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of David Marso
Sent: Tuesday, May 16, 2017 2:44 PM
To: [hidden email]
Subject: Re: Reshaping from Wide to Long with Differing Dates and Variable Labels

 

Just after the MATRIX code use

 

FORMATS date (ADATE).

Or whatever date format rocks your world.

 

On Tue, May 16, 2017 at 2:05 PM, Bruce Weaver [via SPSSX Discussion] <[hidden email]> wrote:

Nice one, David.  But try it with actual date variables.  When I tried that (for Date1, Date2, and Date3 only), the date values were expressed as # of seconds since October 14, 1582, and I had to use ALTER TYPE to get a real date variable.  I expect this will also be an issue for the MATRIX solution you posted.

David Marso wrote

Test solution for El Grande VarsToCases.  
I opted for AGGREGATE rather than CASESTOVARS as I could not get the names of Root to appear as new variable names.  This method also avoids any SORT which would typically be required.


/* Simulate Brian's data */.

DEFINE !renameAll (n !TOKENS(1) / list !CMDEND).
!LET !vlist=!Null !DO !I=1 !TO !n !DO !Var !IN ( !List)
!LET !vlist=!CONCAT(!vlist,!Var,!I," ")
!DOEND !DOEND
RENAME VARIABLES (X2 TO X251 = !vlist ).
!ENDDEFINE .

MATRIX.
SAVE MAKE(10000,251,1)/OUTFILE * / VARIABLES X1 TO X251 .
END MATRIX.
DATASET NAME raw.
RENAME VARIABLES x1=ID.

SET PRINTBACK ON MPRINT ON.
!renameAll n=50 list=date VarA VarB VarC VarD.

/* Simulation complete */.


/* Actual code */.
COMPUTE ID=$CASENUM.
COMPUTE @=1.
VARSTOCASES / MAKE values FROM date1 TO @ /INDEX =vnames(values).
SELECT IF vnames NE '@'.
COMPUTE firstNumber=CHAR.INDEX(vnames,'0123456789',1).
STRING Root (A8).
COMPUTE NewIndex=NUMBER(CHAR.SUBSTR(vnames,firstNumber),F2).
COMPUTE Root=CHAR.SUBSTR(vnames,1,firstNumber-1).
DO REPEAT v=date VarA VarB VarC VarD     /s="date" "VarA" "VarB" "VarC" "VarD".
IF (Root EQ s) v= values.
END REPEAT.
AGGREGATE OUTFILE * / BREAK ID NewIndex  / date VarA VarB VarC VarD=MAX(date VarA VarB VarC VarD).


David Marso wrote

You can use a single MAKE subcommand with an index and parse the index variable as I previously suggested.
If you take Rich's path note that LOOP has a BY clause which is useful.
LOOP #=1 TO 55 BY 5.
etc...

Another approach would be to use RESHAPE in MATRIX language.
Untested due to FUBARed SPSS .

MATRIX.
GET data / FILE */VARIABLES date1 TO varD50.
GET id /FILE * /VARIABLE ID.
COMPUTE n=NROW(data).
COMPUTE p=NCOL(data)/5.
SAVE ({KRONECKER(ID,MAKE(p,1,1),RESHAPE(data,n*p,5) } )
   /OUTFILE *
  /VARIABLES date varA varB varC varD.
END MATRIX.


Rich Ulrich wrote

The problem seems to be the long set of MAKE specs in CasesToVars.

I would avoid them by using the technique that I used before CasesToVars existed,

setting up a loop and assigning variables to be written with XSAVE.


For these data, you have something like VECTOR items= Date1 to VarD55  (say)

which will be 55*5 in length.


The you do a loop # = 1 to 55, and

Compute Row= 5*# -4.

COMPUTE  Date= items(Row).

COMPUTE  VarA= iems(Row+1).

COMPUTE  VarB= items(Row+2).

COMPUTE  VarC= items(Row+3).

COMPUTE  VarD= items(Row+4).


XSAVE  outfile= ....  /vars= Id Row Date VarA VarB VarC VarD .

end the loop and Execute.


--

Rich Ulrich

________________________________
From: SPSSX(r) Discussion <[hidden email]> on behalf of Brian Dates <[hidden email]>
Sent: Monday, May 15, 2017 4:02:27 PM
To: [hidden email]
Subject: Reshaping from Wide to Long with Differing Dates and Variable Labels

Hi, all.

I have a file in wide format with the following variables.

ID  Date1 VarA1 VarB1 VarC1 VarD1...Date2 VarA2 VarB2 VarC2 VarD2...

The numerals identify the time of the assessment. The letters identify the repeating variables for each time of assessment. I have a macro that will identify the first and last assessment dates and variables. However, for most of my work I'm looking for the dates and variables which occurred within the time a patient was in a particular program. This last task is easy given that I can match the id's to a separate file with dates of program enrollment. However, in order to do that I think that I need to move this into long format with the following look.

ID  Date1 VarA1 VarB1 VarC1 VarD1
ID  Date2 VarA2 VarB2 VarC2 VarD2

Once in long format, I know how to vector the first and last occurrences back into a wide format for analysis. My challenge is how to get the data into long format given that the dates and variables are all labeled differently and there are frequently 50 or more administrations of the assessment. Thanks.

Brian

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

--
Bruce Weaver
[hidden email]
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

NOTE: My Hotmail account is not monitored regularly.
To send me an e-mail, please use the address shown above.

 


To unsubscribe from Reshaping from Wide to Long with Differing Dates and Variable Labels, click here.
NAML

 

Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"

 


View this message in context: Re: Reshaping from Wide to Long with Differing Dates and Variable Labels
Sent from the SPSSX Discussion mailing list archive at Nabble.com.
===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD


To unsubscribe from Reshaping from Wide to Long with Differing Dates and Variable Labels, click here.
NAML

Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"


View this message in context: Re: Reshaping from Wide to Long with Differing Dates and Variable Labels
Sent from the SPSSX Discussion mailing list archive at Nabble.com.
===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Loading...