# obtaining the average number of consecutive responses of a true response

8 messages

## obtaining the average number of consecutive responses of a true response

Dear List:

I have a data set with 375 items, each answered with either a true response or a false response. I want to identify respondents who are being careless or showing insufficient effort by giving the same response repeatedly. In particular, I want the average length of the runs of true responses for each individual. I have obtained some syntax that captures the maximum run length of either T or F. I have searched Google, but nothing comes close to giving me the average length of the true runs. The syntax below captures the maximum run length of either true or false. Is the average possible?

```
VECTOR v = v1 to v375.
* #run is the length of the current run; maxrun is the longest run seen so far.
COMPUTE #run = 1.
COMPUTE maxrun = 1.
LOOP #i = 2 to 375.
  DO IF v(#i) eq v(#i-1).
    * Same response as the previous item: the current run grows.
    COMPUTE #run = #run + 1.
    COMPUTE maxrun = max(maxrun, #run).
  ELSE.
    * Response changed: a new run starts.
    COMPUTE #run = 1.
  END IF.
END LOOP.
EXECUTE.
```

Martin F. Sherman, Ph.D.
Professor of Psychology
Director of Master’s Education: Thesis Track

Department of Psychology
222 B Beatty Hall
4501 North Charles Street
Baltimore, MD 21210

410-617-2417 tel
410-617-5341 fax

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L. For a list of commands to manage subscriptions, send the command INFO REFCARD.
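One way to get the quantity asked for here is to count, in a single pass over the items, how many true responses a person gave and how many separate runs of true those responses fall into; the average run length is the first count divided by the second. The sketch below is only an illustration, not syntax from the thread. It assumes the items are named v1 to v375, that true is coded 1 and false 0, and that there are no missing responses. The names nTrue, nTrueRuns and meanTrueRun are made up for the example.

```spss
* Average length of the runs of true (1) for each respondent.
* Assumes v1 TO v375 are contiguous, coded 1 = true and 0 = false, no missing values.
VECTOR v = v1 TO v375.
COMPUTE nTrue = 0.
COMPUTE nTrueRuns = 0.
LOOP #i = 1 TO 375.
  DO IF v(#i) = 1.
    COMPUTE nTrue = nTrue + 1.
    * A run of true starts at item 1 or right after a response that is not true.
    DO IF #i = 1.
      COMPUTE nTrueRuns = nTrueRuns + 1.
    ELSE IF v(#i - 1) <> 1.
      COMPUTE nTrueRuns = nTrueRuns + 1.
    END IF.
  END IF.
END LOOP.
* meanTrueRun stays system-missing for a respondent with no true responses.
IF (nTrueRuns > 0) meanTrueRun = nTrue / nTrueRuns.
EXECUTE.
```

A respondent who answers true to every item gets nTrueRuns = 1 and meanTrueRun = 375, so straight-lining shows up as an extreme value in the distribution of meanTrueRun.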

## Re: obtaining the average number of consecutive responses of a true response

The usual Runs test, which I expect to be more robust in general, tests the number of runs rather than the maximum length. You can obtain it by using FLIP to transpose the data and then running Nonparametric Tests for each individual.

If you want to test the maximum length, as you propose, you can use the cdf of the binomial function to get p-values. If the proportions are not always near 0.5, you could compute the exact mean for that binomial.

-- Rich Ulrich

In reply to: Martin Sherman, Saturday, November 5, 2016 3:11 PM
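For reference, the runs test mentioned here compares the observed number of runs with what would be expected if the order of responses were random. The formulas below are the standard large-sample (Wald-Wolfowitz) approximation, written with symbols chosen for this note: $n_1$ true responses, $n_0$ false responses, $n = n_1 + n_0$ items, and $R$ the observed number of runs of either kind. Whether SPSS applies exactly this form (for example, with or without a continuity correction) is not stated in the thread.

```latex
\[
E[R] = 1 + \frac{2 n_1 n_0}{n}, \qquad
\operatorname{Var}[R] = \frac{2 n_1 n_0 \,(2 n_1 n_0 - n)}{n^{2}\,(n - 1)}, \qquad
z = \frac{R - E[R]}{\sqrt{\operatorname{Var}[R]}}
\]
```

A strongly negative $z$ means fewer, longer runs than a random ordering would give, which is the pattern expected from someone repeating the same answer.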

## Re: obtaining the average number of consecutive responses of a true response

Good call Rich, I would use VARSTOCASES and SPLIT FILE though. Example below, plus an example heatmap to visualize the incorrect responses.

```
**************************************************.
*SIMULATING EXAMPLE DATA.
SET SEED 10.
INPUT PROGRAM.
LOOP Person = 1 TO 40.
END CASE.
END LOOP.
END FILE.
END INPUT PROGRAM.

*Simulating 100 variables, random T/F.
*Person 4 has weird run at 70 to 100.
*Person 7 has weird run 40 to 65.
*Person 32 has weird run 1 to 10.
VECTOR A(100).
LOOP #i = 1 TO 100.
  COMPUTE A(#i) = RV.BERNOULLI(0.80).
END LOOP.
DO IF Person = 4.
  RECODE A70 TO A100 (ELSE = 0).
ELSE IF Person = 7.
  RECODE A40 TO A65 (ELSE = 0).
ELSE IF Person = 32.
  RECODE A1 TO A10 (ELSE = 0).
END IF.
EXECUTE.

*CONDUCTING THE ANALYSIS.
*Now you would reshape the dataset, then split file.
VARSTOCASES /MAKE A FROM A1 TO A100 /INDEX AnswerNum.
SPLIT FILE BY Person.
NPAR TESTS /RUNS(0.5)=A.
SPLIT FILE OFF.
*May want to use OMS to make it easier to flag folks.

*EXTRA VIZ - HEATMAP OF RUNS.
FORMATS A (F1.0) Person (F2.0).
VALUE LABELS A 0 'False' 1 'Correct'.
GGRAPH
  /GRAPHDATASET NAME="graphdataset" VARIABLES=AnswerNum Person A
    MISSING=LISTWISE REPORTMISSING=NO
  /GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
  PAGE: begin(scale(600px,900px))
  SOURCE: s=userSource(id("graphdataset"))
  DATA: AnswerNum=col(source(s), name("AnswerNum"), unit.category())
  DATA: Person=col(source(s), name("Person"), unit.category())
  DATA: A=col(source(s), name("A"), unit.category())
  GUIDE: axis(dim(1), null())
  GUIDE: axis(dim(2), label("Individual Incorrect Answers"))
  GUIDE: legend(aesthetic(aesthetic.color.interior), null())
  SCALE: cat(dim(2), sort.statistic(summary.mean(A)), reverse())
  SCALE: cat(aesthetic(aesthetic.color.interior), map(("0", color.black),("1",color.white)))
  ELEMENT: polygon(position(AnswerNum*Person), color.interior(A), color.exterior(color.white))
  PAGE: end()
END GPL.
*This automatically sorts those with the most incorrect to the top of the graphic.
**************************************************.
```

I've written some code for a runs test for multiple groups, so it could be an OK exploratory tool for multiple-choice answers.

- http://stats.stackexchange.com/a/73170/1036
- code here: https://www.dropbox.com/sh/kr6qvukrw6xvue4/AABWSg-DAcoLoysqTKMyeRdNa

As is, the FLIP file approach may be easier with that, though, as that macro won't work with SPLIT FILE. (It could, but it would take a little work.)

Andy W
apwheele@gmail.com
http://andrewpwheeler.wordpress.com/
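As a possible alternative to collecting the NPAR TESTS tables with OMS, the per-person run counts can be built directly in the long file and aggregated to one row per person. This is a rough sketch rather than Andy's method. It assumes the long file from VARSTOCASES (Person, AnswerNum running 1 to 100, and A coded 0/1), SPLIT FILE off, and no missing answers. The dataset names LongAnswers and RunsByPerson and all new variable names are invented for the example, and the z value uses the plain large-sample runs formula, so it may differ slightly from what NPAR TESTS prints.

```spss
* Per-person runs summary from the long file created by VARSTOCASES.
* Name the long file so it is not discarded when another dataset is activated.
DATASET NAME LongAnswers.
SORT CASES BY Person AnswerNum.
* LAG is computed unconditionally so that it always refers to the previous row.
COMPUTE PrevA = LAG(A).
* A new run starts at each person's first answer or whenever the answer changes.
COMPUTE NewRun = (AnswerNum = 1 OR A <> PrevA).
DATASET DECLARE RunsByPerson.
AGGREGATE
  /OUTFILE='RunsByPerson'
  /BREAK=Person
  /nTrue = SUM(A)
  /nItems = N(A)
  /nRuns = SUM(NewRun).
DATASET ACTIVATE RunsByPerson.
COMPUTE nFalse = nItems - nTrue.
* Expected number of runs and its variance if the order of answers were random.
COMPUTE ExpRuns = 1 + 2 * nTrue * nFalse / nItems.
COMPUTE VarRuns = 2 * nTrue * nFalse * (2 * nTrue * nFalse - nItems) / (nItems**2 * (nItems - 1)).
* zRuns stays system-missing when a person gave only one kind of answer.
IF (VarRuns > 0) zRuns = (nRuns - ExpRuns) / SQRT(VarRuns).
EXECUTE.
```

Sorting RunsByPerson by zRuns then puts the people with the fewest runs relative to chance at the top, which is one way to flag them.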

## Re: obtaining the average number of consecutive responses of a true response

To the analysis and the graph suggested by Andy W above I might add a trick to highlight chains (runs) of 1s of different lengths with my macro function /*!runs()*/, which operates in a MATRIX session. The highlighted dataset could then be plotted by GPL syntax similar to Andy's.

```
*Andy's simulated example dataset.
SET SEED 10.
INPUT PROGRAM.
LOOP Person = 1 TO 40.
END CASE.
END LOOP.
END FILE.
END INPUT PROGRAM.

*Simulating 100 variables, random T/F.
*Person 4 has weird run at 70 to 100.
*Person 7 has weird run 40 to 65.
*Person 32 has weird run 1 to 10.
VECTOR A(100).
LOOP #i = 1 TO 100.
  COMPUTE A(#i) = RV.BERNOULLI(0.80).
END LOOP.
DO IF Person = 4.
  RECODE A70 TO A100 (ELSE = 0).
ELSE IF Person = 7.
  RECODE A40 TO A65 (ELSE = 0).
ELSE IF Person = 32.
  RECODE A1 TO A10 (ELSE = 0).
END IF.
EXECUTE.

*The code of the macro function, to read into memory.
*(you can find the function at http://www.spsstools.net/en/KO-spssmacros
*collection "Matrix - End Matrix functions").
define !runs(!pos= !token(1) /!pos= !charend('%') /!pos= !charend('%') /!pos= !charend(')'))
comp !4= !2.
comp @maxw= !3.
loop @w= 2 to @maxw.
-comp @w_= @w-1.
-comp @a= !4(:,1:(ncol(!4)-@w_)).
-comp @b= @a.
-loop @i= 2 to @w.
- comp @b= @b and !4(:,@i:(ncol(!4)-@w+@i)).
-end loop.
-comp !4(:,1:(ncol(!4)-@w_))= @a+@b.
-loop @i= 1 to @w_.
- comp @a= !4(:,2:ncol(!4)).
- comp @b= !4(:,1:(ncol(!4)-1)).
- comp !4(:,2:ncol(!4))= @a+(@a=@w_)&*(@b=@w).
-end loop.
end loop.
release @maxw,@w,@w_,@a,@b,@i.
!enddefine.

*Run the highlighting.
set mxloops 10000.
matrix.
get data /vari= A1 to A100 /names= names.
!runs(data%5%runs). /*I set argument maxw here to 5*/
save runs /out= * /names= names.
end matrix.
*In this example with maxw=5 all chains (runs) of length 1 will be coded as 1,
*of length 2 will be coded as 2, ..., of length 5+ will be coded as 5.
```

In reply to: Andy W, 06.11.2016 16:43
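For readers who prefer ordinary transformation syntax, roughly the same labelling can be produced without MATRIX. This is not Kirill's macro, only a two-pass sketch under the assumptions that the items are A1 to A100, coded 0/1, with no missing values. The new variables R1 to R100 and the cap of 5 (mirroring maxw=5) are choices made for the example.

```spss
* Label every true response with the length of the run it belongs to.
VECTOR item = A1 TO A100.
VECTOR R(100).
* Pass 1, left to right: running count of consecutive trues, reset to 0 at a false.
COMPUTE #len = 0.
LOOP #i = 1 TO 100.
  DO IF item(#i) = 1.
    COMPUTE #len = #len + 1.
  ELSE.
    COMPUTE #len = 0.
  END IF.
  COMPUTE R(#i) = #len.
END LOOP.
* Pass 2, right to left: copy each run's final count back over the whole run.
LOOP #i = 99 TO 1 BY -1.
  IF (R(#i) > 0 AND R(#i + 1) > 0) R(#i) = R(#i + 1).
END LOOP.
* Cap the labels at 5, so runs of length 5 or more are all coded 5.
LOOP #i = 1 TO 100.
  COMPUTE R(#i) = MIN(R(#i), 5).
END LOOP.
EXECUTE.
```

Unlike the MATRIX version, this keeps Person and the original A1 to A100 in the same file next to the labels, at the cost of 100 extra variables.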

## Re: obtaining the average number of consecutive responses of a true response

I ran Kirill's syntax, changed maxw to 100, and obtained a data file that contains the runs of true. Here is one subject.

Actual true and false responses (1 and 0):

.00 1.00 .00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 .00 etc.

And here is the output:

.00 1.00 .00 12.00 12.00 12.00 12.00 12.00 12.00 12.00 12.00 12.00 12.00 12.00 12.00 .00

So this gives the length of each run of true, but now I want to obtain each participant's average true-run length. Using just the values above, the average would be (1 + 12)/2 = 6.5, averaged across the two runs of true.

I want to do this across all 100 variables and get an average for each participant. Then I will examine the distribution across all participants to see which participants are outliers in average run length.

In reply to: Kirill Orlov, Sunday, November 06, 2016 9:35 AM
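Stated as a formula (the notation is introduced here and is not from the thread): if a participant's responses contain $k$ runs of true with lengths $\ell_1, \dots, \ell_k$, the quantity described above is their mean, which equals the number of true responses divided by the number of runs of true.

```latex
\[
\bar{\ell} \;=\; \frac{1}{k}\sum_{j=1}^{k} \ell_j
\;=\; \frac{\text{number of true responses}}{\text{number of runs of true}},
\qquad \text{e.g.} \quad \bar{\ell} = \frac{1 + 12}{2} = \frac{13}{2} = 6.5
\]
```

Reading the run lengths off the macro output only works if maxw is at least as large as the longest possible run, as with maxw = 100 here; with a smaller maxw the longer runs are all coded as maxw and the mean would be understated.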

## Re: obtaining the average number of consecutive responses of a true response

Martin,

Having not studied Kirill's macro, I'm mystified by the results you got. You ran Andy's example to create a dataset of 40 records of 100 variables, in which a '1' was coded with probability .8. You passed it through the macro and got what? A record with 16 values for each of your 40 people, which for one person shows 12 strings, each of length 12, all from a string of length 100 going into the macro? Or a record of the first 16 people with one value per person?

Gene Maguin

In reply to: Martin Sherman, Sunday, November 06, 2016 12:10 PM

## Re: obtaining the average number of consecutive responses of a true response

 Gene:  My mistake. The syntax I ran is below.  Creating the data used p  = .50 and then changed maxw to 100%.   Data generated ID            v1           v2           v3           v4           v5           v6           v7           v8           v9           v10         v11         v12         v13         v14                ………  v100 1.00        .00          .00          .00          .00          1.00        .00          1.00        1.00        1.00        .00          1.00        1.00        .00          .00        etc 2.00        1.00        1.00        .00          .00          1.00        .00          1.00        .00          1.00        1.00        .00          1.00        .00          1.00      etc 3.00        1.00        .00          1.00        .00          1.00        .00          .00          1.00        .00          .00          .00          .00          1.00        .00 4.00        .00          .00          .00          1.00        .00          1.00        1.00        .00          1.00        .00          1.00        1.00        1.00        .00 5.00        1.00        1.00        1.00        1.00        .00          .00          .00          .00          .00          1.00        1.00        1.00        .00          .00     Data where strings of True are created 1.00        .00          .00          .00          .00          1.00        .00          3.00        3.00        3.00        .00          2.00        2.00        .00          .00     etc 2.00        2.00        2.00        .00          .00          1.00        .00          1.00        .00          2.00        2.00        .00          1.00        .00          1.00 3.00        1.00        .00          1.00        .00          1.00        .00          .00          1.00        .00          .00          .00          .00          1.00        .00 4.00        .00          .00          .00          1.00        .00          2.00        2.00        .00          1.00        .00          3.00        3.00       
 3.00        .00 5.00        4.00        4.00        4.00        4.00        .00          .00          .00          .00          .00          3.00        3.00        3.00        .00          .00   Just using the above data the averages would be   1.00                     (1 + 3 + 2)/3    = 2.00 2.00        (2 + 1 + 1 + 2 + 1 +1)/6 =1.33 3.00        (1 + 1+ 1 + 1 + 1)/ 1.00 4.00        (1 + 2 + 1 + 3)/4 = 1.75 5.0          (4 + 3)/2= 3.5   So at this point I need to count up the number times that 1 appeared plus the number of times that 2 appeared (but I need to address the fact that when 2 is presented it is presented 2 times for each string of  True and True) plus the number of times that 3 appeared (but I need to address the fact that when 3 appears it is presented 3 times for each string of True, True, and True), etc   for example-here are three strings of  (True = 1   and False  = 0).   1 0 1 0 1 0 1  1 0 1  1 0  1  1  1  0 1  1  1  0 1 1 1   And now in terms of strings we would have     1   1   1   2  2  0 2  2  0  3  3  3  0 3  3  3   so if there are 3  ones (1) then the number I want     3X1      = 3 if there are 2 twos (2) then the number I want is       2X2      = 4 if there are 2 threes (3) then the number I want is    2X3      = 6                                                                                                        total = 13                                                                                                      13/7(distinct strings) =  1.86  average number to strings of trues   Unless my thinking and math is off.                   * Encoding: UTF-8. *Andy's simulated example dataset. SET SEED 10. INPUT PROGRAM. LOOP Person = 1 TO 40. END CASE. END LOOP. END FILE. END INPUT PROGRAM. *Simulating 100 variables, random T/F. *Person 4 has weird run at 70 to 100. *Person 7 has weird run 40 to 65. *Person 32 has weird run 1 to 10. VECTOR A(100). LOOP #i = 1 TO 100.   COMPUTE A(#i) = RV.BERNOULLI(0.50). END LOOP. DO IF Person = 4. 
  RECODE A70 TO A100 (ELSE = 0). ELSE IF Person = 7.   RECODE A40 TO A65 (ELSE = 0). ELSE IF Person = 32.   RECODE A1 TO A10 (ELSE = 0). END IF. EXECUTE.   *The code of the macro function, to read into memory. *(you can find the function at http://www.spsstools.net/en/KO-spssmacros *collection "Matrix - End Matrix fuctuions"). define !runs(!pos= !token(1) /!pos= !charend('%') /!pos= !charend('%') /!pos= !charend(')')) comp !4= !2. comp @maxw= !3. loop @w= 2 to @maxw. -comp @w_= @w-1. -comp @a= !4(:,1:(ncol(!4)-@w_)). -comp @b= @a. -loop @i= 2 to @w. - comp @b= @b and !4(:,@i:(ncol(!4)-@w+@i)). -end loop. -comp !4(:,1:(ncol(!4)-@w_))= @a+@b. -loop @i= 1 to @w_. - comp @a= !4(:,2:ncol(!4)). - comp @b= !4(:,1:(ncol(!4)-1)). - comp !4(:,2:ncol(!4))= @a+(@a=@w_)&*(@b=@w). -end loop. end loop. release @maxw,@w,@w_,@a,@b,@i. !enddefine.   *Run the highlighting. set mxloops 10000. matrix. get data /vari= A1 to A100 /names= names. !runs(data%100%runs). /*I set argument maxw here to 5   Note to MFS change this to 100 for 100 variables save runs /out= * /names= names. end matrix. *In this example with maxw=5 all chains (runs) of length 1 will be coded as 1, *of length 2  will be coded as 2, ..., of length 5+  will be coded as 5 but I changed it to 100%make is !runs(data%100%runs).   From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Maguin, Eugene Sent: Monday, November 07, 2016 9:20 AM To: [hidden email] Subject: Re: obtaining the average number of consecutive responses of a true response   Martin, Having not studied Kiril’s macro, I’m mystified by the results you got. You ran Andy’s example to create a dataset of 40 records of 100 variables and a ‘1’ was coded at probability .8. Passed it through the macro and got what? A record with 16 values for each of your 40 people and that for one person shows 12 strings of each of length 12—all that from a string of length 100 going in to the macro. Or, a record of the first 16 people with one value per person. 
Gene Maguin       From: SPSSX(r) Discussion [[hidden email]] On Behalf Of Martin Sherman Sent: Sunday, November 06, 2016 12:10 PM To: [hidden email] Subject: Re: obtaining the average number of consecutive responses of a true response   I ran the syntax below and changed the maxw to 100 and obtained a data file that contains the runs of true (here is one subject) Here is the actual true and false  (1 and 0) .00          1.00        .00          1.00        1.00        1.00        1.00        1.00        1.00        1.00        1.00        1.00        1.00        1.00        1.00                .00     etc.   And here is the output   .00          1.00        .00          12.00     12.00     12.00     12.00     12.00     12.00     12.00     12.00     12.00     12.00     12.00     12.00                .00   So this creates the number to trues for each run of true but now I want to obtain each participant’s average number of true runs Just using the above the average would be  (1 + 12)/2 = 6.5   averaged across the two runs of true   And now I want to do this across all 100 variables and get an average for each participant. The I will examine the distribution of all participants To see which participants are outliers in regard to average number of runs per participant.      From: SPSSX(r) Discussion [[hidden email]] On Behalf Of Kirill Orlov Sent: Sunday, November 06, 2016 9:35 AM To: [hidden email] Subject: Re: obtaining the average number of consecutive responses of a true response   To the analysis and the graph suggested by Andy W above I might add a trick to highlight chains (runs) of 1s of different lengths with my macro function /*!runs()*/ which operates in MATRIX session. The highlighted dataset could then be plotted by GPL syntax similar to Andy's. *Andy's simulated example dataset. SET SEED 10. INPUT PROGRAM. LOOP Person = 1 TO 40. END CASE. END LOOP. END FILE. END INPUT PROGRAM. *Simulating 100 variables, random T/F. *Person 4 has weird run at 70 to 100. 
*Person 7 has weird run 40 to 65. *Person 32 has weird run 1 to 10. VECTOR A(100). LOOP #i = 1 TO 100.   COMPUTE A(#i) = RV.BERNOULLI(0.80). END LOOP. DO IF Person = 4.   RECODE A70 TO A100 (ELSE = 0). ELSE IF Person = 7.   RECODE A40 TO A65 (ELSE = 0). ELSE IF Person = 32.   RECODE A1 TO A10 (ELSE = 0). END IF. EXECUTE. *The code of the macro function, to read into memory. *(you can find the function at http://www.spsstools.net/en/KO-spssmacros *collection "Matrix - End Matrix fuctuions"). define !runs(!pos= !token(1) /!pos= !charend('%') /!pos= !charend('%') /!pos= !charend(')')) comp !4= !2. comp @maxw= !3. loop @w= 2 to @maxw. -comp @w_= @w-1. -comp @a= !4(:,1:(ncol(!4)-@w_)). -comp @b= @a. -loop @i= 2 to @w. - comp @b= @b and !4(:,@i:(ncol(!4)-@w+@i)). -end loop. -comp !4(:,1:(ncol(!4)-@w_))= @a+@b. -loop @i= 1 to @w_. - comp @a= !4(:,2:ncol(!4)). - comp @b= !4(:,1:(ncol(!4)-1)). - comp !4(:,2:ncol(!4))= @a+(@a=@w_)&*(@b=@w). -end loop. end loop. release @maxw,@w,@w_,@a,@b,@i. !enddefine. *Run the highlighting. set mxloops 10000. matrix. get data /vari= A1 to A100 /names= names. !runs(data%5%runs). /*I set argument maxw here to 5 save runs /out= * /names= names. end matrix. *In this example with maxw=5 all chains (runs) of length 1 will be coded as 1, *of length 2  will be coded as 2, ..., of length 5+  will be coded as 5. 06.11.2016 16:43, Andy W пишет: `Good call Rich, I would use VARSTOCASES and SPLIT FILE though. 
Example below,`
`plus an example heatmap to visualize the incorrect responses.`

`**************************************************.`
`*SIMULATING EXAMPLE DATA.`
`SET SEED 10.`
`INPUT PROGRAM.`
`LOOP Person = 1 TO 40.`
`END CASE.`
`END LOOP.`
`END FILE.`
`END INPUT PROGRAM.`

`*Simulating 100 variables, random T/F.`
`*Person 4 has weird run at 70 to 100.`
`*Person 7 has weird run 40 to 65.`
`*Person 32 has weird run 1 to 10.`
`VECTOR A(100).`
`LOOP #i = 1 TO 100.`
`  COMPUTE A(#i) = RV.BERNOULLI(0.80).`
`END LOOP.`
`DO IF Person = 4.`
`  RECODE A70 TO A100 (ELSE = 0).`
`ELSE IF Person = 7.`
`  RECODE A40 TO A65 (ELSE = 0).`
`ELSE IF Person = 32.`
`  RECODE A1 TO A10 (ELSE = 0).`
`END IF.`
`EXECUTE.`

`*CONDUCTING THE ANALYSIS`
`*Now you would reshape the dataset, then split file.`
`VARSTOCASES /MAKE A FROM A1 TO A100 /INDEX AnswerNum.`
`SPLIT FILE BY Person.`
`NPAR TESTS /RUNS(0.5)=A.`
`SPLIT FILE OFF.`
`*May want to use OMS to make it easier to flag folks.`

`*EXTRA VIZ - HEATMAP OF RUNS.`
`FORMATS A (F1.0) Person (F2.0).`
`VALUE LABELS A 0 'False' 1 'Correct'.`
`GGRAPH`
`  /GRAPHDATASET NAME="graphdataset" VARIABLES=AnswerNum Person A MISSING=LISTWISE REPORTMISSING=NO`
`  /GRAPHSPEC SOURCE=INLINE.`
`BEGIN GPL`
`  PAGE: begin(scale(600px,900px))`
`  SOURCE: s=userSource(id("graphdataset"))`
`  DATA: AnswerNum=col(source(s), name("AnswerNum"), unit.category())`
`  DATA: Person=col(source(s), name("Person"), unit.category())`
`  DATA: A=col(source(s), name("A"), unit.category())`
`  GUIDE: axis(dim(1), null())`
`  GUIDE: axis(dim(2), label("Individual Incorrect Answers"))`
`  GUIDE: legend(aesthetic(aesthetic.color.interior), null())`
`  SCALE: cat(dim(2), sort.statistic(summary.mean(A)), reverse())`
`  SCALE: cat(aesthetic(aesthetic.color.interior), map(("0",color.black),("1",color.white)))`
`  ELEMENT: polygon(position(AnswerNum*Person), color.interior(A), color.exterior(color.white))`
`  PAGE: end()`
`END GPL.`
`*This automatically sorts those 
with the most incorrect to the top of the` `graphic.`
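
For the per-participant average that Martin describes, (1 + 12)/2 = 6.5 in his example, one possible sketch in plain syntax follows the same VECTOR/LOOP pattern as the original maxrun code. The mean length of the true runs equals the count of trues divided by the number of true runs, so both can be tallied in one pass. This assumes the wide 0/1 data (A1 TO A100 from the simulated example, before VARSTOCASES or the MATRIX highlighting overwrites it), contiguous variables, and no missing values. The output names n_true_runs and mean_true_run are hypothetical.

```
* Sketch: mean length of the runs of 1 (true) for each case.
* Assumes A1 TO A100 are coded 0 or 1 with no missing values.
* n_true_runs and mean_true_run are hypothetical variable names.
VECTOR v = A1 TO A100.
COMPUTE #ntrue = 0.
COMPUTE #nruns = 0.
LOOP #i = 1 TO 100.
  DO IF v(#i) EQ 1.
    * Every true adds to the total count of trues.
    COMPUTE #ntrue = #ntrue + 1.
    * A new run starts at item 1 or right after a false.
    DO IF #i EQ 1.
      COMPUTE #nruns = #nruns + 1.
    ELSE IF v(#i - 1) EQ 0.
      COMPUTE #nruns = #nruns + 1.
    END IF.
  END IF.
END LOOP.
COMPUTE n_true_runs = #nruns.
* Left system-missing when a case has no trues at all.
IF (#nruns GT 0) mean_true_run = #ntrue / #nruns.
EXECUTE.
```

For the 375-item data in the original question, the same sketch would presumably use `VECTOR v = v1 TO v375` and `LOOP #i = 1 TO 375`. The distribution of mean_true_run across participants could then be inspected with FREQUENCIES or EXAMINE to look for outliers.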