Counting data in multiple columns - Clarification

classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

Counting data in multiple columns - Clarification

Paul Sillitoe

Thank you for the initial replies to my query; here are some data from my 400-case, 60-variable dataset, which will hopefully better explain what I wish to do:

 

CASE  SIGDRAWN    SIGTRACED   SIGCHKED    SIGPASS     SIGAPRVD

1     Present     Annotation  C-CHECKED   Present     [No Data]

2     Present     [No Data]   C-CHECKED   Present     [No Data]

3     Present     [No Data]   C-CHECKED   Present     [No Data]

4     Present     Present     [No Data]   Present     Present

5     Present     Present     Present     Present     Present

 

The same data in comma-delimited form:

 

CASE,SIGDRAWN,SIGTRACED,SIGCHKED,SIGPASS,SIGAPRVD

1,Present,Annotation,C-CHECKED,Present,[No Data],2,Present,[No Data],C-CHECKED,Present,[No Data],3,Present,[No Data],C-CHECKED,Present,[No Data],4,Present,Present,[No Data],Present,Present,5,Present,Present,Present,Present,Present

 

 

EXAMPLE QUERY 1: Show a number for the total number of cases which contain any data value (excluding [No Data]) in columns, SIGTRACED,SIGCHKED,SIGPASS

 

SOLUTION 1: Cases 1,5 meet those criteria.

Total number of cases therefore = 2 (Required result)

 

 

EXAMPLE QUERY 2: Show a number for the total number of cases which contain any data value (excluding [No Data]) in columns SIGDRAWN,SIGCHKED,SIGPASS

 

SOLUTION 2: Cases 1,2,3,5 meet those criteria.

Total number of cases therefore = 4 (Required result)

 

I anticipate that a procedure might need to use some simple syntax, rather than just a COUNT across columns, because of the different data values that are to be included.

 

Thank you.

 

Paul S.

 

Reply | Threaded
Open this post in threaded view
|

Re: Counting data in multiple columns - Clarification

Justin Black
Paul,
 
I didn't see the initial replies, so apologies if this is a repeat of something you've already tried, but...Does the syntax below achieve the desired result?
 
IF (sigtraced NE '' AND sigchked NE '' AND sigpass NE '') query1 = 1 .
IF (sigdrawn NE '' AND sigchked NE '' AND sigpass NE '') query2 = 1 .
VALUE LABELS query1 query2 1 'Number of cases meeting criteria' .
FREQUENCIES VARIABLES = query1 query2 .
 
Best,
 
--Justin

On Sat, May 8, 2010 at 7:05 AM, Paul Sillitoe <[hidden email]> wrote:

Thank you for the initial replies to my query; here are some data from my 400-case, 60-variable dataset, which will hopefully better explain what I wish to do:

 

CASE  SIGDRAWN    SIGTRACED   SIGCHKED    SIGPASS     SIGAPRVD

1     Present     Annotation  C-CHECKED   Present     [No Data]

2     Present     [No Data]   C-CHECKED   Present     [No Data]

3     Present     [No Data]   C-CHECKED   Present     [No Data]

4     Present     Present     [No Data]   Present     Present

5     Present     Present     Present     Present     Present

 

The same data in comma-delimited form:

 

CASE,SIGDRAWN,SIGTRACED,SIGCHKED,SIGPASS,SIGAPRVD

1,Present,Annotation,C-CHECKED,Present,[No Data],2,Present,[No Data],C-CHECKED,Present,[No Data],3,Present,[No Data],C-CHECKED,Present,[No Data],4,Present,Present,[No Data],Present,Present,5,Present,Present,Present,Present,Present

 

 

EXAMPLE QUERY 1: Show a number for the total number of cases which contain any data value (excluding [No Data]) in columns, SIGTRACED,SIGCHKED,SIGPASS

 

SOLUTION 1: Cases 1,5 meet those criteria.

Total number of cases therefore = 2 (Required result)

 

 

EXAMPLE QUERY 2: Show a number for the total number of cases which contain any data value (excluding [No Data]) in columns SIGDRAWN,SIGCHKED,SIGPASS

 

SOLUTION 2: Cases 1,2,3,5 meet those criteria.

Total number of cases therefore = 4 (Required result)

 

I anticipate that a procedure might need to use some simple syntax, rather than just a COUNT across columns, because of the different data values that are to be included.

 

Thank you.

 

Paul S.

 


Reply | Threaded
Open this post in threaded view
|

Re: Counting data in multiple columns

Richard Ristow
In reply to this post by Paul Sillitoe
At 07:05 AM 5/8/2010, Paul Sillitoe wrote (well-presented, by the way):

Here are some data from my 400-case, 60-variable dataset, which will hopefully better explain what I wish to do:
|-----------------------------|---------------------------|
|Output Created               |09-MAY-2010 23:29:04       |
|-----------------------------|---------------------------|
CASE SIGDRAWN   SIGTRACED  SIGCHKED   SIGPASS    SIGAPRVD

  1  Present    Annotation C-CHECKED  Present    [No Data]
  2  Present    [No Data]  C-CHECKED  Present    [No Data]
  3  Present    [No Data]  C-CHECKED  Present    [No Data]
  4  Present    Present    [No Data]  Present    Present
  5  Present    Present    Present    Present    Present

Number of cases read:  5    Number of cases listed:  5


EXAMPLE QUERY 1: Show a number for the total number of cases which contain any data value (excluding [No Data]) in columns, SIGTRACED,SIGCHKED,SIGPASS
 
SOLUTION 1: Cases 1,5 meet those criteria.
Total number of cases therefore = 2 (Required result)
 
EXAMPLE QUERY 2: Show a number for the total number of cases which contain any data value (excluding [No Data]) in columns SIGDRAWN,SIGCHKED,SIGPASS
 
SOLUTION 2: Cases 1,2,3,5 meet those criteria.
Total number of cases therefore = 4 (Required result)
 
I anticipate that a procedure might need to use some simple syntax, rather than just a COUNT across columns, because of the different data values that are to be included.

That's probably right. It looks like you want to be able to mark cases satisfying any logical expression on 'present/absent' for your 5 categories. There's no easier way than writing those expressions in syntax as you need them.

But the expressions may be easier to write if you first create 5 5 yes/no variables, designating data present/absent for each of the five fields. Like this (tested):

RECODE  SIGDRAWN,SIGTRACED,SIGCHKED,SIGPASS,SIGAPRVD
    ('Present','Annotation','C-CHECKED' = 1)
    ('       ','[No Data]'              = 0)
    (ELSE                               = 9)
   INTO IsDRAWN, IsTRACED, IsCHKED, IsPASS,  IsAPRVD.
FORMATS IsDRAWN, IsTRACED, IsCHKED, IsPASS,  IsAPRVD (F2).
MISSING VALUES
        IsDRAWN, IsTRACED, IsCHKED, IsPASS,  IsAPRVD (9).
LIST.

List
|-----------------------------|---------------------------|
|Output Created               |09-MAY-2010 23:29:05       |
|-----------------------------|---------------------------|
                                                          IsD
CA                                                        RAW IsTR IsCH IsP IsAP
SE SIGDRAWN   SIGTRACED  SIGCHKED   SIGPASS    SIGAPRVD   N   ACED KED  ASS RVD

 1 Present    Annotation C-CHECKED  Present    [No Data]    1   1    1    1   0
 2 Present    [No Data]  C-CHECKED  Present    [No Data]    1   0    1    1   0
 3 Present    [No Data]  C-CHECKED  Present    [No Data]    1   0    1    1   0
 4 Present    Present    [No Data]  Present    Present      1   1    0    1   1
 5 Present    Present    Present    Present    Present      1   1    1    1   1

Number of cases read:  5    Number of cases listed:  5

 
*  EXAMPLE QUERY 1: Show a number for the total number of cases      .
*  which contain any data value (excluding [No Data]) in [all of]    .
*  columns SIGTRACED,SIGCHKED,SIGPASS                                .

COMPUTE   Qry1 = IsTRACED  AND  IsCHKED  AND IsPASS.


*  EXAMPLE QUERY 2: Show a number for the total number of cases      .
*  which contain any data value (excluding [No Data]) in [all of]    .
*  columns SIGDRAWN,SIGCHKED,SIGPASS                                 .

COMPUTE   Qry2 = IsDRAWN  AND  IsCHKED  AND  IsPASS.

FORMATS   Qry1 Qry2 (F2).

LIST  CASE,
      IsDRAWN, IsTRACED, IsCHKED, IsPASS,  IsAPRVD,
      Qry1,    Qry2.

List
|-----------------------------|---------------------------|
|Output Created               |09-MAY-2010 23:29:05       |
|-----------------------------|---------------------------|
CASE IsDRAWN IsTRACED IsCHKED IsPASS IsAPRVD Qry1 Qry2

  1      1       1        1      1       0     1    1
  2      1       0        1      1       0     0    1
  3      1       0        1      1       0     0    1
  4      1       1        0      1       1     0    0
  5      1       1        1      1       1     1    1

Number of cases read:  5    Number of cases listed:  5

 
DESCRIPTIVES
   VARIABLES =Qry1 Qry2
  /STATISTICS=SUM .

Descriptives
|-----------------------------|---------------------------|
|Output Created               |09-MAY-2010 23:29:05       |
|-----------------------------|---------------------------|
Descriptive Statistics
|---------------|-|---|
|               |N|Sum|
|---------------|-|---|
|Qry1           |5|2  |
|---------------|-|---|
|Qry2           |5|4  |
|---------------|-|---|
|Valid N        |5|   |
|(listwise)     | |   |
|---------------|-|---|
=============================
APPENDIX: Test data, and code
=============================
DATA LIST LIST (',')/
    CASE,SIGDRAWN,SIGTRACED,SIGCHKED,SIGPASS,SIGAPRVD
   (F2,  5A10).
BEGIN DATA  
    1,Present,Annotation,C-CHECKED,Present,[No Data]
    2,Present,[No Data],C-CHECKED,Present,[No Data]
    3,Present,[No Data],C-CHECKED,Present,[No Data]
    4,Present,Present,[No Data],Present,Present
    5,Present,Present,Present,Present,Present
END DATA.
LIST.

RECODE  SIGDRAWN,SIGTRACED,SIGCHKED,SIGPASS,SIGAPRVD
    ('Present','Annotation','C-CHECKED' = 1)
    ('       ','[No Data]'              = 0)
    (ELSE                               = 9)
   INTO IsDRAWN, IsTRACED, IsCHKED, IsPASS,  IsAPRVD.
FORMATS IsDRAWN, IsTRACED, IsCHKED, IsPASS,  IsAPRVD (F2).
MISSING VALUES
        IsDRAWN, IsTRACED, IsCHKED, IsPASS,  IsAPRVD (9).
LIST.

*  EXAMPLE QUERY 1: Show a number for the total number of cases      .
*  which contain any data value (excluding [No Data]) in [all of]    .
*  columns SIGTRACED,SIGCHKED,SIGPASS                                .

COMPUTE   Qry1 = IsTRACED  AND  IsCHKED  AND IsPASS.


*  EXAMPLE QUERY 2: Show a number for the total number of cases      .
*  which contain any data value (excluding [No Data]) in [all of]    .
*  columns SIGDRAWN,SIGCHKED,SIGPASS                                 .

COMPUTE   Qry2 = IsDRAWN  AND  IsCHKED  AND  IsPASS.

FORMATS   Qry1 Qry2 (F2).

LIST  CASE,
      IsDRAWN, IsTRACED, IsCHKED, IsPASS,  IsAPRVD,
      Qry1,    Qry2.

DESCRIPTIVES
   VARIABLES =Qry1 Qry2
  /STATISTICS=SUM .

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: Counting data in multiple columns - Clarification

Art Kendall
In reply to this post by Paul Sillitoe
Two other ways to do this is to use autorecode with a /group specification. You can specify what value indicates missing on the original file or you can specify what values indicate missing after doing the autorecode.
A string variable can have a missing value specified if the value is 8 bytes or less.
If the values are entered by hand there could be typos and the /print from autorecode could help find typos.


data list list/ CASE(f2)  SIGDRAWN    SIGTRACED   SIGCHKED    SIGPASS     SIGAPRVD (5a12).
begin data
1     Present     Annotation  C-CHECKED   Present     [noData]
2     Present     [noData]   C-CHECKED   Present     [noData]
3     Present     [noData]   C-CHECKED   Present     [noData]
4     Present     Present     [noData]   Present     Present
5     Present     Present     Present     Present     Present
end data.
* if the value that indicates missing data in the original input is 8 bytes or less put the missing values command here.
missing values SIGDRAWN    SIGTRACED   SIGCHKED    SIGPASS     SIGAPRVD ('[noData]').
autorecode
     variables = SIGTRACED, SIGCHKED,SIGPASS
     /into flagtraced flagchked flagpass
    /group
    /print.
*alternatively you could look at the autorecode statement and put a missing values command here.
* missing values
flagtraced flagchked flagpass(4).
compute passtest = nvalid(flagtraced,flagchked,flagpass) eq 3.
list.


Art Kendall
Social Research Consultants

On 5/8/2010 7:05 AM, Paul Sillitoe wrote:

Thank you for the initial replies to my query; here are some data from my 400-case, 60-variable dataset, which will hopefully better explain what I wish to do:

 

CASE  SIGDRAWN    SIGTRACED   SIGCHKED    SIGPASS     SIGAPRVD

1     Present     Annotation  C-CHECKED   Present     [No Data]

2     Present     [No Data]   C-CHECKED   Present     [No Data]

3     Present     [No Data]   C-CHECKED   Present     [No Data]

4     Present     Present     [No Data]   Present     Present

5     Present     Present     Present     Present     Present

 

The same data in comma-delimited form:

 

CASE,SIGDRAWN,SIGTRACED,SIGCHKED,SIGPASS,SIGAPRVD

1,Present,Annotation,C-CHECKED,Present,[No Data],2,Present,[No Data],C-CHECKED,Present,[No Data],3,Present,[No Data],C-CHECKED,Present,[No Data],4,Present,Present,[No Data],Present,Present,5,Present,Present,Present,Present,Present

 

 

EXAMPLE QUERY 1: Show a number for the total number of cases which contain any data value (excluding [No Data]) in columns, SIGTRACED,SIGCHKED,SIGPASS

 

SOLUTION 1: Cases 1,5 meet those criteria.

Total number of cases therefore = 2 (Required result)

 

 

EXAMPLE QUERY 2: Show a number for the total number of cases which contain any data value (excluding [No Data]) in columns SIGDRAWN,SIGCHKED,SIGPASS

 

SOLUTION 2: Cases 1,2,3,5 meet those criteria.

Total number of cases therefore = 4 (Required result)

 

I anticipate that a procedure might need to use some simple syntax, rather than just a COUNT across columns, because of the different data values that are to be included.

 

Thank you.

 

Paul S.

 

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Art Kendall
Social Research Consultants