Fwd: Complex logic assistance please

classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

Fwd: Complex logic assistance please

Mark Webb-5

Regards

Mark Webb

WhatsApp +27 (72) 199 1000 [Good Reception]
Cell +27 (72) 199 1000 [Poor Reception]
Fax to email +27 (86) 5513075
Skype  tomarkwebb
Email  [hidden email]
No landline


---------- Forwarded message ---------
From: Mark Webb <[hidden email]>
Date: Tue, 28 Jan 2020 at 13:31
Subject: Complex logic assistance please
To: Mark Webb <[hidden email]>


Format of data is -
2 Variables - Record number [can be more than 1 line per respondent], Brand used
Want to compute n_brands [how many unique brands in customer repertoire], n_shops [how many times they have shopped [I can get this via summarise - n_breaks.]

Record Brand n_brands n_shops
1           a           2           3
1           a           2           3
1           b           2           3
2           a           1           2
2           a           1           2
3           b           1           1

Record 1 has 3 visits, 2 different brands bought - not loyal
Record 2 has 2 visits, 1 brand bought - loyal
Record 3 has 1 visit only - no repertoire

I'm trying to compute the n_brands column i.e. the number of unique brands by record number.

Regards

Mark Webb

WhatsApp +27 (72) 199 1000 [Good Reception]
Cell +27 (72) 199 1000 [Poor Reception]
Fax to email +27 (86) 5513075
Skype  tomarkwebb
Email  [hidden email]
No landline
===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: Complex logic assistance please

Maguin, Eugene

I think this is a double aggregate problem.

First.

Aggregate outfile=*/break=record brand/brands=nu.

This gives:

Record Brand brands

1           a           2

1           b           1

2           a           2

3           b           1

 

Second.

Aggregate outfile=*/break=record/n_brands=nu

This gives:

Record n_brands

1            2

2            1

3            1

 

I think this might not be satisfying because you lose the brand variable. You can recover this by doing a match files with the second file being the table and matching by record.

 

Gene Maguin

 

From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Mark Webb
Sent: Tuesday, January 28, 2020 6:32 AM
To: [hidden email]
Subject: Fwd: Complex logic assistance please

 


Regards

Mark Webb

WhatsApp +27 (72) 199 1000 [Good Reception]
Cell +27 (72) 199 1000 [Poor Reception]
Fax to email +27 (86) 5513075
Skype  tomarkwebb
Email  [hidden email]

No landline

 

---------- Forwarded message ---------
From: Mark Webb <[hidden email]>
Date: Tue, 28 Jan 2020 at 13:31
Subject: Complex logic assistance please
To: Mark Webb <[hidden email]>

 

Format of data is -

2 Variables - Record number [can be more than 1 line per respondent], Brand used

Want to compute n_brands [how many unique brands in customer repertoire], n_shops [how many times they have shopped [I can get this via summarise - n_breaks.]

 

Record Brand n_brands n_shops

1           a           2           3

1           a           2           3

1           b           2           3

2           a           1           2

2           a           1           2

3           b           1           1

 

Record 1 has 3 visits, 2 different brands bought - not loyal

Record 2 has 2 visits, 1 brand bought - loyal

Record 3 has 1 visit only - no repertoire

 

I'm trying to compute the n_brands column i.e. the number of unique brands by record number.

 

Regards


Mark Webb

WhatsApp +27 (72) 199 1000 [Good Reception]
Cell +27 (72) 199 1000 [Poor Reception]
Fax to email +27 (86) 5513075
Skype  tomarkwebb
Email  [hidden email]

No landline

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: Complex logic assistance please

Mark Webb-5
Thanks for this. I will look at and let you know if any good.
In the mean time I found this - which work well.
Regards

Mark Webb

WhatsApp +27 (72) 199 1000 [Good Reception]
Cell +27 (72) 199 1000 [Poor Reception]
Fax to email +27 (86) 5513075
Skype  tomarkwebb
Email  [hidden email]
No landline


On Tue, 28 Jan 2020 at 16:04, Maguin, Eugene <[hidden email]> wrote:

I think this is a double aggregate problem.

First.

Aggregate outfile=*/break=record brand/brands=nu.

This gives:

Record Brand brands

1           a           2

1           b           1

2           a           2

3           b           1

 

Second.

Aggregate outfile=*/break=record/n_brands=nu

This gives:

Record n_brands

1            2

2            1

3            1

 

I think this might not be satisfying because you lose the brand variable. You can recover this by doing a match files with the second file being the table and matching by record.

 

Gene Maguin

 

From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Mark Webb
Sent: Tuesday, January 28, 2020 6:32 AM
To: [hidden email]
Subject: Fwd: Complex logic assistance please

 


Regards

Mark Webb

WhatsApp +27 (72) 199 1000 [Good Reception]
Cell +27 (72) 199 1000 [Poor Reception]
Fax to email +27 (86) 5513075
Skype  tomarkwebb
Email  [hidden email]

No landline

 

---------- Forwarded message ---------
From: Mark Webb <[hidden email]>
Date: Tue, 28 Jan 2020 at 13:31
Subject: Complex logic assistance please
To: Mark Webb <[hidden email]>

 

Format of data is -

2 Variables - Record number [can be more than 1 line per respondent], Brand used

Want to compute n_brands [how many unique brands in customer repertoire], n_shops [how many times they have shopped [I can get this via summarise - n_breaks.]

 

Record Brand n_brands n_shops

1           a           2           3

1           a           2           3

1           b           2           3

2           a           1           2

2           a           1           2

3           b           1           1

 

Record 1 has 3 visits, 2 different brands bought - not loyal

Record 2 has 2 visits, 1 brand bought - loyal

Record 3 has 1 visit only - no repertoire

 

I'm trying to compute the n_brands column i.e. the number of unique brands by record number.

 

Regards


Mark Webb

WhatsApp +27 (72) 199 1000 [Good Reception]
Cell +27 (72) 199 1000 [Poor Reception]
Fax to email +27 (86) 5513075
Skype  tomarkwebb
Email  [hidden email]

No landline

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: Complex logic assistance please

Bruce Weaver
Administrator
The solution on that support page seems rather complicated.  I agree with
Gene that AGGREGATE is called for here, but think it can be done with a
single AGGREGATE.  It's not clear to me if you want to end up with a file
that has one row per each value of variable Record, or if you want to retain
the multiple rows per value of Record. This example does the latter.  But
you could easily tweak it to go the former.

DATA LIST LIST / Record(F2.0) Brand(A1).
BEGIN DATA
1           a          
1           a          
1           b          
2           a          
2           a          
3           b          
END DATA.

AUTORECODE VARIABLES=Brand /INTO BrandNum.

  AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=Record
  /n_brands=MAX(BrandNum)
  /n_shops=NU.

* Keep only the first case for each value of Record.
COMPUTE FirstRec = ($CASENUM EQ 1) OR (Record NE LAG(Record)).
FORMATS FirstRec(F1).

TEMPORARY.
SELECT IF FirstRec.
LIST Record n_shops n_brands.

OUTPUT from the last LIST command:

Record n_shops n_brands
 
   1         3     2
   2         2     1
   3         1     2
 
Number of cases read:  3    Number of cases listed:  3


Use DELETE VARIABLES to get rid of BrandNum if you don't want it.  

HTH.


Mark Webb-5 wrote

> Thanks for this. I will look at and let you know if any good.
> In the mean time I found this - which work well.
> https://www.ibm.com/support/pages/how-count-distinct-values-variable-and-compute-variable-frequency-distincts-id
>
> Regards
>
> Mark Webb
>
> WhatsApp +27 (72) 199 1000 [Good Reception]
> Cell +27 (72) 199 1000 [Poor Reception]
> Fax to email +27 (86) 5513075
> Skype  tomarkwebb
> Email  

> targetlinkmark@

> No landline
>
>
> On Tue, 28 Jan 2020 at 16:04, Maguin, Eugene &lt;

> emaguin@

> &gt; wrote:
>
>> I think this is a double aggregate problem.
>>
>> First.
>>
>> Aggregate outfile=*/break=record brand/brands=nu.
>>
>> This gives:
>>
>> Record Brand brands
>>
>> 1           a           2
>>
>> 1           b           1
>>
>> 2           a           2
>>
>> 3           b           1
>>
>>
>>
>> Second.
>>
>> Aggregate outfile=*/break=record/n_brands=nu
>>
>> This gives:
>>
>> Record n_brands
>>
>> 1            2
>>
>> 2            1
>>
>> 3            1
>>
>>
>>
>> I think this might not be satisfying because you lose the brand variable.
>> You can recover this by doing a match files with the second file being
>> the
>> table and matching by record.
>>
>>
>>
>> Gene Maguin
>>
>>
>>
>> *From:* SPSSX(r) Discussion [mailto:

> SPSSX-L@.UGA

> ] *On Behalf
>> Of *Mark Webb
>> *Sent:* Tuesday, January 28, 2020 6:32 AM
>> *To:*

> SPSSX-L@.UGA

>> *Subject:* Fwd: Complex logic assistance please
>>
>>
>>
>>
>> Regards
>>
>> Mark Webb
>>
>> WhatsApp +27 (72) 199 1000 [Good Reception]
>> Cell +27 (72) 199 1000 [Poor Reception]
>> Fax to email +27 (86) 5513075
>> Skype  tomarkwebb
>> Email  

> targetlinkmark@

>>
>> No landline
>>
>>
>>
>> ---------- Forwarded message ---------
>> From: *Mark Webb* &lt;

> targetlinkmark@

> &gt;
>> Date: Tue, 28 Jan 2020 at 13:31
>> Subject: Complex logic assistance please
>> To: Mark Webb &lt;

> targetlinkmark@

> &gt;
>>
>>
>>
>> Format of data is -
>>
>> 2 Variables - Record number [can be more than 1 line per respondent],
>> Brand used
>>
>> Want to compute n_brands [how many unique brands in customer repertoire],
>> n_shops [how many times they have shopped [I can get this via summarise -
>> n_breaks.]
>>
>>
>>
>> Record Brand n_brands n_shops
>>
>> 1           a           2           3
>>
>> 1           a           2           3
>>
>> 1           b           2           3
>>
>> 2           a           1           2
>>
>> 2           a           1           2
>>
>> 3           b           1           1
>>
>>
>>
>> Record 1 has 3 visits, 2 different brands bought - not loyal
>>
>> Record 2 has 2 visits, 1 brand bought - loyal
>>
>> Record 3 has 1 visit only - no repertoire
>>
>>
>>
>> I'm trying to compute the n_brands column i.e. the number of unique
>> brands
>> by record number.
>>
>>
>>
>> Regards
>>
>>
>> Mark Webb
>>
>> WhatsApp +27 (72) 199 1000 [Good Reception]
>> Cell +27 (72) 199 1000 [Poor Reception]
>> Fax to email +27 (86) 5513075
>> Skype  tomarkwebb
>> Email  

> targetlinkmark@

>>
>> No landline
>>
>> ===================== To manage your subscription to SPSSX-L, send a
>> message to

> LISTSERV@.UGA

>  (not to SPSSX-L), with no body text
>> except the command. To leave the list, send the command SIGNOFF SPSSX-L
>> For
>> a list of commands to manage subscriptions, send the command INFO REFCARD
>>
>
> =====================
> To manage your subscription to SPSSX-L, send a message to

> LISTSERV@.UGA

>  (not to SPSSX-L), with no body text except the
> command. To leave the list, send the command
> SIGNOFF SPSSX-L
> For a list of commands to manage subscriptions, send the command
> INFO REFCARD





-----
--
Bruce Weaver
[hidden email]
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

NOTE: My Hotmail account is not monitored regularly.
To send me an e-mail, please use the address shown above.

--
Sent from: http://spssx-discussion.1045642.n5.nabble.com/

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

NOTE: My Hotmail account is not monitored regularly.
To send me an e-mail, please use the address shown above.
Reply | Threaded
Open this post in threaded view
|

Re: Complex logic assistance please

Mark Webb-5
In reply to this post by Maguin, Eugene
I have [maybe] found a simple solution to this. But not being an accomplished syntax writer I would like to hear from those out there who are. This works - but is it robust - will it always work? Why is this solution not in the domain when it is so simple compared to the other solutions with many steps and macros.


Data List List /Recno (F1.0) Site (A2).
Begin Data
1 aa
1 aa
1 bb
2 aa
2 bb
2 bb
2 cc
3 gg
3 gg
3 gg
4 ss
End data.
List.
* Convert Recno numeric into a string.
String RecnoS (A1).
compute RecnoS = string(Recno, F1.0).
exe.
* Concat Recno + Site.
String Join (A3).
Compute Join = concat(RecnoS,Site).
exe.
* Flag Distincts.
If (Join NE LAG(Join)) Distinct = 1.
exe.
* Aggregate on Recno - Sum Distinct and get N_Break.
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=Recno
  /Distinct_sum=SUM(Distinct)
  /N_BREAK=N.


Regards

Mark Webb

WhatsApp +27 (72) 199 1000 [Good Reception]
Cell +27 (72) 199 1000 [Poor Reception]
Fax to email +27 (86) 5513075
Skype  tomarkwebb
Email  [hidden email]
No landline


On Tue, 28 Jan 2020 at 16:04, Maguin, Eugene <[hidden email]> wrote:

I think this is a double aggregate problem.

First.

Aggregate outfile=*/break=record brand/brands=nu.

This gives:

Record Brand brands

1           a           2

1           b           1

2           a           2

3           b           1

 

Second.

Aggregate outfile=*/break=record/n_brands=nu

This gives:

Record n_brands

1            2

2            1

3            1

 

I think this might not be satisfying because you lose the brand variable. You can recover this by doing a match files with the second file being the table and matching by record.

 

Gene Maguin

 

From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Mark Webb
Sent: Tuesday, January 28, 2020 6:32 AM
To: [hidden email]
Subject: Fwd: Complex logic assistance please

 


Regards

Mark Webb

WhatsApp +27 (72) 199 1000 [Good Reception]
Cell +27 (72) 199 1000 [Poor Reception]
Fax to email +27 (86) 5513075
Skype  tomarkwebb
Email  [hidden email]

No landline

 

---------- Forwarded message ---------
From: Mark Webb <[hidden email]>
Date: Tue, 28 Jan 2020 at 13:31
Subject: Complex logic assistance please
To: Mark Webb <[hidden email]>

 

Format of data is -

2 Variables - Record number [can be more than 1 line per respondent], Brand used

Want to compute n_brands [how many unique brands in customer repertoire], n_shops [how many times they have shopped [I can get this via summarise - n_breaks.]

 

Record Brand n_brands n_shops

1           a           2           3

1           a           2           3

1           b           2           3

2           a           1           2

2           a           1           2

3           b           1           1

 

Record 1 has 3 visits, 2 different brands bought - not loyal

Record 2 has 2 visits, 1 brand bought - loyal

Record 3 has 1 visit only - no repertoire

 

I'm trying to compute the n_brands column i.e. the number of unique brands by record number.

 

Regards


Mark Webb

WhatsApp +27 (72) 199 1000 [Good Reception]
Cell +27 (72) 199 1000 [Poor Reception]
Fax to email +27 (86) 5513075
Skype  tomarkwebb
Email  [hidden email]

No landline

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: Complex logic assistance please

Bruce Weaver
Administrator
Mark, does this achieve what you want?  It's similar to the code I posted
previously, but with MAX replaced by SUM in the AGGREGATE command.  (I
discovered that MAX did not work properly with the new sample data you
supplied.)  

Data List List /Recno (F1.0) Site (A2).
Begin Data
1 aa
1 aa
1 bb
2 aa
2 bb
2 bb
2 cc
3 gg
3 gg
3 gg
4 ss
End data.

SORT CASES by Recno Site.
* Flag first row for each unique value of Recno.
COMPUTE RecnoFirst = ($CASENUM EQ 1) OR (Recno NE LAG(Recno)).
* Flag first row for each unique value of Site within Recno.
COMPUTE SiteFirst = RecnoFirst OR (Recno EQ LAG(Recno) AND Site NE
LAG(Site)).
FORMATS RecnoFirst SiteFirst (F1).

* Let n_break = number of rows within each unique value of Recno;
* Let n_sites = the number of unique sites within each value of Recno.
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=Recno
  /n_break=NU
  /n_sites=SUM(SiteFirst).

FORMATS n_sites(F2.0).

TEMPORARY.
SELECT IF RecnoFirst.
LIST Recno n_break n_sites.


OUTPUT from LIST:

Recno n_break n_sites
 
  1         3     2
  2         4     3
  3         3     1
  4         1     1




Mark Webb-5 wrote

> I have [maybe] found a simple solution to this. But not being an
> accomplished syntax writer I would like to hear from those out there who
> are. This works - but is it robust - will it always work? Why is this
> solution not in the domain when it is so simple compared to the other
> solutions with many steps and macros.
>
>
> Data List List /Recno (F1.0) Site (A2).
> Begin Data
> 1 aa
> 1 aa
> 1 bb
> 2 aa
> 2 bb
> 2 bb
> 2 cc
> 3 gg
> 3 gg
> 3 gg
> 4 ss
> End data.
> List.
> * Convert Recno numeric into a string.
> String RecnoS (A1).
> compute RecnoS = string(Recno, F1.0).
> exe.
> * Concat Recno + Site.
> String Join (A3).
> Compute Join = concat(RecnoS,Site).
> exe.
> * Flag Distincts.
> If (Join NE LAG(Join)) Distinct = 1.
> exe.
> * Aggregate on Recno - Sum Distinct and get N_Break.
> AGGREGATE
>   /OUTFILE=* MODE=ADDVARIABLES
>   /BREAK=Recno
>   /Distinct_sum=SUM(Distinct)
>   /N_BREAK=N.
>
>
> Regards
>
> Mark Webb





-----
--
Bruce Weaver
[hidden email]
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

NOTE: My Hotmail account is not monitored regularly.
To send me an e-mail, please use the address shown above.

--
Sent from: http://spssx-discussion.1045642.n5.nabble.com/

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

NOTE: My Hotmail account is not monitored regularly.
To send me an e-mail, please use the address shown above.