# Repeated measures analysis of fractions summing to a constant

17 messages

## Repeated measures analysis of fractions summing to a constant

Suppose you have a between-within design: several between-subject groups and several (3 or more) repeated-measures (= within-subject) trials. It's all very classic and typical. The nuance, however, is that the values for every subject sum across the repeated levels to a **constant**. This is because the data are complementary, i.e. percentages or fractions, so in this case they sum to 100 for every individual. For example, with 3 RM levels, one respondent's data might be 30%, 22%, 48% (sum=100) and another's 25%, 33%, 42% (sum=100).

I know that I can analyze between-groups X repeated-measures count data via the Generalized Estimating Equations procedure. But I have doubts in this case because the values *sum to a constant*: they are complementary fractions, not counts of successes in repeated independent trials!

Can I analyze such data in SPSS, and how? Thanks.

## Re: Repeated measures analysis of fractions summing to a constant

Hello Kirill. This is not a direct answer to your question. I'm just pointing to a thread from a couple of years ago that addressed the same question. One of my posts in it gives a couple of references that may be of interest to you. Both of them suggest that ANOVA generally works quite well with "ipsative" data (or "allocated observations"). You can see the relevant messages here: http://listserv.uga.edu/cgi-bin/wa?A2=ind1101&L=spssx-l&P=36237

HTH.

-- Bruce Weaver

## Re: Repeated measures analysis of fractions summing to a constant

There is a literature on "compositional data" which probably will be helpful. Years ago, I found Aitchison to be readable.

I have no idea whether it will work for your model, but I will mention that you escape the absolute linear dependency if you represent each fraction as its log-odds, like log(25/75) in place of 25%.

-- Rich Ulrich
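Editor's illustration of the log-odds suggestion, as a minimal Python sketch rather than SPSS syntax. It takes the two example respondents from the original question and shows that the raw percentages always sum to 100, while the per-share log-odds do not sum to a common constant, so the exact linear dependency is gone. The helper name `logit` is an assumption of this sketch, not something from the thread. The shares are still tied together nonlinearly, and a share of exactly 0% or 100% has no finite log-odds, so this only illustrates the transformation and does not settle whether the resulting model is appropriate.

```python
import math


def logit(share):
    """Log-odds of a share given as a proportion strictly between 0 and 1."""
    return math.log(share / (1.0 - share))


# The two example respondents from the original question (percentages).
respondents = [(30, 22, 48), (25, 33, 42)]

# Raw shares: every row sums to the same constant (100).
raw_sums = [sum(shares) for shares in respondents]

# Log-odds of each share, e.g. 25% becomes log(25/75).
logit_rows = [[logit(s / 100.0) for s in shares] for shares in respondents]
logit_sums = [sum(row) for row in logit_rows]

for shares, row, total in zip(respondents, logit_rows, logit_sums):
    print(shares, [round(v, 3) for v in row], "sum of logits:", round(total, 3))
```

The printed sums of logits differ between the two respondents, which is the sense in which the transformed values no longer satisfy a fixed linear constraint.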



## Re: Repeated measures analysis of fractions summing to a constant

The Wikipedia article on "ipsative" tells me that my own use of that term falls under the third type that they mention, where educators may standardize the scores for an individual based only on that individual's previous scores. It seems that you are apt to find several different uses under "ipsative" in addition to the one that resembles "compositional".

-- Rich Ulrich

> Date: Thu, 4 Apr 2013 11:42:15 -0700
> Subject: Re: Repeated measures analysis of fractions summing to a constant
>
> Judging from what I see on the Wikipedia page (http://en.wikipedia.org/wiki/Compositional_data), "compositional data" is another name for what Shaffer called "allocated observations" and Greer & Dunlap called "ipsative data". But it also looks like there are two sets of literature that do not overlap all that much.

## Re: Repeated measures analysis of fractions summing to a constant

Would the OP mind explaining exactly what the DV is? It might help shed light on how to proceed, notwithstanding other interesting solutions.

Speaking of which, the idea of converting the probs to logits is intriguing. I am generally in favor of the logit scale because of its properties, but the fact that it addresses the linear dependence is an added bonus here.

Anyway, I would appreciate it if the OP would be willing to tell us more about the DV.

Ryan

## Re: Repeated measures analysis of fractions summing to a constant

Ryan,

For example, the DV might be the question "How do you spend your typical day?":

- Work __% of time
- Meals __% of time
- Stroll __% of time
- Comp/TV/Reading __% of time
- Else __% of time

[Please check that your answers sum to 100%]

Converting to logits might be an interesting idea, although not necessarily the most correct one. But I wonder whether SPSS (GEE or another procedure) has already foreseen this and provides tools (a reference distribution plus a link function) made exactly for a DV that consists of fractions summing to a constant, since such a DV isn't uncommon.

## Re: Repeated measures analysis of fractions summing to a constant

Kirill,

This is a question that has come up on Cross Validated a few times; see here for an example: http://stats.stackexchange.com/q/24187/1036. A frequent recommendation seems to be a Stata library by the name of dirifit (see http://maartenbuis.nl/software/dirifit.html) or an analogous R library, DirichletReg. I do not know if the current GENLIN procedure can be wrangled to produce the same model.

A quick perusal of some of the materials floating around the web related to said packages suggests that a quick and dirty way is to fit separate beta regression models for each of the subsets, although that doesn't constrain the total to be 1. (Smithson & Verkuilen (2006), "A Better Lemon Squeezer", has supplementary material on how to fit beta regression models in SPSS.)

Count data models are not appropriate here because of the ceiling effect. You can look up ways around that (like censored Poisson regression or Tobit models), but those ignore the compositional nature of the data here. Another suggestion on the CV site recommends multinomial models. I see the relationship, but I don't quite understand how you turn this into discrete outcomes to feed into a multinomial logistic regression.

Looks like you will have some (hopefully fun) reading to do to sort through all these disparate recommendations!

Andy W

## Re: Repeated measures analysis of fractions summing to a constant

Correspondence analysis is designed for compositional data.

Art Kendall
Social Research Consultants

## Re: Repeated measures analysis of fractions summing to a constant

Thank you for all the answers that have come in so far. I haven't read them carefully yet. But here is what meanwhile came to my own mind after a little meditation. It is very simple: I now think (PLEASE correct me if I'm mistaken!) that there is no problem at all. The constraint that the repeated measures sum to a constant within individuals *does not* rule out using the common RM-ANOVA model. As long as the ANOVA distributional and sphericity assumptions hold, no need for GEE or other procedures arises at all.

Let's have some data: a between-subject grouping factor GROUP and a within-subject factor RM with 3 levels summing up to a constant (100).

| group | rm1 | rm2 | rm3 | sum |
|-------|-----|-----|-----|-----|
| 1 | 50 | 30 | 20 | 100 |
| 1 | 24 | 42 | 34 | 100 |
| 1 | 34 | 16 | 50 | 100 |
| 1 | 61 | 28 | 11 | 100 |
| 1 | 46 | 46 | 8 | 100 |
| 1 | 23 | 18 | 59 | 100 |
| 2 | 55 | 22 | 23 | 100 |
| 2 | 27 | 39 | 34 | 100 |
| 2 | 44 | 36 | 20 | 100 |
| 2 | 28 | 40 | 32 | 100 |

Run the usual repeated-measures ANOVA:

    GLM rm1 rm2 rm3 BY group
      /WSFACTOR= rm 3
      /METHOD= SSTYPE(3)
      /WSDESIGN= rm
      /DESIGN= group.

Summing up to a constant just means that, upon collapsing the RM levels, all respondents appear to be the same: there exists no between-subject variation at all, or in other words, the "respondent ID" factor's effect is zero. Hence, in the table "Tests of Between-Subjects Effects" the Error term is zero. Also, the effect of the GROUP factor is zero too, of course, because the constant sum (100) in our data is the same for both groups 1 and 2.

Now, I'd ask you, are these results invalid in any way? Do we say that ANOVA is misused when the error variation (which is left unexplained) is zero? I would not say so, and so RM-ANOVA *is* an appropriate method for fractions (i.e. values summing up to a constant). If I'm wrong, please explain to me why.
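Editor's illustration of the "between-subjects variation is zero" observation, as a minimal Python sketch that does not reproduce the SPSS GLM output. Using the ten rows posted above, it computes each subject's mean over the three RM levels and the usual between-subjects sums of squares by hand. The variable names (`ss_group`, `ss_error_between`) are choices made for this sketch. It only checks the arithmetic behind the claim and says nothing about whether the within-subjects tests are valid.

```python
from statistics import fmean

# (group, rm1, rm2, rm3) rows copied from the post above.
data = [
    (1, 50, 30, 20), (1, 24, 42, 34), (1, 34, 16, 50),
    (1, 61, 28, 11), (1, 46, 46, 8), (1, 23, 18, 59),
    (2, 55, 22, 23), (2, 27, 39, 34), (2, 44, 36, 20),
    (2, 28, 40, 32),
]
n_levels = 3

# Collapse the RM levels: one mean per subject.
subject_means = [(g, (a + b + c) / n_levels) for g, a, b, c in data]

# Mean of the subject means within each group, and overall.
group_means = {
    g: fmean([m for gg, m in subject_means if gg == g])
    for g in sorted({g for g, _ in subject_means})
}
grand_mean = fmean([m for _, m in subject_means])

# Between-subjects sums of squares (each subject contributes n_levels scores).
ss_group = n_levels * sum(
    (group_means[g] - grand_mean) ** 2 for g, _ in subject_means
)
ss_error_between = n_levels * sum(
    (m - group_means[g]) ** 2 for g, m in subject_means
)

print("subject means:", sorted({round(m, 4) for _, m in subject_means}))
print("SS group:", round(ss_group, 10))
print("SS between-subjects error:", round(ss_error_between, 10))
```

Because every subject's mean is 100/3, both sums of squares come out as zero up to floating-point rounding, so the between-subjects F ratio has a zero numerator and a zero denominator.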

## Re: Repeated measures analysis of fractions summing to a constant

No, you are a bit wrong in concluding that there is no problem. If you think of the situation of dummy variables, you have provided an "extra" dummy, like entering dichotomies for both Male and Female. There is redundancy. There is over-parameterization. There is, somewhere, the loss of one d.f. for RM when you perform any analysis. A "fixed" zero-effect is not the same as a randomly occurring near-zero-effect.

You retain full information (in the statistical sense) if you set up your model to leave out one of the categories, just as one would for any dummy coding. The others will be most "independent" if you omit the category that has the greatest variance. The drawback might lie in the ease of interpreting your results.

-- Rich Ulrich
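Editor's illustration of the redundancy argument, as a minimal Python sketch using the ten data rows posted earlier in the thread. It checks that rm3 is fully determined by rm1 and rm2, that the 3x3 covariance matrix of the three measures is singular (each row sums to zero and the determinant is zero up to rounding), and that dropping one category leaves a non-singular 2x2 matrix. The helper names (`covariance_matrix`, `det2`, `det3`) are assumptions of this sketch, and dropping rm3 here is arbitrary rather than the greatest-variance choice recommended above.

```python
from statistics import fmean

# (rm1, rm2, rm3) rows from the example data in this thread.
rows = [
    (50, 30, 20), (24, 42, 34), (34, 16, 50), (61, 28, 11), (46, 46, 8),
    (23, 18, 59), (55, 22, 23), (27, 39, 34), (44, 36, 20), (28, 40, 32),
]


def covariance_matrix(data):
    """Sample covariance matrix (n - 1 denominator) of the columns of data."""
    n, k = len(data), len(data[0])
    means = [fmean([row[j] for row in data]) for j in range(k)]
    return [
        [
            sum((row[i] - means[i]) * (row[j] - means[j]) for row in data) / (n - 1)
            for j in range(k)
        ]
        for i in range(k)
    ]


def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def det3(m):
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


# The third measure carries no new information: rm3 = 100 - rm1 - rm2.
redundant = all(c == 100 - a - b for a, b, c in rows)

cov_full = covariance_matrix(rows)                         # all three measures
cov_reduced = covariance_matrix([r[:2] for r in rows])     # rm3 left out

row_sums = [sum(r) for r in cov_full]
det_full = det3(cov_full)
det_reduced = det2(cov_reduced)

print("rm3 determined by rm1 and rm2:", redundant)
print("row sums of full covariance matrix:", [round(s, 8) for s in row_sums])
print("det of full matrix (zero up to rounding):", round(det_full, 6))
print("det of reduced matrix is positive:", det_reduced > 0)
```

The singular 3x3 matrix is the "extra dummy" in numerical form: three measures, but only two dimensions of information.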