# Testing statistical difference between medians of a sample and a subsample extracted from the sample

9 messages

## Testing statistical difference between medians of a sample and a subsample extracted from the sample

Hi all!

In a data analysis I am required to perform a (parametric) statistical test of whether the medians of two samples differ significantly, where one is the full sample and the other is a sub-sample extracted from the full sample based on a given characteristic (e.g. respondents belonging to a certain age group).

Can anyone suggest how to go about it in SPSS?

Regards,
vini

## Re: testing statistical difference between medians of a sample and a subsample extracted from the sample

"Parametric" and "median" don't usually go together. See (3) for a possible meaning. There are other problems.

First - There is no such thing as a "proper" test for a sub-sample versus the whole sample that it comes from. The necessary logic says that you compare a sub-sample to the *rest* of the sample.

You may occasionally see a good presentation that does use approximate tests of this sort, for convenience and ease, plus a strong desire to accommodate Ns that are unequal. For sub-samples with equal Ns, you can use a simple Confidence Interval around the overall mean.

Second - Almost nobody ever actually compares "medians". That description is less often accurate than it is an erroneous reference to a test of ranks.

Third - The most "non-parametric" way to put a Confidence Interval around the median of a single sample (full sample, here?) is to end up using ranks of scores in the sample to delimit the range. For instance, for a sample of a certain N, the 40th and 60th centiles might determine the scores to mark the 95% CI. There is no strong reason to expect that CI to be symmetrical around the median. If you wanted a "parametric" version of that, I suppose you would use the SD to determine a range. Do you want to pick out the samples whose medians do not fall in that range?

--
Rich Ulrich

> Date: Tue, 21 Aug 2012 01:48:46 -0700
> From: [hidden email]
> Subject: testing statistical difference between medians of a sample and a subsample extracted from the sample
> To: [hidden email]
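
The rank-based interval described in the third point above can be sketched as follows. This is a minimal illustration in Python rather than SPSS syntax, assuming a continuous variable without heavy ties; the function names `median_ci_ranks` and `median_ci` and the `scores` data are made up for the example. The ranks come from the Binomial(n, 0.5) distribution of the number of observations falling below the true median.

```python
from math import comb
from statistics import median


def median_ci_ranks(n, conf=0.95):
    """Return (lower_rank, upper_rank, exact_coverage) for a
    distribution-free confidence interval around the median.

    Ranks are 1-based positions in the sorted sample. The lower rank j is
    the largest one with P(B <= j - 1) <= alpha / 2, B ~ Binomial(n, 0.5).
    """
    alpha = 1.0 - conf
    tail = 0.0  # P(B <= j - 1) for the rank j accepted so far
    j = 0       # 0 means no rank has been accepted yet
    for k in range(n // 2):
        new_tail = tail + comb(n, k) / 2 ** n  # P(B <= k)
        if new_tail > alpha / 2:
            break
        tail = new_tail
        j = k + 1
    if j == 0:
        raise ValueError("sample too small for this confidence level")
    return j, n - j + 1, 1.0 - 2.0 * tail


def median_ci(data, conf=0.95):
    """Return (lower_value, upper_value, exact_coverage) for the median."""
    xs = sorted(data)
    lo_rank, hi_rank, coverage = median_ci_ranks(len(xs), conf)
    return xs[lo_rank - 1], xs[hi_rank - 1], coverage


# Hypothetical, right-skewed scores (illustration only)
scores = [12, 15, 11, 19, 14, 30, 13, 16, 18, 45]
low, high, coverage = median_ci(scores)
print(median(scores), low, high, round(coverage, 4))
```

With these ten made-up scores the interval runs from the 2nd to the 9th ordered value (12 to 30) around a median of 15.5, so it is clearly not symmetric. For n = 100 the same rule gives ranks 40 and 61, which is roughly the "40th and 60th centiles" mentioned in the post.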

## Re: testing statistical difference between medians of a sample and a subsample extracted from the sample

I too was wondering why you wanted a test comparing medians. People sometimes assume that the Wilcoxon-Mann-Whitney test (aka Mann-Whitney U) compares medians (as opposed to means). But that is only true if the two populations being compared are identical apart from a shift in location. And in that case, the test could be said to be comparing means, medians, or any other percentile point you might choose.

By the way, the WMW is quite sensitive to small differences in variance or skewness in the populations, which can cause it to reject H0 far too often when it is used purely as a test of differences in location. See for example the nice article by Fagerland & Sandvik (2009).

http://www.ncbi.nlm.nih.gov/pubmed/19247980

HTH.

--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

NOTE: My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
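
To make the point about ranks concrete, here is a small Python sketch (not SPSS syntax) of the U statistic that underlies the Wilcoxon-Mann-Whitney test. The function name `mann_whitney_u` and both data sets are invented for illustration. The two samples have the same median, yet U divided by the number of pairs, which estimates P(X < Y) + 0.5 P(X = Y), is well away from 0.5 because the samples differ in spread and skewness.

```python
from statistics import median


def mann_whitney_u(x, y):
    """Count pairs (xi, yj) with xi < yj, scoring ties as 0.5."""
    u = 0.0
    for xi in x:
        for yj in y:
            if xi < yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u


# Hypothetical samples with identical medians but different shapes
x = [4, 5, 6, 7, 8]
y = [5.5, 5.8, 6, 20, 30]

u = mann_whitney_u(x, y)
p_hat = u / (len(x) * len(y))  # estimate of P(X < Y) + 0.5 * P(X = Y)

print(median(x), median(y))  # both medians are 6
print(u, p_hat)              # U = 16.5 out of 25 pairs
```

The sketch only computes the statistic; it makes no claim about significance for samples this small. It is meant to show what the test responds to, not to replace the procedure in SPSS.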

## Re: testing statistical difference between medians of a sample and a subsample extracted from the sample

Hi:

1) Bruce, see also Anna Hart, "Mann-Whitney test is not just a test of medians: differences in spread can be important" (BMJ 2001;323:391-3). I lost track of a reference that stated the same for the Kruskal-Wallis test; I'll try to dig it up (too many files on my external hard disk).

2) Maybe vinikalra could compute a 95% CI for the median of the subsample and check whether the full-sample median falls within its limits.

Best regards,
Marta GG

On 21/08/2012 23:32, Bruce Weaver wrote:
> I too was wondering why you wanted a test comparing medians. [...]

## Re: testing statistical difference between medians of a sample and a subsample extracted from the sample

Thanks for your reply.

In light of the discussion above, it seems to me that testing the difference between the two samples' means would be a better idea, and in that case I can go for a t-test. (As the data are based on a field survey, they can 'safely' be considered normally distributed. Any comment?)

As far as the relevance of comparing a full sample and a sub-sample is concerned, the idea is to analyse whether a particular sub-sample (extracted based on a certain parameter, e.g. age group, education, etc.) has an influence on the full sample.

However, my question was how to go about it in SPSS (steps?), i.e. comparing the full sample and a sub-sample of the full sample. Any suggestions?

Regards,
vini
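
Following the earlier advice in this thread to compare a sub-sample with the *rest* of the sample rather than with the whole, the arithmetic of such a comparison can be sketched in Python. This is only an illustration under stated assumptions: the `records` data, the age cut-off of 30, and the function name `welch_t` are hypothetical, and the sketch returns the Welch t statistic and its approximate degrees of freedom without a p-value. In SPSS the analogous step would presumably be an independent-samples T-TEST on a 0/1 grouping variable, reading the "equal variances not assumed" row.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    na, nb = len(a), len(b)
    va = variance(a) / na  # squared standard error of mean(a)
    vb = variance(b) / nb  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Hypothetical survey records: (age, score)
records = [(22, 12), (25, 14), (28, 16), (35, 15),
           (41, 17), (52, 19), (60, 21)]

# Split into the sub-sample of interest and the rest, so that the two
# groups do not overlap (assumed cut-off: age below 30)
sub = [score for age, score in records if age < 30]
rest = [score for age, score in records if age >= 30]

t, df = welch_t(sub, rest)
print(round(t, 3), round(df, 2))
```

Splitting the data this way keeps the two groups independent, which the t-test requires and which a sub-sample versus full-sample comparison would violate.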

## Re: testing statistical difference between medians of a sample and a subsample extracted from the sample

Perhaps it is just an exercise to understand the properties of such methods, useful to convince ourselves of the theoretical appropriateness of it. :)

    BEGIN PROGRAM R.
    # read the active dataset; suppose the variable of interest is called x
    mydata <- spssdata.GetDataFromSPSS()
    fullsample <- mydata$x
    median.fullsample <- median(fullsample)
    # loop here if you want some kind of bootstrapping
    # replace [whatever condition] with a logical test on fullsample
    subsample <- subset(fullsample, [whatever condition])
    # one-sample t-test of the subsample mean against the full-sample median
    t.test(subsample, mu = median.fullsample)
    END PROGRAM.
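
One possible reading of the "loop if you want some kind of bootstrapping" comment is sketched below in Python. It is an assumption on my part, not the poster's stated method, and strictly speaking it is random subsampling without replacement rather than a classical bootstrap. The idea is to draw many random subsamples of the same size from the full sample and see how often their median lies at least as far from the full-sample median as the observed subsample's median does. The function name `subsample_median_pvalue`, its parameters, and the data are hypothetical.

```python
import random
from statistics import median


def subsample_median_pvalue(full, sub, n_draws=5000, seed=0):
    """Monte Carlo p-value for 'sub looks like a random draw from full',
    using the absolute distance between medians as the test statistic."""
    rng = random.Random(seed)
    full_median = median(full)
    observed = abs(median(sub) - full_median)
    hits = 0
    for _ in range(n_draws):
        draw = rng.sample(full, len(sub))  # random subsample, no replacement
        if abs(median(draw) - full_median) >= observed:
            hits += 1
    # add-one correction keeps the estimate strictly above zero
    return (hits + 1) / (n_draws + 1)


# Hypothetical data: scores 1..100, sub-sample made of the ten largest
full = list(range(1, 101))
sub = list(range(91, 101))

p = subsample_median_pvalue(full, sub)
print(p)
```

Because the reference distribution is built from subsamples of the full sample itself, this sidesteps the objection that a sub-sample and its parent sample are not independent. It tests whether the sub-sample is unusual as a subset, which may or may not be the question the original poster has in mind.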