Required Variable Levels in Nonparametric Tests


Required Variable Levels in Nonparametric Tests

Heidi Green

Hello-

 

I am wondering why I can only select scale variables to be the “Test Fields” in Nonparametric Tests: Two or more independent samples.

 

Here is the background of what I am trying to accomplish. I have survey results with the following question (among others):

Overall, how satisfied are you with your purchase of …

The responses are structured as follows:

5 = Completely Satisfied

4 = Very Satisfied

3 = Somewhat Satisfied

2 = Slightly Satisfied

1 = Not At All Satisfied

 

(A “Likert” scale, I believe?)

 

I already know from looking at frequencies that I do not have a normal distribution, because 90% of responders are either Completely Satisfied or Very Satisfied. I have a very large sample size (over 90,000 responses) covering multiple years. Some of the responders shopped at “Premier” stores, and some at regular stores (I have a 0/1 flag in my data for that). I also have a few other interesting groupings that I could look at, such as whether they responded by mail or on the web, the year/month in which they responded, or which product variation they purchased.

 

Main question to answer: is there a statistically significant difference in satisfaction for those shopping at premier stores vs. regular stores?

 

I am not a statistician, but some basic internet research and reading some of the links you have all posted led me to examining the nonparametric tests, and specifically the Mann-Whitney & Kruskal-Wallis tests.

 

In SPSS 18 (with current patches), I go to Analyze, Nonparametric Tests, Independent Samples. On the Objective tab, I select Automatically compare distributions across groups. On the Fields tab, I want to put my Premier/Non-Premier flag in the Groups spot, and my overall satisfaction question in the Test Fields area. I originally had the overall satisfaction question designated as an Ordinal field. SPSS appears to only accept Scale variables as test fields in this procedure. I went ahead and changed my field to scale, just to see the results, but fundamentally, this bothers me. Can I call it a scale variable? Or am I totally off-base trying to use this procedure?

 

Incidentally, everything that I have tried running using the field as a scale variable has come up with the following:

Null Hypothesis = The distribution of overall satisfaction with purchase is the same across categories of premier/non-premier.

Decision = Reject the null hypothesis;

That’s what I wanted to prove, but I’m not sure how to interpret the Mann-Whitney U and/or Wilcoxon W, Test Statistic, etc… but first I need to make sure my test is appropriate!

 

Thanks for any input,

-Heidi


Re: Required Variable Levels in Nonparametric Tests

Julius Sim
Hello Heidi,


As I've not yet got onto SPSS 18, I'll leave others to address the
question as to how to perform the analysis on this release. But as regards
the scale, it is not strictly a Likert scale, but an adjectival rating
scale. A Likert item consists of a statement to which respondents are
asked to agree or disagree (or possibly approve or disapprove), whereas
your item is a question to which respondents are asked to select an answer
- here in adjectival form, but it could also be in adverbial form (e.g.
the responses 'Very often', 'Often', 'Sometimes' etc. would be an adverbial
rating scale).

If the description 'Likert' is used for all such ordinal scales, it blurs
a potentially important psychometric distinction between these different
question forms.

On more immediate matters, I hope you get your analysis sorted on version 18!

Best wishes,

Julius




Re: Required Variable Levels in Nonparametric Tests

Bruce Weaver
Administrator
In reply to this post by Heidi Green
The Mann-Whitney U and Kruskal-Wallis tests are carried out on ranks, but don't perform all that well when there are a lot of ties.  What you have I would call ordered categories, not ranks.  One thing you could do is a variant of the chi-square test that takes into account the order in your categories.  See Dave Howell's notes at the link below for some discussion of this.

   http://www.uvm.edu/~dhowell/StatPages/More_Stuff/OrdinalChisq/OrdinalChiSq.html

For some reason the title says "Treatment of Missing Data", but it's really about chi-square tests with ordered categories.
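To make the idea of an order-aware chi-square concrete, here is a minimal Python sketch of one common variant, the linear-by-linear association statistic M² = (N − 1)r², where r is the Pearson correlation between the row scores and column scores computed from the contingency table. It is an assumption that this is the variant meant above; the function name `linear_by_linear`, the integer scores, and the counts are all made up for illustration and are not taken from the poster's data.

```python
import math

def linear_by_linear(table, row_scores, col_scores):
    """Linear-by-linear association test for an r x c table of counts.

    Returns (M2, p), where M2 = (N - 1) * r**2 is referred to a
    chi-square distribution with 1 degree of freedom.
    """
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]

    # Weighted means of the scores, using the marginal counts as weights.
    mean_u = sum(u * t for u, t in zip(row_scores, row_tot)) / n
    mean_v = sum(v * t for v, t in zip(col_scores, col_tot)) / n

    # Sums of squares and cross-products over all N observations.
    ss_u = sum(t * (u - mean_u) ** 2 for u, t in zip(row_scores, row_tot))
    ss_v = sum(t * (v - mean_v) ** 2 for v, t in zip(col_scores, col_tot))
    cov = sum(
        table[i][j] * (row_scores[i] - mean_u) * (col_scores[j] - mean_v)
        for i in range(len(table))
        for j in range(len(table[0]))
    )

    r = cov / math.sqrt(ss_u * ss_v)
    m2 = (n - 1) * r * r
    # Upper tail of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p = math.erfc(math.sqrt(m2 / 2))
    return m2, p

# Hypothetical counts: rows = regular (0) and premier (1) stores,
# columns = satisfaction codes 1..5. These numbers are invented.
counts = [
    [30, 40, 130, 800, 1000],
    [10, 20, 70, 700, 1200],
]
m2, p = linear_by_linear(counts, row_scores=[0, 1], col_scores=[1, 2, 3, 4, 5])
print(f"M2 = {m2:.3f}, p = {p:.4g}")
```

Unlike the ordinary Pearson chi-square, this statistic has a single degree of freedom and changes if the columns are reordered, which is what makes it sensitive to a trend across the ordered categories. The result does depend on the scores chosen; equally spaced integers are only one possible choice.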

--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

NOTE: My Hotmail account is not monitored regularly.
To send me an e-mail, please use the address shown above.

Re: Required Variable Levels in Nonparametric Tests

SPSS Support-2
In reply to this post by Heidi Green

Hello Heidi,

Following is a resolution from our Support knowledgebase regarding the requirement for scale level definition of dependents for these tests in the NPTESTS procedure in Release 18. This was really done for logistical or technical reasons related to how to deal with ordinal variables that might be defined as strings, or more specifically, to avoid those issues. We've filed an enhancement request asking that this be revisited, but for the time being, you have to either use the older NPAR TESTS procedure (the legacy dialogs in the menus) or redefine the measurement levels of variables in the Data Editor or through command syntax.

I've opened a support case and will contact you individually in case there are follow-up questions.

David Nichols
Statistical Support
SPSS, an IBM Company


Resolution 86245

Problem Summary:
NPTESTS procedure requires measurement levels for TEST fields for INDEPENDENT and RELATED subcommands to be scale

Problem Description:

I'm trying to run tests such as the Kruskal-Wallis test for independent samples or the Friedman test for related samples using the new NPTESTS procedure (Analyze>Nonparametric Tests>Independent Samples or Related Samples in the menus) with ordinal response variables. When I attempt to move an ordinal variable into the Test Fields box in the Two or More Independent Samples dialog box, I get a popup telling me that ordinal fields cannot be placed in that box. If I try to work around this using command syntax, I am able to run the procedure, but in the output, the Decision column in the Hypothesis Test Summary in the left side of the Model Viewer states that the procedure was unable to compute the test, and the right-hand side of the Viewer contains the information that the test field is not continuous. Why does this procedure require that the measurement level of test fields be continuous for tests that are perfectly well-defined for ordinal response data?

Resolution Summary:

Enhancement Request Submitted

Resolution Description:

The NPTESTS procedure was designed to include capabilities not supported by the older NPAR TESTS procedure, including handling of string variables and the ability to automatically select appropriate tests for particular situations. Inclusion of these capabilities produced complications with regard to handling of ordinal responses, particularly with string variables. For that reason, the decision was made to require test fields for certain tests to be numeric and declared as scale or continuous in measurement level. An enhancement request has been filed with SPSS Development, requesting that this decision be revisited. For the time being, you will need to change the measurement level of ordinal variables outside of the procedure in order to use them as test fields in these tests. We apologize for any resulting inconvenience.
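As a rough illustration of why these tests are well-defined for ordinal responses, the Python sketch below computes the Wilcoxon rank sum W and the Mann-Whitney U from midranks, with the usual tie-corrected normal approximation and no continuity correction. It is not SPSS's implementation; the function names and the sample values are invented for the example, and packages differ in which group's W and U they report. Because only the ordering of the codes enters the calculation, any order-preserving recoding of the responses gives the same result.

```python
import math
from collections import Counter

def midranks(values):
    """Map each distinct value to its average rank in the pooled sample."""
    counts = Counter(values)
    ranks, start = {}, 0
    for v in sorted(counts):
        t = counts[v]
        ranks[v] = start + (t + 1) / 2  # mean of ranks start+1 .. start+t
        start += t
    return ranks

def mann_whitney(x, y):
    """Return (W, U, z, p) for sample x versus sample y.

    W is the rank sum of x, U = W - n1(n1 + 1)/2, and z uses the
    tie-corrected variance. p is two-sided (normal approximation).
    """
    pooled = list(x) + list(y)
    n1, n2, n = len(x), len(y), len(pooled)
    ranks = midranks(pooled)
    w = sum(ranks[v] for v in x)
    u = w - n1 * (n1 + 1) / 2
    # Each group of t tied values reduces the variance of U.
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u <= 0:  # every observation tied: no evidence either way
        return w, u, 0.0, 1.0
    z = (u - n1 * n2 / 2) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2))
    return w, u, z, p

# Invented satisfaction codes (1..5) for two small groups.
premier = [5, 5, 4, 3, 5, 4]
regular = [4, 3, 5, 2, 1, 4]
print(mann_whitney(premier, regular))

# An order-preserving recoding leaves W, U, z and p unchanged.
recode = {1: 1, 2: 2, 3: 3, 4: 10, 5: 100}
print(mann_whitney([recode[v] for v in premier],
                   [recode[v] for v in regular]))
```

With U defined this way, U / (n1 × n2) estimates the probability that a randomly chosen response from the first group is higher than one from the second, counting ties as one half, which is often easier to interpret than U or W on their own.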

From: Heidi Green <[hidden email]>
To: [hidden email]
Date: 05/19/2010 05:16 PM
Subject: Required Variable Levels in Nonparametric Tests
Sent by: "SPSSX(r) Discussion" <[hidden email]>




