At 04:11 AM 6/20/2006, Feinstein, Zachary wrote:

>The rows can be proportions or they can be means. Let's say a mean

>example is the number of children in each household and we wish to test

>if there are statistically significant differences among the regions.

Zachary,

You're still not using the language of statistics correctly (by which I

mean "in a way that other people can understand what you're asking"). In

this example, the "number of children" is the *row* and the *cell contents*

are proportions or means. What you're describing here sounds like a

half-done Analysis of Variance model. In order to test the statistical

significance, you're going to need not just the means, or proportions, but

the variances, because that's what the tests of significance are based on.

Usually in an analysis of this sort, the "rows" are observations of the

variable in question. In your example, the "variable" is the number of

children, and the unit of observation is the household. You initially

presented your problem in what sounded to me like a cross-tabulation,

because you spoke of a "Totals" column. But now I see that's not the

appropriate model.
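To make the point about variances concrete, here is a minimal sketch of a one-way Analysis of Variance in plain Python. The household counts are invented for illustration, and the function name `one_way_anova_f` is my own, not anything from SPSS. Both data sets have identical regional means; only the spread within each region differs.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples, one per group."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical data: number of children per household, four households per region.
# Region means are 2, 3, 2, 3 in BOTH data sets.
tight = [[1, 2, 2, 3], [2, 3, 3, 4], [1, 2, 2, 3], [2, 3, 3, 4]]
spread = [[0, 0, 3, 5], [0, 1, 4, 7], [0, 0, 3, 5], [0, 1, 4, 7]]

f_tight, df_b, df_w = one_way_anova_f(tight)
f_spread, _, _ = one_way_anova_f(spread)
print("F (small within-region variance):", f_tight)
print("F (large within-region variance):", f_spread)
```

The means alone cannot distinguish these two cases, but the F ratio does, which is why a table of means without variances is not enough to test significance.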

>A proportion example could be who people are likely voting for in the next

>presidential election. Again, would be interesting to statistically

>test the proportion that would choose Hillary Clinton among the four

>regions.

This looks like a cross-tabulation model, in which the row variable is

"Vote for Hillary?" and the rows are "Yes" and "No," and your column

variable is "Regions" with columns for North, South, etc. The null

hypothesis is that voting for Hillary is independent of what region you're in.

But what you need in the table then are not proportions, but frequencies,

in order to take sample size into account. Statistical tests of

significance depend on sample size. For example, if the "Yes" vote for

Hillary in "North" is 60%, and the "No" vote is 40%, that's a landslide in

a realistic vote population (e.g., 600,000 yes, vs. 400,000 no), but if it's

based on a sample with 3 votes for Hillary and 2 votes against, that's far

too small a sample to yield a statistically significant result.
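As a rough illustration of why frequencies matter, the sketch below computes the Pearson chi-square statistic for a "Vote for Hillary?" by "Region" table in plain Python. The counts are invented and the function name `pearson_chi_square` is my own. The two tables have exactly the same proportions; the second simply has 100 times as many respondents.

```python
def pearson_chi_square(table):
    """Return (chi-square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column variables were independent
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Hypothetical counts. Rows: Yes, No. Columns: North, South, East, West.
small = [[3, 2, 2, 3],
         [2, 3, 3, 2]]
large = [[count * 100 for count in row] for row in small]

CRITICAL_05_DF3 = 7.815  # chi-square critical value for df = 3 at alpha = 0.05

chi2_small, df = pearson_chi_square(small)
chi2_large, _ = pearson_chi_square(large)
print("small sample:", chi2_small, "exceeds critical value?", chi2_small > CRITICAL_05_DF3)
print("large sample:", chi2_large, "exceeds critical value?", chi2_large > CRITICAL_05_DF3)
```

With only five respondents per region the expected counts are also below the usual rule of thumb of 5 per cell, so the chi-square approximation itself is doubtful there. A table of percentages hides all of this.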

>But the part that I really should clarify is what I mean when I say I

>want to test the columns by Total. When I wish to test North versus

>Total I mean North versus Total minus North, or North versus all the

>other regions. That's what the test within a platform like CfMC does.

This is where your first example gets confusing, because you wrote about

using means. If you use means, what does "Total" mean? What it really

sounds like you're saying here is that "Total" doesn't mean "Total," but

rather "Other".
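If "Total" really does mean "Other," then for the proportion example the comparison amounts to collapsing the regions into a 2x2 table: North versus everyone else. The sketch below shows that reading in plain Python. The counts and the function names are hypothetical, and I am only assuming this is what the CfMC test does.

```python
def collapse_to_target_vs_other(counts_by_region, target):
    """Return [[yes, no] for target, [yes, no] for all other regions pooled]."""
    target_counts = list(counts_by_region[target])
    others = [c for region, c in counts_by_region.items() if region != target]
    pooled = [sum(col) for col in zip(*others)]
    return [target_counts, pooled]

def chi_square_2x2(table):
    """Pearson chi-square (no continuity correction) for a 2x2 table of counts."""
    (a, b), (c, d) = table
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Hypothetical (yes, no) counts per region
counts = {
    "North": (300, 200),
    "South": (200, 300),
    "East": (200, 300),
    "West": (300, 200),
}

north_vs_other = collapse_to_target_vs_other(counts, "North")
chi2 = chi_square_2x2(north_vs_other)
print("North vs. Other table:", north_vs_other)
print("chi-square (df = 1):", chi2)  # compare with 3.841 for alpha = 0.05
```

Note that this only works because the cells are counts. With means in the cells there is no obvious way to subtract North from "Total" without also knowing each region's sample size and variance.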

>Thank you and sorry about any of the confusion. Still hoping there is

>an automatic way to do this with SPSS.

It seems to me that your problem, and the data for it, are not precisely

formulated.

You seem to be confusing several statistical models. So I can't think of

anything to help you.

Bob

Robert M. Schacht, Ph.D. <[hidden email]>

Pacific Basin Rehabilitation Research & Training Center

1268 Young Street, Suite #204

Research Center, University of Hawaii

Honolulu, HI 96814