Hi Mark,

A slightly better idea would be to drop the unclassifiable cluster from
the analysis. These unclassifiable cases are hard to separate from the
other groups and will wreck your DA. Clusters with a small number of
cases can create similar problems. I suspect that your problems with DA
are caused by such a fragmented CA solution.

Try to find a good, stable CA solution first, eliminate the outliers
(small clusters, plus any unusual cases that standard diagnostics
identify), and DA will probably work better.
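
As a rough sketch only, the screening could look like the syntax below. It assumes that the `distance` variable from the example in the earlier message (quoted further down) is already in the file, and that the SPSS release at hand supports MODE=ADDVARIABLES in Aggregate. The cutoff of 2 standard deviations, the minimum cluster size of 10, and the names dist_m, dist_sd, clu_n, weak and retain are arbitrary illustrations, not recommendations.

```spss
*Per-cluster mean and SD of the distance, plus the cluster size.
AGGREGATE /OUTFILE=* MODE=ADDVARIABLES /BREAK=CLU5_1
 /dist_m = MEAN(distance) /dist_sd = SD(distance) /clu_n = N.
*Flag far-out cases and members of very small clusters (assumed cutoffs).
COMPUTE weak = (distance > dist_m + 2*dist_sd) OR (clu_n < 10).
COMPUTE retain = (weak = 0).
*Run DA on the retained cases only, then switch the filter off again.
FILTER BY retain.
DISCRIMINANT /GROUPS=CLU5_1(1,5)
 /VARIABLES=Zmpg Zengine Zhorse Zweight Zaccel
 /STATISTICS=TABLE.
FILTER OFF.
```

Using FILTER rather than SELECT IF keeps the flagged cases in the file, so the cutoffs can be changed and the DA rerun without rebuilding the data.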

Jan

-----Original Message-----

From: Mark Webb [mailto:[hidden email]]

Sent: Monday, July 31, 2006 1:27 PM

To: Spousta Jan

Cc: [hidden email]
Subject: Re: Distance from cluster centre query.

Thanks for this Jan.

I may well use your suggestion and compute the centroids, BUT I would like
to discuss the idea of a cluster centroid in the context of what I'm
trying to do.

I'm finding that discriminant analysis [DA], with the clusters as the
dependent variable and the statements used to form the clusters as the
independent variables, is not working well in practice.

I would like to remove "weakly" associated respondents from each cluster
and put them into an additional cluster representing "unclassifiable".
I was hoping to define these weak respondents by their distance from the
cluster centroid, but I most often use hierarchical methods [Ward's],
hence my initial query.

Do you think what I'm suggesting is feasible?

I would then run DA on the original clusters plus one.

Regards

Mark

----- Original Message -----

From: "Spousta Jan" <[hidden email]>

To: "Mark Webb" <[hidden email]>; <[hidden email]>

Sent: Monday, July 31, 2006 12:55 PM

Subject: RE: Distance from cluster centre query.

Hi Mark,

While K-Means operates in a Euclidean metric space or something similar,
and can therefore easily define the centroids (and uses them during the
computation), the hierarchical algorithm can be used in more general
spaces where only pairwise dissimilarities are available and there are
no well-defined centroids. Imagine clustering species; take a cluster
{baboon, human, chimpanzee} - what is the centroid here? Michael
Jackson? Really hard to say. And that is perhaps the reason why SPSS
does not offer to save the centroid-derived statistics.

Otherwise, if you think that they really do make sense, you can compute
the centroid coordinates easily using Aggregate and add them to the
file. Then you can compute the case-to-centroid distance using the
familiar formula for the Euclidean distance.

Unfortunately, my SPSS 14 is broken now, so I will draft the example
syntax in SPSS 12, which is more cumbersome because Aggregate there
lacks the ADDVARIABLES mode.

GET FILE='C:\Program Files\SPSS\Cars.sav'.
*Keep complete cases only and draw a random sample of about 20 percent.
SELE IF nmiss(mpg to cylinder)=0 and uniform(1) < 0.2.
*Standardize the clustering variables (saved as Zmpg to Zaccel).
DESCRIPTIVES mpg to accel /SAVE.
*Hierarchical clustering, saving the 5-cluster membership as CLU5_1.
CLUSTER Zmpg to Zaccel /SAVE CLUSTER(5).

*Save the coordinates of the centroids.
AGGREGATE /OUTF='C:\Program Files\SPSS\aggr.sav' /BREAK=CLU5_1
 /Cmpg Cengine Chorse Cweight Caccel = MEAN(Zmpg Zengine Zhorse Zweight Zaccel).

*Add them to the file.
SORT CASES BY CLU5_1 (A) .
MATCH FILES /FILE=* /TABLE='C:\Program Files\SPSS\aggr.sav' /BY CLU5_1.
exe.

*Compute the Euclidean distance case-centroid.
comp distance = 0.
do repe centr = Cmpg to Caccel /case = Zmpg to Zaccel.
- comp distance = distance + (centr-case)**2.
end repe.
comp distance = sqrt(distance).
var lab distance "Distance case-centroid".
exe.

*End of the example.
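
For releases where Aggregate does support the ADDVARIABLES mode (SPSS 14, going by the remark above), the save, sort and match steps can probably be collapsed into a single command. A sketch, assuming the same variable names as in the example; it would replace the AGGREGATE, SORT CASES and MATCH FILES commands, with the distance computation left unchanged:

```spss
*Append the cluster centroids directly to the working file.
AGGREGATE /OUTFILE=* MODE=ADDVARIABLES /BREAK=CLU5_1
 /Cmpg Cengine Chorse Cweight Caccel = MEAN(Zmpg Zengine Zhorse Zweight Zaccel).
```

This avoids writing the intermediate aggr.sav file, and the centroid variables are still appended in the order the DO REPEAT loop expects.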

Greetings

Jan

-----Original Message-----

From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Mark Webb

Sent: Monday, July 31, 2006 7:43 AM

To: [hidden email]
Subject: Distance from cluster centre query.

In K-Means it's possible to save this information as a variable.
Is this possible in any of the hierarchical methods offered in SPSS?
They offer a proximity matrix, which I see as different, as it shows
distances between individual respondents, NOT distances from the
cluster mean.
Am I missing something?

Regards
