
In a research project we work together on, a colleague of mine constructed
an index based on factor scores obtained through classical factor analysis
of a number of categorical census variables all transformed into dummies.
The variables concerned the standard of living and included quality of
dwelling and basic services such as sanitation, water supply, electricity
and the like. (The score was not simply the score for the first factor, but
the average score of several factors, weighted by their respective
contribution to explaining the overall variance of observed variables, but
this is beside the point.)
Now, he found out that the choice of reference or "omitted" category for
defining the dummies has an influence on results. He first ran the analysis
using the first category of all categorical variables as the reference
category, and then repeated the analysis using the last category as the
reference or omitted category. He found that the resulting scores varied not
only in absolute value but also in the shape of their distribution.
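
To make the setup concrete, here is a minimal sketch in Python of the two runs. It is not my colleague's actual analysis: the data are simulated, the first principal component of the standardized dummies stands in for the weighted factor score, and every name in it (dwelling, water, dummies, first_pc_scores, index) is hypothetical.

```python
import numpy as np

def dummies(codes, n_levels, omit):
    """One 0/1 column per level of `codes`, except the omitted reference level."""
    keep = [k for k in range(n_levels) if k != omit]
    return np.column_stack([(codes == k).astype(float) for k in keep])

def first_pc_scores(X):
    """Scores on the first principal component of the standardized columns."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.corrcoef(Z, rowvar=False))
    return Z @ eigvecs[:, -1]  # eigh returns eigenvalues in ascending order

# Simulated data (assumption): two 3-level variables driven by one latent trait.
rng = np.random.default_rng(0)
n = 5000
latent = rng.normal(size=n)
dwelling = np.digitize(latent + rng.normal(scale=0.7, size=n), [-0.8, 0.9])
water = np.digitize(latent + rng.normal(scale=0.7, size=n), [-0.3, 1.2])

def index(omit):
    """Index built from the dummies of both variables, omitting level `omit`."""
    X = np.hstack([dummies(dwelling, 3, omit), dummies(water, 3, omit)])
    return first_pc_scores(X)

score_first = index(omit=0)  # first category as reference
score_last = index(omit=2)   # last category as reference

# Correlation between the two versions of the index; its sign is arbitrary
# because the sign of an eigenvector is arbitrary.
r = np.corrcoef(score_first, score_last)[0, 1]
print(round(r, 3))
```

The two runs use different sets of dummy columns, so the correlation matrices being decomposed are different matrices, and the printed correlation shows how closely the two resulting scores agree on the simulated data.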
I can understand that the absolute value of the factor scores may change, and
that even the ranking of the categories of the various variables (in terms of
their average scores) may differ, since after all the list of
dummies used has varied. But the shape of the distribution should not
change, I guess, especially not in a drastic manner. In this case both
distributions are roughly similar but not equal, and both have one pointed
density peak but of different height and at different places, one of them
around -1 and the other around +1 on the z-score scale, the rest of the
distributions being approximately alike. The two scores were inversely
correlated (probably to be expected, since the first category in the
original census variables often represented a "good" situation, like living
in a home of brick or concrete, while the last category was often a poor or
residual situation, like living in some "other" kind of nondescript dwelling,
probably on the streets or suchlike), but they were not perfectly correlated,
as could have been expected considering that the two scores were just
different combinations of the same categorical variables: their linear
correlation coefficient was -0.54, indicating they share only 29% of their
variance. The dataset was a large sample of census data, and all the results
were statistically significant.
Any ideas why choosing different reference categories for dummy conversion
could have such an impact on results?
Hector
