The chi-squared distance approach is not symmetric: it requires separate maps for the row
and column approximations. These maps are often superimposed, taking one set of
points from each, to give a single map. However, because neither row-column distances
nor inner products can be readily interpreted in such a map, the justification for this practice is not clear.
The Pearson residuals have a simple interpretation in terms of illustrating the extent of
departures from the independence model that is analogous to examining the interaction
term in biadditive models. We also differ from much of the correspondence analysis
literature in our attitude to underlying models. It is often stressed that correspondence
analysis is not model based, but we believe that it is only by examining underlying
models that it is possible to appreciate the different variants and how to interpret them.
As we have mentioned, and as is evident from the above examples, there is considerable
agreement between the maps produced by the different variants, so at some level all
may be regarded as giving similar information. Nevertheless, it is good to understand
what the underlying approximations are.
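The reading of Pearson residuals as departures from the independence model can be illustrated with a minimal sketch. It assumes the usual definition, (observed - expected) / sqrt(expected), with expected counts taken from the row and column totals; the function name and the small tables are illustrative only and are not taken from the examples above.

```python
import numpy as np

def pearson_residuals(table):
    """Pearson residuals of a two-way contingency table.

    Assumes the usual definition (observed - expected) / sqrt(expected),
    where the expected counts come from the independence model.
    """
    X = np.asarray(table, dtype=float)
    n = X.sum()
    row_totals = X.sum(axis=1, keepdims=True)   # shape (p, 1)
    col_totals = X.sum(axis=0, keepdims=True)   # shape (1, q)
    expected = row_totals @ col_totals / n      # independence model
    return (X - expected) / np.sqrt(expected)

# Hypothetical tables, for illustration only.
dependent = np.array([[20, 10], [10, 20]])
independent = np.array([[2, 4], [3, 6]])        # rows are proportional

R_dep = pearson_residuals(dependent)
R_ind = pearson_residuals(independent)
chi_squared = (R_dep ** 2).sum()                # sum of squared residuals
```

A table that satisfies independence exactly has residuals of zero everywhere, and the squared residuals of any table sum to Pearson's chi-squared statistic, which is why a map approximating these residuals displays the interaction.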
Whichever method is used, there remains the question of its graphical presentation,
which we have seen may be in terms of two sets of points, two sets of axes or one of each.
Distances are best appreciated between pairs of points; inner products are best appreciated
by projection onto calibrated axes. However, two sets of calibrated axes are confusing
and, therefore, usually unacceptable. We may also show points on calibrated axes, but
these too should be used with circumspection. In general we prefer one set of points and
one set of axes. This allows distance interpretations for the points together with inner
product interpretations. There is an element of asymmetry when this presentation is used
with symmetric methods, such as the approximation of Pearson residuals, because the axes
have to be chosen to represent either the rows or the columns. Note that this is an asymmetry
in presentation, not in the analysis. Sometimes two sets of points are acceptable, as
when representing the contingency ratio and using the centroid interpretation; centroids
require points.
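The way a calibrated axis turns a projection into an inner product can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used for the maps above: the axis passes through the origin in the direction of a column marker b, and the marker for the value mu is placed at mu * b / (b . b). The function names and coordinates are hypothetical.

```python
import numpy as np

def calibration_marker(b, mu):
    """Position on the axis through b at which the value mu is marked."""
    b = np.asarray(b, dtype=float)
    return mu * b / (b @ b)

def read_off(p, b):
    """Project the point p onto the axis through b.

    Returns the foot of the perpendicular and the calibrated value
    read there, which equals the inner product p . b.
    """
    p = np.asarray(p, dtype=float)
    b = np.asarray(b, dtype=float)
    foot = (p @ b) / (b @ b) * b
    value = foot @ b          # marker value at the foot
    return foot, value

p = np.array([3.0, 1.0])      # a row point (illustrative)
b = np.array([2.0, 0.0])      # a column direction (illustrative)
foot, value = read_off(p, b)
```

With one set of points and one set of axes, every point can be read against every axis in this way, while distances between the points themselves remain available.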
We have presented λ-scaling as a method for improving presentation when one set of
points is concentrated near the origin, and the other set is more scattered. This becomes
unnecessary when the concentrated set is replaced by calibrated axes. The situation is
analogous to presenting a set of scales in millimetres when the axes are better calibrated
in metres. Nevertheless, λ-scaling can be useful, but it cannot be used with centroid
interpretations and in its general form, which allows different scaling for different dimensions,
it will destroy distance interpretations. The simple form of λ-scaling used in the above
examples gives harmless isotropic scaling.
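The difference between the simple and general forms can be checked numerically. The sketch below assumes that λ-scaling multiplies one set of coordinates by λ and divides the other by λ, with a separate λ for each dimension in the general form; the function names and coordinates are illustrative only.

```python
import numpy as np

def lambda_scale(A, B, lam):
    """Scale row coordinates A by lam and column coordinates B by 1/lam.

    lam may be a scalar (isotropic form) or hold one value per
    dimension (general form); either way A @ B.T is unchanged.
    """
    lam = np.asarray(lam, dtype=float)
    return A * lam, B / lam

def pairwise_distances(A):
    """Euclidean distances between all pairs of rows of A."""
    return np.linalg.norm(A[:, None, :] - A[None, :, :], axis=-1)

# Hypothetical two-dimensional coordinates.
A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
B = np.array([[2.0, 1.0], [0.5, 3.0]])

A_iso, B_iso = lambda_scale(A, B, 2.0)            # simple, isotropic form
A_gen, B_gen = lambda_scale(A, B, [2.0, 0.5])     # general form

D = pairwise_distances(A)
D_iso = pairwise_distances(A_iso)
D_gen = pairwise_distances(A_gen)
```

Both forms leave the inner products unchanged. The isotropic form multiplies every distance among the rescaled points by the same factor λ, so relative distances survive, whereas the general form changes different distances by different factors.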