Let us consider the following two matrices, $Y_r$ and $Y_c$. We assume that they are
of order $n_r \times p$ and $n \times p_c$, respectively. The rows of $Y_r$ are centered
and reduced with respect to the means and standard deviations of the matrix $X_{n,p}$,
and $Y_c$ contains standardized variables.
Row vectors in $Y_r$ can be projected as supplementary units and column vectors in
$Y_c$ as supplementary variables according to the following formulae:
$$\Psi_r = Y_r U_q, \qquad \Phi_c = Y_c V_q,$$
where U and V are defined in ( . ) and ( . ), respectively.
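As a minimal sketch of the supplementary-unit projection $\Psi_r = Y_r U_q$, consider the pure-Python example below. Everything in it is an assumption made for illustration: the data are hypothetical, the function names (`column_stats`, `standardize`, `project`) are ours, the standard deviations divide by n, and $U_q$ is written in closed form for the special case of two positively correlated standardized variables, whose correlation matrix has eigenvectors $(1, 1)/\sqrt{2}$ and $(1, -1)/\sqrt{2}$. Scaling conventions for the coordinates vary between texts, so the numbers should be read as one possible convention only.

```python
import math

def column_stats(X):
    """Column means and standard deviations (dividing by n) of the active matrix X."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    stds = [math.sqrt(sum((row[j] - means[j]) ** 2 for row in X) / n)
            for j in range(p)]
    return means, stds

def standardize(rows, means, stds):
    """Center and reduce rows with the statistics of the ACTIVE units."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

def project(Y, U_q):
    """Coordinates Psi = Y U_q of the rows of Y on the first q factorial axes."""
    q = len(U_q[0])
    return [[sum(y * U_q[j][k] for j, y in enumerate(row)) for k in range(q)]
            for row in Y]

# Hypothetical active units (n = 4, p = 2), positively correlated columns.
X_active = [[1.0, 2.0], [2.0, 3.0], [3.0, 5.0], [4.0, 6.0]]
means, stds = column_stats(X_active)

# Assumed axes: eigenvectors of a 2 x 2 correlation matrix with r > 0.
s = 1.0 / math.sqrt(2.0)
U_q = [[s, s], [s, -s]]

# Hypothetical supplementary unit, standardized with the active statistics.
Y_r = standardize([[10.0, 12.0]], means, stds)
psi_r = project(Y_r, U_q)
```

The key design point is that the supplementary unit never enters the computation of the means, standard deviations, or axes: it is only placed on a map built from the active units. The supplementary-variable case is analogous, with $V_q$ in place of $U_q$.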
Anybody with a little practice in FA knows that the results of an analysis performed
on a huge dataset lose appeal, whereas choosing a subset as a reference group and
projecting the other units as supplementary points allows a much more interesting
interpretation. The same holds if we turn our attention to the representation of
variables. This procedure also avoids heavy computation on the whole dataset. Clearly,
we should be able to distinguish the groups of active and supplementary units in the
graphical displays. Looking at the factorial plots in Fig. . and . , we notice that
there are two extreme points: Mexico and Portugal. Deleting these points and
projecting them as supplementary, we obtain the plots in Fig. . , which illustrate
the changes in the variable relationships. As a direct consequence of the deletion,
the total inertia associated with the first factorial plane increases from . % to
. %.
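The inertia associated with the first factorial plane is the share of the total inertia carried by the two largest eigenvalues. The short sketch below shows the arithmetic; the eigenvalues are hypothetical and the function name `inertia_share` is ours, so it does not reproduce the percentages of the example discussed in the text.

```python
def inertia_share(eigenvalues, k=2):
    """Share of the total inertia carried by the k largest eigenvalues."""
    ordered = sorted(eigenvalues, reverse=True)
    return sum(ordered[:k]) / sum(ordered)

# Hypothetical eigenvalues of a correlation matrix with p = 5 variables
# (they sum to 5, the total inertia of five standardized variables).
eigenvalues = [2.5, 1.5, 0.5, 0.3, 0.2]
share_first_plane = inertia_share(eigenvalues, k=2)  # (2.5 + 1.5) / 5.0
```

Running the same function on the eigenvalues obtained before and after removing extreme units gives the kind of comparison reported above.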
The effects of deleting Portugal and Mexico are clearly evident on the first
principal plane (Fig. . ): Portugal moved through the origin of the axes, and the
vertical axis shows a counterclockwise rotation, with no influence from Mexico.
Thus far we have focused our attention on the main features of FA, and we have
provided the reader with guidelines on how to interpret a factorial plane correctly.
In the following sections we will show how to exploit FA's data analysis
capabilities to produce useful exploratory analyses with the help of interactive
tools.
4.4 Distance Visualization in $\mathbb{R}^p$
Previous sections showed how FA allows us to graphically evaluate the differences
among statistical units in terms of distances. Obviously, we can only represent a
2-D or 3-D space, so this task can take into account no more than three factors
simultaneously. The question is: how can we evaluate the differences in terms of
distance in $\mathbb{R}^p$ spaces when $p > 3$? Even if we cannot graphically
represent distances over spaces having more than three dimensions, we can compute
distances in $\mathbb{R}^p$ (with $p > 3$) using the Pythagorean theorem. The
unresolved issue remains how to visualize these distances. Hierarchical clustering
approaches furnish a good solution.
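To make the idea concrete, the following sketch computes Euclidean distances in $\mathbb{R}^4$ and feeds them to a very small agglomerative procedure. The choice of single linkage is our assumption, since the text does not fix an aggregation criterion at this point; the data and the function names (`euclidean`, `single_linkage`) are hypothetical as well. The returned merge history is the information a dendrogram would display.

```python
import math
from itertools import combinations

def euclidean(a, b):
    """Distance in R^p by the Pythagorean theorem."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def linkage_distance(points, a, b):
    """Single linkage: distance between the closest members of clusters a and b."""
    return min(euclidean(points[i], points[j]) for i in a for j in b)

def single_linkage(points):
    """Agglomerative clustering; returns merges as (cluster_a, cluster_b, distance)."""
    clusters = [(i,) for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        # Pick the pair of clusters that are currently closest to each other.
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: linkage_distance(points, pair[0], pair[1]))
        merges.append((a, b, linkage_distance(points, a, b)))
        # Replace the two merged clusters by their union.
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges

# Hypothetical units observed on p = 4 variables.
units = [
    [0.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [5.0, 5.0, 5.0, 5.0],
    [5.0, 5.0, 5.0, 7.0],
]
history = single_linkage(units)
```

Each entry of `history` records which groups were joined and at what distance, so the heights of the tree summarize distances that cannot be drawn directly when $p > 3$. This brute-force version recomputes all distances at every step and is meant for reading, not for huge datasets.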