[Biplot image: calibrated axes SPR, RGF, SLF and PLF; sample points labelled with lowercase letters]
Figure 2.26 Interpolation biplot of the aircraft data with obliquely translated biplot
axes, such that the point of concurrency is zero on each axis. The original origin is
retained and marked with a black cross. A new point is interpolated graphically with
values SPR = 8, RGF = 4, PLF = 0.3 and SLF = 3. The black arrow extends from
the original origin to the centroid of the polygon whose vertices are the values of the
sample to be interpolated, marked on the respective axes. The red arrow is p = 4 times
the length of the black arrow, where p is the number of variables, and its tip indicates
the interpolated position of the new sample.
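The graphical construction in the caption can be checked numerically. The following is a minimal sketch, not code from UBbipl: it assumes the marker for each variable is given as a two-dimensional point measured from the original origin, and the coordinates in `markers` are invented for illustration rather than read off Figure 2.26. It shows that stretching the centroid vector by p gives the same point as adding the marker vectors directly.

```python
# Vector-sum (centroid) interpolation, as described in the figure caption.
# The marker coordinates are hypothetical; they are NOT taken from Figure 2.26.

def centroid(points):
    """Centroid of a list of (x, y) points."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def interpolate(marker_points):
    """Interpolated position: p times the centroid of the p axis markers."""
    p = len(marker_points)
    cx, cy = centroid(marker_points)
    return (p * cx, p * cy)

# One marker per biplot axis, relative to the original origin (made-up values).
markers = {
    "SPR": (1.0, 0.5),
    "RGF": (-0.5, 1.0),
    "PLF": (0.25, -0.75),
    "SLF": (0.75, 0.25),
}

points = list(markers.values())
black_arrow = centroid(points)       # origin -> centroid of the polygon
red_arrow = interpolate(points)      # p = 4 times the black arrow

# The same position obtained by adding the marker vectors directly.
vector_sum = (sum(x for x, _ in points), sum(y for _, y in points))
```

With these made-up markers, `red_arrow` and `vector_sum` coincide, which is why the centroid construction is sometimes called the vector-sum method.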
The data set is available as mailcatbuyers.data in UBbipl. Figure 2.28 is a biplot
approximating the raw data matrix. In this biplot colour is used to distinguish the buying
behaviour of different age groups. In Chapter 8, we consider other types of biplot more
appropriate for questionnaire data of this kind.
The biplot clearly shows the main difficulty encountered with large data sets:
overplotting and too much ink leave us with a graph so cluttered that it is hardly of practical
use. Two factors contribute to the amount of ink in Figure 2.28: the number of variables
and the number of samples. These two issues are addressed differently when considering
biplots for use with large data sets.
First, we consider the number of variables, that is, the number of biplot axes. This
issue can be addressed by having interactive computer software for turning biplot axes
on and off. Using the ax argument of our function PCAbipl we can suppress plotting of
any subset of axes. This facility was used to obtain the biplots in Figures 2.29 and 2.30.
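The idea of plotting only a chosen subset of axes can be sketched as follows. This is an illustration of the principle only, not the implementation of PCAbipl: the function `select_axes`, the axis names and the direction vectors are all hypothetical, and the sketch assumes that axes are chosen by their 1-based position among the variables.

```python
# Keep only the biplot axes that should be drawn.
# Names, directions and the selection convention are assumptions for illustration.

def select_axes(axes, ax=None):
    """Return the axes whose 1-based positions are listed in `ax`.

    `axes` is a list of (name, direction) pairs; `ax=None` keeps every axis.
    """
    if ax is None:
        return list(axes)
    chosen = []
    for k in ax:
        if not 1 <= k <= len(axes):
            raise ValueError(f"axis index {k} is outside 1..{len(axes)}")
        chosen.append(axes[k - 1])
    return chosen

# Hypothetical axes: a variable name and a 2D direction for each.
all_axes = [
    ("V1", (0.9, 0.1)),
    ("V2", (0.2, 0.8)),
    ("V3", (-0.6, 0.5)),
    ("V4", (0.4, -0.7)),
]

visible = select_axes(all_axes, ax=[1, 3])   # draw the first and third axes only
visible_names = [name for name, _ in visible]
```

All samples are still positioned using every variable; only the drawing of the unselected axes is suppressed, which reduces the ink without changing the configuration of points.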