and $\tilde{G}$ into the matrix
$$\bar{G} = \begin{pmatrix} G \\ \tilde{G} \end{pmatrix},$$
a matrix
$$\bar{D}^{(2)} = -\tfrac{1}{2}\left( p^{(2)}\, \mathbf{1}_{n+m} \mathbf{1}_{n+m}' - \bar{G}\bar{G}' \right)$$
can be calculated. The matrix $\bar{D}^{(2)}$ can be partitioned as
$$\bar{D}^{(2)} = \begin{pmatrix} D_{11} : n \times n & D_{12} : n \times m \\ D_{21} : m \times n & D_{22} : m \times m \end{pmatrix},$$
where the matrix $D_{11}$ is equivalent to $D^{(2)}$, while each of the $m$ columns of $D_{12}$ forms one
of the $d_{n+1}$ vectors necessary for the interpolation formula
$$z = \Lambda_r^{-1} Y_r' \left( d_{n+1} - \tfrac{1}{n} D \mathbf{1} \right).$$
Graphical interpolation remains possible, but the nonconcurrency of the trajectories
introduces complications concerned with the marker $O_k$ for the mean on the $k$th trajectory
($k = 1, 2, \ldots, p^{(1)}$) and the centroid $G$ of the sample points. For details, see Gower and
Hand (1996).
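The partitioning described above can be sketched numerically. The following is a minimal illustration, not the authors' implementation: it assumes that the combined matrix is obtained by stacking the indicator matrix of the n samples on top of the indicator matrix of the m new points, and that the entries are minus one half of the number of mismatching categorical variables. The function name `indicator_ddistance_blocks`, the argument names and the toy data are hypothetical.

```python
import numpy as np

def indicator_ddistance_blocks(G, G_new, p2):
    """Return the blocks D11, D12, D21, D22 of the combined matrix.

    G     : n x L indicator matrix of the samples (assumed layout)
    G_new : m x L indicator matrix of the new points (assumed layout)
    p2    : number of categorical variables
    """
    n, m = G.shape[0], G_new.shape[0]
    G_bar = np.vstack([G, G_new])              # (n + m) x L stacked indicators
    ones = np.ones((n + m, 1))
    # G_bar @ G_bar.T counts matching variables, so p2 minus it counts mismatches.
    D_bar = -0.5 * (p2 * (ones @ ones.T) - G_bar @ G_bar.T)
    return D_bar[:n, :n], D_bar[:n, n:], D_bar[n:, :n], D_bar[n:, n:]

# Toy data: two categorical variables with 3 and 2 levels, n = 3, m = 2.
G = np.array([[1, 0, 0, 1, 0],
              [0, 1, 0, 1, 0],
              [0, 0, 1, 0, 1]])
G_new = np.array([[1, 0, 0, 0, 1],
                  [0, 1, 0, 1, 0]])

D11, D12, D21, D22 = indicator_ddistance_blocks(G, G_new, p2=2)
```

Each column of `D12` then relates one new point to all n samples, which is the role the text assigns to the $d_{n+1}$ vectors.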
9.7 Prediction
To predict the values of quantitative variables, we have seen that the generalized biplot
pseudo-samples lead to nonconcurrent trajectories that are parallel to the trajectories
of nonlinear biplots. So far as prediction is concerned, we may continue to use the
concurrent nonlinear biplot trajectories themselves. Prediction for continuous variables is
performed with either normal or circular projection as described in Section 5.3.3 exactly
as in the case of the nonlinear biplot. The predicted category for each categorical variable
is determined by the prediction regions. In $\mathcal{R}^+$ the nearest-neighbour region for the $h$th
level of the $k$th categorical variable will be the subspace that contains all points that are
nearer to this CLP than to any of the other CLPs of the $k$th variable. If this variable
has $L_k$ category levels, the space $\mathcal{R}^+$ will be divided into $L_k$ disjoint and exhaustive
neighbour regions, $F_1, F_2, \ldots, F_{L_k}$. To predict the category level to be associated with
a point $z$ in $\mathcal{L}$, its representation $z^{*+}$ in $\mathcal{R}^+$ is needed. Since $\mathcal{L}$ is a linear subspace
of $\mathcal{R}$ and the $n$th component of any point in $\mathcal{R}$ is 0, this is given by
$$z^{*+} = \begin{pmatrix} z : r \times 1 \\ 0 : (n - 1 - r) \times 1 \\ 0 : 1 \times 1 \end{pmatrix}.$$
The distances between z ∗+ and each of the CLPs must be calculated and the predicted
category corresponds to the nearest CLP.
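The nearest-CLP rule just described can be sketched as follows. This is an illustrative sketch under stated assumptions: the CLPs of one categorical variable are taken to be the rows of an array with n coordinates, the point z has r coordinates in the biplot space, and the names `embed`, `predict_level` and the toy coordinates are hypothetical.

```python
import numpy as np

def embed(z, n):
    """Pad an r-vector with zeros to length n: (n - 1 - r) zeros for the
    remaining coordinates of the (n - 1)-dimensional space, plus one zero
    for the nth component."""
    z = np.asarray(z, dtype=float)
    return np.concatenate([z, np.zeros(n - z.size)])

def predict_level(z, clps):
    """Return the index of the CLP nearest to the embedded point."""
    z_plus = embed(z, clps.shape[1])
    d2 = ((clps - z_plus) ** 2).sum(axis=1)    # squared Euclidean distances
    return int(np.argmin(d2))

# Toy CLPs for a three-level variable with n = 4 and r = 2 (made-up values).
clps = np.array([[ 1.0, 0.0, 0.0,  0.5],
                 [-1.0, 0.0, 0.0,  0.5],
                 [ 0.0, 2.0, 1.0, -1.0]])

level_a = predict_level([0.8, 0.1], clps)
level_b = predict_level([0.0, 2.5], clps)
```

Squared distances suffice here because only the ordering of the distances matters for the prediction.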
The prediction regions of categorical variables are the representation of the nearest-
neighbour regions in $\mathcal{L}$, that is, the intersection $F_h \cap \mathcal{L}$. It may happen that $\mathcal{L}$ does
not intersect with all the neighbour regions of a categorical variable, and in that case
certain category levels do not appear and are never predicted. Furthermore, projected
CLPs need not lie in their own prediction regions, and certainly do not define nearest-
neighbour regions in $\mathcal{L}$ that coincide with the prediction regions. Indeed, the prediction
regions are not nearest-neighbour regions for any set of points in $\mathcal{L}$.
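A small constructed example may help to see how a category level can fail to appear. The sketch below assumes a two-dimensional biplot space given by the first two coordinates of a three-dimensional space, with three made-up CLPs; the configuration and the names `region_labels`, `levels_seen` and `label_at_projection` are hypothetical and chosen only for illustration.

```python
import numpy as np

# Made-up CLPs of a three-level variable; the third lies far from the plane.
clps = np.array([[-1.0, 0.0, 0.0],
                 [ 1.0, 0.0, 0.0],
                 [ 0.0, 0.0, 5.0]])

def region_labels(clps, points_2d):
    """Label each point of the plane with the index of its nearest CLP."""
    pad = np.zeros((points_2d.shape[0], clps.shape[1] - 2))
    pts = np.hstack([points_2d, pad])          # embed the plane in full space
    d2 = ((pts[:, None, :] - clps[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

# Evaluate the prediction regions on a grid covering part of the plane.
xs = np.linspace(-10.0, 10.0, 41)
grid = np.array([(x, y) for x in xs for y in xs])
levels_seen = sorted(set(region_labels(clps, grid).tolist()))

# Projection of the third CLP onto the plane, and the region it falls in.
projected_third = clps[2, :2][None, :]
label_at_projection = int(region_labels(clps, projected_third)[0])
```

In this configuration a point of the plane is nearer to the third CLP than to the first only if its first coordinate exceeds 12, and nearer than to the second only if that coordinate is below -12, so the third level is never predicted anywhere in the plane. Its projection falls on the boundary between the other two regions.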