way with each successively ranked dimension placed alternately above
and below. The top five dimensions for any class would thus be arranged
to be read in the order 4-2-1-3-5. Fig. 13.3, for example, shows this in
practice. This method is a minimal and straightforward means of positioning
dimensional anchors in a non-random, and hence easily reproducible, manner.
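The alternating order can be illustrated with a short sketch. It assumes the input list is already ranked best-first and that "above" corresponds to the front of the reading order, which is what the 4-2-1-3-5 example suggests; the function name alternate_arrangement is hypothetical and not taken from the source.

```python
from collections import deque


def alternate_arrangement(ranked):
    """Arrange items ranked best-first so the best one sits in the middle.

    Assumption: even ranks (2nd, 4th, ...) go to the front and odd ranks
    (3rd, 5th, ...) go to the back, so five items read 4-2-1-3-5.
    """
    order = deque()
    for rank, item in enumerate(ranked, start=1):
        if rank % 2 == 0:
            order.appendleft(item)  # 2nd, 4th, ... placed on one side
        else:
            order.append(item)      # 1st, 3rd, 5th, ... placed on the other
    return list(order)


print(alternate_arrangement([1, 2, 3, 4, 5]))  # [4, 2, 1, 3, 5]
```

Because the result depends only on the ranking, the same input always yields the same arrangement, which is the reproducibility property noted above.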
AAM is used as the placement method with the Class Discrimination
Layout (CDL) [5]. That is, the CDL is the general method of laying out
DAs according to class, and AAM decides the actual positions of
the DAs. With the CDL, only the highest-ranking dimensions for each class
are used. The CDL may be specified as: (a) select the most highly
expressed dimensions; (b) assign each dimension to the classes which,
relative to the other classes, have statistically significantly higher values;
and (c) place the DAs assigned to the same class next to each other to form
a “classification sector”. We rank and select dimensions using the mean
ratio, although other methods may be applied.
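Steps (a) and (c) can be sketched roughly as follows. This is a minimal illustration under several assumptions, not the layout procedure of [5]: the class label of each dimension (step (b)) is taken as already computed, a smaller mean ratio is assumed to rank higher, and the names class_sectors, anchor_order and the sample dimensions g1..g5 are hypothetical. Within each sector the dimensions are simply kept in rank order; the alternating placement would then be applied to each sector.

```python
from collections import defaultdict


def class_sectors(dim_info, top_n):
    """Group the top_n best-ranked dimensions of each class into sectors.

    dim_info maps a dimension name to (class_label, mean_ratio).
    Assumption: a smaller mean ratio ranks higher.
    """
    by_class = defaultdict(list)
    for name, (label, ratio) in dim_info.items():
        by_class[label].append((ratio, name))
    sectors = {}
    for label in sorted(by_class):
        ranked = sorted(by_class[label])[:top_n]  # step (a): keep the best
        sectors[label] = [name for _, name in ranked]
    return sectors


def anchor_order(sectors):
    """Step (c): concatenate sectors so same-class anchors are adjacent."""
    return [name for label in sectors for name in sectors[label]]


# Hypothetical input: five dimensions already assigned to classes A and B.
dim_info = {
    "g1": ("A", 0.2), "g2": ("B", 0.5), "g3": ("A", 0.1),
    "g4": ("B", 0.3), "g5": ("A", 0.9),
}
sectors = class_sectors(dim_info, top_n=2)
print(anchor_order(sectors))  # ['g3', 'g1', 'g4', 'g2']
```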
For a data set of $d$ dimensions which may be divided into $C$ classes
(i.e. the data may be partitioned into $C$ disjoint subsets
$\{V_1, V_2, \ldots, V_C\}$), we define, as done by Zhou et al. [5], the
mean ratio $r_i$ for any dimension $i$ as:

$$ r_i = \frac{\sum_{j=1,\, j \neq k}^{C} M_{ij}}{(C-1) \cdot M_{ik}} \qquad (4) $$

If $C = 1$ then $r_i = 1$. Here $i = 1, \ldots, d$; $M_{ij}$ is the mean
value of subset $V_j$ on dimension $i$; and
$M_{ik} = \max\{M_{i1}, \ldots, M_{iC}\}$. $k$ is referred to as the
discriminative class label for dimension $i$.
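As a worked illustration, the sketch below computes the mean ratio for one dimension, reading Eq. (4) as the sum of the other classes' means divided by (C - 1) times the largest class mean. The function name mean_ratio and the sample values are hypothetical, k is returned as a 0-based index, and the largest class mean is assumed to be non-zero.

```python
def mean_ratio(class_means):
    """Return (r_i, k) for one dimension.

    class_means holds M_i1..M_iC, the mean of each class subset on
    dimension i. k is the 0-based index of the class with the largest
    mean (the discriminative class label).
    Assumption: the largest class mean is non-zero.
    """
    C = len(class_means)
    k = max(range(C), key=lambda j: class_means[j])
    if C == 1:
        return 1.0, k  # the text defines r_i = 1 when C = 1
    others = sum(m for j, m in enumerate(class_means) if j != k)
    return others / ((C - 1) * class_means[k]), k


# Three classes with means 2, 8 and 4 on this dimension:
# r = (2 + 4) / ((3 - 1) * 8) = 0.375, discriminative class index 1.
print(mean_ratio([2.0, 8.0, 4.0]))  # (0.375, 1)
```

Under this reading, and for non-negative means, the ratio lies between 0 and 1, with smaller values indicating that one class stands further apart from the rest on that dimension.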
To measure the effects of varying dimensional anchor selection and/or
dimensional anchor placement, we must first rigorously quantify cluster
sensitivity and overall visualization quality. This will provide us with a
tool to score future results. While cluster validation indices [10] provide a
good measure of cluster quality, they do not necessarily lend themselves to
quantifying the quality of a RadViz image in its entirety. For example,
suppose we were to have a single cluster of data images clustered tightly
around the image space origin. This would be rated by any measure as a
high-quality cluster; however, the rating can be misleading. Due to the
cancellation of opposing “spring forces”, features of the data
set may remain hidden, and there may actually be two strong clusters that
would become more visible simply by moving the DAs. Our
 