using hyperspherical coordinates and so may be extended to higher
dimensional visualizations. An introduction and overview of a more
general class of visualizations called Normalized Radial Visualizations
(NRVs) from a formal theoretical perspective is given by Daniels et al.
[1]. An NRV is a mapping of high dimensional records into lower
dimensional space where data records' images are convex combinations of
the “dimensional anchors”: labels arranged on a circle (in two dimensions)
or on the surface of a hypersphere (in higher dimensions).
RadViz [2] can be shown to be the particular instance of an NRV in
which the image space is two dimensional. We use RadViz throughout this
chapter; however, we take care to note the applicability of these techniques
to higher dimensional NRVs.
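
As an illustration of the convex-combination mapping described above, here is a minimal Python sketch of a two dimensional RadViz projection. It is a sketch under stated assumptions, not the implementation used in the cited work: the function name radviz_project, the min-max normalization of each dimension to [0, 1], and the default equal spacing of the dimensional anchors on the unit circle are all assumptions made for illustration.

```python
import numpy as np

def radviz_project(records, anchor_angles=None):
    """Map each high dimensional record to a point inside the unit circle.

    Hypothetical helper: each image is a convex combination of the
    dimensional anchors, weighted by the record's normalized values.
    """
    X = np.asarray(records, dtype=float)
    n_dims = X.shape[1]

    # Assumption: min-max normalize each dimension to [0, 1] so that all
    # weights are nonnegative. Constant columns are mapped to 0.
    lo = X.min(axis=0)
    span = X.max(axis=0) - lo
    span[span == 0] = 1.0
    W = (X - lo) / span

    # Assumption: anchors are equally spaced on the unit circle unless
    # the caller supplies explicit angles (in radians).
    if anchor_angles is None:
        anchor_angles = 2.0 * np.pi * np.arange(n_dims) / n_dims
    anchor_angles = np.asarray(anchor_angles, dtype=float)
    anchors = np.column_stack([np.cos(anchor_angles), np.sin(anchor_angles)])

    # Divide by the row sum so the weights of each record sum to 1.
    # Records whose normalized values are all zero are placed at the origin.
    totals = W.sum(axis=1, keepdims=True)
    weights = W / np.where(totals == 0, 1.0, totals)

    return weights @ anchors
```

Because the weights of each record are nonnegative and sum to one, every image lies inside the convex hull of the anchors. Passing a different anchor_angles array changes the anchor placement without changing the mapping itself.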
RadViz implementations are used as data exploration tools for high
dimensional data sets [3]. In practice it is often possible to visualize
thousands of dimensions and still obtain insightful results. Other
applications are better served by reducing the dimensionality of the data
being explored.
To determine clusters of interest in the data set we are often required to
manipulate the placement of the dimensional anchors in addition to
making a selection of dimensions to visualize. Unfortunately, optimal
placement of dimensional anchors has been shown to be NP-hard [4].
To simplify the work of assessing dimension selection we introduce
and use a straightforward dimensional anchor placement algorithm we call
the “Alternating Anchor Metric” (AAM) method.
The task of reducing dimensionality of data may be done using any of
a number of statistical and machine learning methods [5]. Our current
work uses the mean ratio technique introduced by Zhou et al. [5]. The
mean ratio was introduced by way of an example using Khan's Small
Round Blue Cell Tumours (SRBCT) gene data [6]. Zhou et al. use the
mean ratio to select the dimensions (genes) which are most likely to be
expressed in the data. These dimensions are then used for the dimensional
anchors in the associated RadViz visualization [2] of the data set. These
earlier mean ratio results provide a test-bed for our new post-RadViz
method for assessing dimensional anchor selection, which is based on a
formal partitioning of the image space. We later describe this method in
detail and give several examples on real data sets.
Since dimensional anchor selection and NRV visualization quality are
so closely linked, we note the recent work of others in the relatively new
area of visualization quality metrics. Peng et al. [7] study clutter
reduction in the visualization of high dimensional data. They
examine the nature of clutter in a variety of visualizations such as parallel