relevant and discard the rest. One recent winner of the Knowledge Discovery in
Databases competition reduced the number of features from the original 139,351
down to 4 features in his final model [37]. An alternative approach is summarization.
For example, data can be clustered and the individual components can be studied
in various ways.
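As a minimal sketch of summarizing by clustering, the following groups a handful of points and reports per-cluster statistics. It uses plain k-means purely as a simple stand-in for a clustering method; the function names, the toy points, and the choice of initial centroids are all illustrative assumptions, not part of any tool described here.

```python
from statistics import mean

def kmeans(points, centroids, iterations=20):
    """Plain k-means: assign points to the nearest centroid, then recompute centroids."""
    centroids = [tuple(c) for c in centroids]
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            # Squared Euclidean distance from p to every centroid.
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[dists.index(min(dists))].append(p)
        # An empty cluster keeps its previous centroid.
        new_centroids = [
            tuple(mean(col) for col in zip(*cl)) if cl else c
            for cl, c in zip(clusters, centroids)
        ]
        if new_centroids == centroids:
            break  # assignments are stable
        centroids = new_centroids
    return centroids, clusters

def summarize(clusters):
    """Per-cluster size and per-attribute (min, mean, max)."""
    summary = []
    for cl in clusters:
        stats = [(min(col), mean(col), max(col)) for col in zip(*cl)]
        summary.append({"size": len(cl), "attributes": stats})
    return summary

# Made-up two-dimensional points forming two obvious groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (5.0, 5.2), (5.1, 4.9), (4.9, 5.0)]
centroids, clusters = kmeans(points, centroids=[points[0], points[3]])
for s in summarize(clusters):
    print(s)
```

Each summary row can then be studied in place of the raw points, which is the sense of "summarizing" used above.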
Machine learning tools can help with both dimension reduction and
summarizing. We use the Weka [38] implementations of many machine learning
algorithms. We also use the AutoClass [39] clustering software. In addition we
have developed our own genetic programming software package, GPP [40]. We
also have our own equation discovery software [41]. Together, these tools provide
us with many avenues for displaying, interacting with, and gaining insight into
our results.
Our visualization of the Iris data set [42] contains multiple representations.
Figure 10-a shows part of our visualization. On the near side of the left wall
is a parallel coordinate plot [36] of the cluster identified with the transparent
envelope. On the far side of the left wall is a plot of the probability density
distribution of each of the attributes in the data set. The right wall shows how
the attributes rank with Information Gain [38]. In the foreground is a set of
statistics that have been computed on the fly in response to a user command.
The points of the data set are represented as glyphs where the attributes have
been mapped to glyph attributes using our glyph toolbox. The points are plotted
in the central cube. A user can also interact with this visualization by turning
the transparent envelopes of the clusters on and off individually, and the parallel
coordinate plots with them. Figure 10-b shows the same data set visualized in
three different ways, shown in three separate rooms.
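To illustrate how attributes can be ranked by information gain, here is a minimal sketch. It assumes the attributes have already been discretized into categories; the attribute names, rows, and labels are made-up toy values (not the Iris measurements), and the helper functions are hypothetical rather than the Weka implementation cited above.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(values, labels):
    """Entropy of the labels minus the expected entropy after splitting on `values`."""
    total = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def rank_attributes(rows, labels, names):
    """Return (attribute name, gain) pairs sorted from most to least informative."""
    gains = {
        name: information_gain([row[i] for row in rows], labels)
        for i, name in enumerate(names)
    }
    return sorted(gains.items(), key=lambda kv: kv[1], reverse=True)

# Toy data: "petal" separates the classes perfectly, "sepal" not at all.
names = ["petal", "sepal"]
rows = [("short", "wide"), ("short", "narrow"), ("long", "wide"), ("long", "narrow")]
labels = ["a", "a", "b", "b"]

ranking = rank_attributes(rows, labels, names)
for name, gain in ranking:
    print(f"{name}: {gain:.3f} bits")
```

An attribute with higher gain tells us more about the class label, so a ranking like this is one way to decide which attributes to keep or to display first.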
Figure 10 helps to bring together all of the main components of our VL.
The visualization is run through a distributed computing environment, in which
multiple users can interact with the data. The figure demonstrates the interactive
IVE, in which users can move, hide, and select objects in the system to control the
display of the data and the movement of the data into and out of the visualization.
Figure 10 also displays results of our machine learning tools, used to analyze the
data and select which components to study. With all three of these components,
we can speed up concept development.
3 Applications
We speed up insight into our data through representation of the data in the
IVE and through interactions with the data. One representation may not be
sufficient, so the ability to switch between and interact with representations is
important. We describe a set of applications that highlight our approach.
3.1 Multi-modal Imaging and Visualization
In this project we are developing methods for combining related
three-dimensional data sets from a variety of sources into visualizations that
enable exploration and understanding of the data at a variety of scales and with
a variety of