4.1 Introduction
The amount of data and information collected and retained by organizations and
businesses is constantly increasing, due to advances in data collection,
computerization of transactions, and breakthroughs in storage technology. Further, many
attributes are also recorded, resulting in very high-dimensional data sets. Typically,
the applications involve large-scale information banks, such as data warehouses that
contain interrelated data from a number of sources. Examples of new technologies
giving rise to large, high-dimensional data sets are high-throughput genomic and
proteomic technologies and sensor-based monitoring systems. Finally, new
application areas such as biochemical pathways and web documents produce data with
inherent structure that cannot be simply captured by numbers.
To extract useful information from such large and structured data sets, a first step
is to be able to visualize their structure, identifying interesting patterns, trends, and
complex relationships between the items. The main idea of visual data exploration is
to produce a representation of the data in such a way that the human eye can gain
insight into their structure and patterns. Visual data mining techniques have proven to
be of particularly high value in exploratory data analysis, as indicated by the research
in this area (Eick and Wills a, b).
In this exposition, we focus on the visual exploration of data through their graph
representations. Specifically, it is shown how various commonly encountered
structures in data analysis can be represented by graphs. Special emphasis is paid to
categorical data for which many commonly used plotting techniques (scatterplots,
parallel coordinate plots, etc.) prove problematic. Further, a rigorous mathematical
framework based on optimizing an objective function is introduced that results in a graph
layout. Several examples are used to illustrate the techniques.
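As a small illustration of how categorical data might be turned into a graph, the sketch below builds a bipartite graph with one node per object and one node per (variable, category) pair, joined by an edge whenever the object takes that category. This particular encoding, the function name `categorical_to_bipartite`, and the toy records are assumptions made for illustration only; they are not taken from the text above.

```python
from collections import defaultdict


def categorical_to_bipartite(records):
    """Return an adjacency map for a bipartite object/category graph.

    `records` maps an object id to a dict of {variable: category}.
    This encoding is an illustrative assumption, not the chapter's definition.
    """
    adjacency = defaultdict(set)
    for obj_id, row in records.items():
        object_node = ("object", obj_id)
        for variable, category in row.items():
            category_node = ("category", variable, category)
            # Undirected edge: store it in both directions.
            adjacency[object_node].add(category_node)
            adjacency[category_node].add(object_node)
    return dict(adjacency)


# Hypothetical toy data: three objects described by two categorical variables.
records = {
    "a": {"color": "red", "size": "small"},
    "b": {"color": "red", "size": "large"},
    "c": {"color": "blue", "size": "large"},
}

graph = categorical_to_bipartite(records)
for node in sorted(graph):
    print(node, "->", sorted(graph[node]))
```

Objects that share a category end up two steps apart in this graph, which is the kind of structure a layout method could then try to place close together.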
4.2 Data and Graphs
Graphs are useful entities since they can represent relationships between sets of
objects. They are used to model complex systems (e.g., computer and transportation
networks, VLSI and Web site layouts, molecules, etc.) and to visualize relationships
(e.g., social networks, entity-relationship diagrams in database systems, etc.). In
statistics and data analysis, we usually encounter them as dendrograms in cluster analysis,
as trees in classification and regression, and as path diagrams in structural equation
models and Bayesian belief diagrams. Graphs are also very interesting mathematical
objects, and a lot of attention has been paid to their properties. In many instances,
the right picture is the key to understanding. The various ways of visualizing a graph
provide different insights, and often hidden relationships and interesting patterns are
revealed. An increasing body of literature is considering the problem of how to draw
a graph [see for instance the book by Di Battista et al. ( ) on graph drawing, the
Proceedings of the Annual Conference on Graph Drawing, and the annotated
bibliography by Di Battista et al. ( )]. Also, several problems in distance geometry