Despite the undoubted utility of this approach, it does present some problems that
prevent it from being a complete solution. The main ones are:
As plots become increasingly complex, they become harder to interpret. Few people
have problems with 2-D plots. Scatterplots, tables, and grouped boxplots or
other displays involving two dimensions are easily learnable. But the necessity of
spinning and navigating a 3-D point cloud or understanding the contributions
to a multivariate projection makes views that contain many variables intrinsically
less intuitive.
It is harder for monolithic data views to accommodate differences in the basic
types of data. High-dimensional projection techniques assume the variables are
numeric, as do techniques that display multivariate glyphs and, to a large extent,
parallel-axis techniques. Given a table of two categorical variables, adding a nu-
meric variable requires changing to a quite different type of view, such as a trellis
display.
Data that are of a type specific to a particular domain can be impossible to add
directly. Exploring relationships in multivariate data collected at geographical lo-
cations, on nodes of a graph, or on parts of a text document is very hard because
of the difficulty of building views that correlate the statistical element and the
structural element of the data. Often, two completely different packages are used
for the analysis, with results from one package mangled to fit the input form of
the other package - a frustrating situation to be in.
The linked views paradigm can be used to overcome these problems. The idea is simple;
instead of creating one complex view, create several simpler views and link them
together so that when the user interacts with one view the other views will update
and show the results of such an interaction. This allows the user to use views that
require less interpretation and views that are directly aimed at particular combina-
tions of data. It also allows the easy integration of domain-specific views; views of
networks or maps can easily be linked to more general-purpose views.
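
The linking mechanism described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation of any particular package: the class names (SelectionModel, HistogramView, ScatterView) and the toy data are hypothetical. It assumes every view shows the same rows of one data table, so that a set of selected row indices is enough to keep the views in step.

```python
class SelectionModel:
    """Shared state (hypothetical): the row indices currently selected."""

    def __init__(self):
        self.selected = set()
        self._listeners = []

    def subscribe(self, listener):
        """Register a callback that receives the new selection."""
        self._listeners.append(listener)

    def select(self, rows):
        """Replace the selection and notify every linked view."""
        self.selected = set(rows)
        for listener in self._listeners:
            listener(self.selected)


class HistogramView:
    """A simple one-variable view that can also originate a selection."""

    def __init__(self, values, model):
        self.values = list(values)
        self.model = model
        self.highlighted = {}  # value -> count of selected rows
        model.subscribe(self.on_selection)

    def brush(self, low, high):
        """User interaction: select all rows with low <= value <= high."""
        rows = [i for i, v in enumerate(self.values) if low <= v <= high]
        self.model.select(rows)

    def on_selection(self, selected):
        counts = {}
        for i in selected:
            counts[self.values[i]] = counts.get(self.values[i], 0) + 1
        self.highlighted = counts


class ScatterView:
    """A simple two-variable view that only reacts to selections."""

    def __init__(self, xs, ys, model):
        self.points = list(zip(xs, ys))
        self.highlighted = []  # the selected points, in row order
        model.subscribe(self.on_selection)

    def on_selection(self, selected):
        self.highlighted = [self.points[i] for i in sorted(selected)]


# Made-up toy rows, used only to show the update path.
model = SelectionModel()
hist = HistogramView([0, 1, 5, 7, 2], model)
scatter = ScatterView([0.1, 0.2, 0.3, 0.4, 0.5], [1.0, 2.0, 3.0, 4.0, 5.0], model)
hist.brush(5, 10)           # interact with the histogram only
print(scatter.highlighted)  # the scatterplot has updated as well
```

Because the views communicate only through the shared selection, a domain-specific view (a map or a network, say) could join in by subscribing to the same model, without the other views knowing anything about it.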
It should not be argued that linked data views are a uniformly superior method to
that of monolithic complex views mentioned above. That is not the case, as there are
examples where a single multivariate technique is necessary to see a given feature,
and multiple simpler views simply won't do. It is also generally harder to present the
results of an interactive view exploration to another person than it is to present the
results if displayed as a single view. Having said that, for many problems, especially
those where conditional distributions are of interest, the linked data views technique
works extremely effectively.
In Fig. . , we have changed the variable being displayed in the histogram to be
the number of years in the league. ( indicates a rookie, indicates one previous year
of experience, etc.). The shape of the histogram fits our intuition by being close to
a Poisson distribution. We select those players with years of experience or greater
in the league and see that not only do they have a higher salary on average, but the
relationship between batting average and log(salary) is much closer to linear. For the
younger players, a reasonable case might be made that performance has no strong
effect on pay unless the batting average of a player is itself better than average.
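
The comparison made by this selection can also be expressed numerically. The sketch below is a hedged illustration, not the analysis behind the figure: the function names and the min_years parameter are hypothetical, and it uses the Pearson correlation as one rough measure of how close to linear the relationship between batting average and log(salary) is within each group.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def conditional_correlations(years, batting_avg, salary, min_years):
    """Correlate batting average with log(salary) in two groups.

    Returns (r_selected, r_rest), where "selected" means years >= min_years.
    Each group needs at least two rows with some spread in both variables.
    """
    log_salary = [math.log(s) for s in salary]
    selected = [i for i, y in enumerate(years) if y >= min_years]
    rest = [i for i, y in enumerate(years) if y < min_years]

    def group_r(rows):
        return pearson([batting_avg[i] for i in rows],
                       [log_salary[i] for i in rows])

    return group_r(selected), group_r(rest)
```

A correlation near 1 in the selected group together with a much weaker one in the remainder would be consistent with the pattern described above, although a single coefficient cannot replace looking at the linked views themselves.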