Image Processing Reference
In-Depth Information
the consideration of data clustering (e.g., based on supervised or unsupervised
clustering techniques), the integration of measures of outlyingness (e.g., based on
the Mahalanobis distance of the data points, or derived from normality tests such
as Shapiro's p-test), etc. One example of combining IVA with clustering is the
analysis of three-dimensional gene expression data with integrated clustering [28].
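To make the idea of a derived outlyingness attribute more concrete, the following is a minimal sketch of a Mahalanobis-distance measure. It assumes that NumPy is available and that the data are given as rows of a numeric array; the function name `mahalanobis_outlyingness` and the toy data are hypothetical and are not taken from any particular IVA system.

```python
import numpy as np

def mahalanobis_outlyingness(data):
    """Return the Mahalanobis distance of every row of `data` to the sample mean."""
    data = np.asarray(data, dtype=float)
    centered = data - data.mean(axis=0)
    cov = np.cov(data, rowvar=False)
    # The pseudo-inverse keeps the sketch usable if the covariance is singular.
    cov_inv = np.linalg.pinv(np.atleast_2d(cov))
    # d_i^2 = c_i^T * cov_inv * c_i, evaluated for each row c_i.
    squared = np.einsum("ij,jk,ik->i", centered, cov_inv, centered)
    return np.sqrt(np.maximum(squared, 0.0))

# Toy data (assumption): 200 normally distributed points plus one planted outlier.
rng = np.random.default_rng(0)
points = rng.normal(size=(200, 2))
points[0] = [8.0, -8.0]

outlyingness = mahalanobis_outlyingness(points)
most_outlying = int(np.argmax(outlyingness))
```

The resulting `outlyingness` array could then be attached to the data as a new attribute and brushed like any other dimension.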
Advanced brushing mechanisms can be integrated in IVA as an alternative or in
addition to these data derivation approaches. Brushes developed for special purposes
include angular brushing [10] of parallel coordinates to access the slopes of the lines,
or similarity brushing [23, 24], which utilizes a more advanced similarity measure
between data and brush to determine the data items that are selected by a certain
brushing interaction.
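As an illustration of brushing on line slopes, the sketch below selects data items by the angle of their line segment between two adjacent parallel-coordinates axes. It is only a simplified reading of the idea, not the method of [10]: it assumes both axes are normalized to [0, 1] and drawn one unit apart, and the function name `angular_brush` and its parameters are hypothetical.

```python
import numpy as np

def angular_brush(left, right, min_deg, max_deg):
    """Select items whose segment between two adjacent axes has a slope angle
    within [min_deg, max_deg] (degrees, measured against the horizontal)."""
    def normalize(values):
        values = np.asarray(values, dtype=float)
        span = values.max() - values.min()
        if span == 0:
            return np.zeros_like(values)
        return (values - values.min()) / span

    # Assumption: the two axes are one unit apart, so the run of each segment is 1.
    rise = normalize(right) - normalize(left)
    angles = np.degrees(np.arctan2(rise, 1.0))
    return (angles >= min_deg) & (angles <= max_deg)

# Toy data (assumption): four items that go from rising to falling.
left_axis = [0.0, 1.0, 2.0, 3.0]
right_axis = [3.0, 2.0, 1.0, 0.0]

# Brush all items with a clearly negative slope.
selected = angular_brush(left_axis, right_axis, min_deg=-50.0, max_deg=-10.0)
```

Here the brush picks out the two items whose values decrease from the left axis to the right axis.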
In principle, it is possible to design advanced brushes for any of the data aspects
that could otherwise be made accessible (to standard brushing) via the data derivation
mechanism described above. However, the more indirections, in terms of implicitly
considered data derivations, are built into an advanced brush, the greater the
cognitive load of using such a brush. It therefore stands to reason that highly
complicated relations in the data, which can only be accessed through a number of
the concepts described above (some statistics, some dimension reduction, some
outlyingness measure, etc.), are better made available to interactive feature
specification in a step-by-step procedure (a certain sequence of data derivation
steps, for example) than by packing too much into a single advanced
brushing tool.
Figure 15.4 shows an example of a Complex Analysis, in this case an outlier
analysis in a multi-run climate simulation dataset. As part of a coupled
atmosphere-ocean-biosphere simulation model, temperature values in the world's major
oceans, represented by three 2D cross-sections (longitude vs. depth), are analyzed;
they are given over a 500-year period around 6000 BC. The goal of this analysis was
to identify spatiotemporal locations where the simulated temperature values exhibit
large differences (as compared to the main trend) in some simulation runs. Using the
interactive data derivation mechanism, the overall number of outliers per space-time
location was computed first. This step uses a mild univariate outlyingness measure:
all values that lie more than $3/2 \cdot \mathrm{IQR}$ above $q_3$ (the 3rd quartile)
or below $q_1$ (the 1st quartile) count as outliers, with $\mathrm{IQR} = q_3 - q_1$
being the interquartile range. The scatter plot in Fig. 15.4a
identifies all locations according to how many such outliers exist (x-axis) and to
which degree they are large- or small-value outliers (y-axis). A smooth brush was
then used to highlight all locations with a substantial number of outliers, and the
glyph-based visualization in Fig. 15.4b shows these locations emphasized (larger,
less transparent glyphs). In a next step, the analysis was confined to lower-value
outliers. This restriction was achieved by first using the data derivation mechanism
again, to "normalize" the y-axis with respect to its vertical extent per x-location.
This step enables a selection, with a standard rectangular brush, of those outliers
that are mainly lower-value outliers. The scatter plot after loading this new
attribute and the corresponding brush are illustrated in Fig. 15.4c.
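The derivation steps of this example can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation behind Fig. 15.4: it assumes NumPy, a toy array of shape (runs, locations), a y-axis defined as the number of high outliers minus the number of low outliers, a normalization that divides this balance by the outlier count, and a linear ramp as the smooth brush. All function names, variable names, and thresholds are hypothetical.

```python
import numpy as np

def count_outliers(values):
    """Count high and low outliers per location with the 1.5 * IQR rule.

    `values` has shape (n_runs, n_locations); quartiles are taken over the runs.
    """
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75], axis=0)
    iqr = q3 - q1
    high = (values > q3 + 1.5 * iqr).sum(axis=0)
    low = (values < q1 - 1.5 * iqr).sum(axis=0)
    return high, low

def smooth_brush(x, start, full):
    """Degree of interest: 0 at or below `start`, 1 at or above `full`, linear between."""
    x = np.asarray(x, dtype=float)
    return np.clip((x - start) / (full - start), 0.0, 1.0)

# Toy data (assumption): one list of nine simulated temperatures per location.
per_location = [
    [8, 9, 9, 10, 10, 10, 11, 11, 12],   # no outliers
    [0, 1, 9, 9, 10, 10, 11, 11, 12],    # two low-value outliers
    [8, 9, 9, 10, 10, 10, 11, 11, 20],   # one high-value outlier
    [8, 9, 9, 10, 10, 11, 11, 19, 20],   # two high-value outliers
]
runs = np.array(per_location, dtype=float).T   # shape (n_runs, n_locations)

# Step 1: derive the outlier count (x-axis) and the high/low balance (assumed y-axis).
high, low = count_outliers(runs)
count = high + low
balance = high - low

# Step 2: smooth brush on locations with a substantial number of outliers.
doi = smooth_brush(count, start=0, full=2)

# Step 3: normalize the balance by its extent per count, then brush lower values.
normalized = np.divide(balance, count, out=np.zeros(count.shape), where=count > 0)
lower_value = normalized <= -0.5
selected = doi * lower_value
```

Splitting the analysis into separate derived attributes (`count`, `balance`, `normalized`) mirrors the step-by-step procedure argued for above: each attribute can be inspected and brushed on its own before the selections are combined.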