Beyond Querying - User-Centered Data Management


coordinates or scatterplot matrices (described in the next section) or on variations of basic data

dimension reduction techniques, e.g., self-organizing maps (Kohonen, T., 2001), principal component

analysis (Eick, S., 2000), and multidimensional scaling (Cox and Cox, 1994). The main idea is either

to select a subset of the original dimensions that best represent some data feature (e.g., clustering)

or to compute new, synthetic dimensions. However, it is not always clear which technique best

suits the data structure and task goals, and the relationship between the original data

and the reduced data may not be intuitive. A comprehensive summary of data reduction techniques

can be found in (Barbará et al., 1997).
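
As an illustration of computing a new, synthetic dimension, the following minimal sketch approximates the first principal component of a small data set by power iteration on the covariance matrix. It uses only the Python standard library; the data values and all function names are hypothetical and chosen for illustration, and a production system would more likely rely on a numerical library.

```python
import math

def center(rows):
    """Subtract the per-column mean so every attribute has mean zero."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]

def covariance(centered_rows):
    """Sample covariance matrix of rows that are already centered."""
    n, d = len(centered_rows), len(centered_rows[0])
    return [[sum(r[i] * r[j] for r in centered_rows) / (n - 1)
             for j in range(d)] for i in range(d)]

def first_component(cov, iterations=200):
    """Approximate the dominant eigenvector of cov by power iteration."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v

def project(rows, direction):
    """Reduce each centered row to one coordinate along direction."""
    return [sum(a * b for a, b in zip(row, direction)) for row in center(rows)]

# Hypothetical data: the second attribute is roughly twice the first,
# while the third attribute barely varies.
data = [[1.0, 2.0, 0.5], [2.0, 4.1, 0.4], [3.0, 6.0, 0.6], [4.0, 7.9, 0.5]]
pc1 = first_component(covariance(center(data)))
scores = project(data, pc1)
```

Each entry of `scores` is a weighted combination of all three original attributes, which is why the reduced data can be harder to interpret than a simple subset of the original dimensions.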

When the data allow for arranging attributes through hierarchies, it is possible to reuse

some results that come from OLAP (on-line analytical processing) applications. The term

OLAP (Codd et al., 1993) refers to end-user applications for interactive exploration of large

multidimensional data sets. OLAP applications rely on a multidimensional data model designed for exploring

the data from different points of view through so-called data cubes (or data hypercubes), i.e.,

measures arranged through a set of descriptive categories, called dimensions (e.g., sales by city,

department, and week). Hierarchies are defined on dimensions (e.g., week → month → year) to enable

additional aggregation levels. A data cube may hold millions of entries, characterized by tens of

dimensions. The challenges come both from devising mechanisms that ensure interactivity (e.g.,

precomputing and storing different levels of the hierarchies to speed up the interaction, reducing

the size of the data in different ways, as described in the next subsection, or sacrificing precision for speed) and from

the usability of the system: to gain insight into such huge and complex data, it is necessary to project

the hypercube onto two-dimensional or three-dimensional spaces, thus requiring long and sometimes

frustrating explorations. Such approaches can be usefully adopted for reducing data dimensionality

in Infovis applications.
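
The data cube idea described above can be sketched in a few lines. The example below stores a sales measure along three dimensions (city, department, week) and precomputes several aggregation levels, including a roll-up along the week → month hierarchy. The records, the four-weeks-per-month rule, and the function names are assumptions made for illustration only; real OLAP engines use far more elaborate storage and indexing.

```python
from collections import defaultdict

# Hypothetical fact records: (city, department, week, sales)
facts = [
    ("Rome", "Toys", 1, 120.0),
    ("Rome", "Toys", 2, 80.0),
    ("Rome", "Books", 5, 50.0),
    ("Milan", "Toys", 1, 200.0),
    ("Milan", "Books", 6, 70.0),
]

def week_to_month(week):
    """Illustrative hierarchy step: treat every four weeks as one month."""
    return (week - 1) // 4 + 1

def roll_up(records, key):
    """Sum the sales measure over all records that share key(record)."""
    totals = defaultdict(float)
    for record in records:
        totals[key(record)] += record[3]
    return dict(totals)

# Precompute several aggregation levels once, so that later
# interactive queries become simple dictionary lookups.
by_city_week = roll_up(facts, lambda r: (r[0], r[2]))
by_city_month = roll_up(facts, lambda r: (r[0], week_to_month(r[2])))
by_city = roll_up(facts, lambda r: r[0])
```

Storing `by_city_month` and `by_city` alongside the detailed records trades memory for response time, which is the precomputation strategy mentioned in the text.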

In summary, the data extraction and elaboration phase, although often neglected in Infovis

applications and Infovis textbooks, is a crucial step that can strongly influence the quality of the

overall visualization process.

3.1.2 DATA REPRESENTATION

Data representation corresponds to encoding data values and data relationships in an internal visual

structure. In this phase, the designer is not concerned with the real screen size and capabilities; she

has at her disposal, in principle, an ideal space with no size or resolution limits; mapping

this ideal space onto the final device is the goal of the presentation phase.
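
The separation between the two phases can be illustrated with a minimal sketch: a representation step places values in an unbounded ideal coordinate space, and a separate presentation step maps those coordinates onto a pixel grid of a given size. The function names, the linear scaling, and the screen dimensions are assumptions made for this example, not a method prescribed by the text.

```python
def represent(values):
    """Representation: place each value on an unbounded ideal plane,
    using its position in the sequence as x and the value itself as y."""
    return [(float(i), float(v)) for i, v in enumerate(values)]

def present(points, width, height):
    """Presentation: linearly map ideal coordinates onto a pixel grid."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)

    def scale(v, lo, hi, size):
        if hi == lo:
            return (size - 1) // 2  # degenerate range: center the point
        return round((v - lo) / (hi - lo) * (size - 1))

    # Screen rows grow downward, so the vertical axis is flipped.
    return [(scale(x, x_min, x_max, width),
             height - 1 - scale(y, y_min, y_max, height))
            for x, y in points]

ideal = represent([10.0, 20.0, 15.0])
pixels = present(ideal, width=101, height=51)
```

Only `present` depends on the device, so the same ideal layout can be reused for displays of different sizes.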

Most Infovis research in recent decades has focused on designing suitable representations

according to data characteristics, user tasks, and users' perceptual and cognitive capabilities. Here

we report the main results, considering the most common situations and focusing on univariate,

bivariate, and multivariate data (classified by the number of attributes that the visual representation has to

encode).

The simplest case we can consider is the encoding of a single value, e.g., the temperature of an

engine or the altitude of a plane. Even if this activity seems quite straightforward, some problems
