technology. But the availability of heterogeneous data requires not only the
mapping of database schemata but also the cleaning and harmonization of
uncertain and missing data in the volumes of heterogeneous data. Modern
applications require such intelligent data fusion to be feasible in near real-time and
as automatically as possible [32]. New forms of information sources such as data
streams [11], sensor networks [30] or automatic extraction of information from
large document collections (e.g., text, HTML) result in a difficult data analysis
problem, the support of which is currently a focus of database research [43].
The relationship between Data Management, Data Analysis and Visualization
runs in both directions: Data Management techniques increasingly rely on
intelligent data analysis techniques, and also on interaction and visualization,
to arrive at optimal results. Conversely, modern database systems
provide the input data sources which are to be visually analyzed.
3.3 Data Analysis
Data Analysis (also known as Data Mining or Knowledge Discovery) researches
methods to automatically extract valuable information from raw data by means
of automatic analysis algorithms [29,16,31]. Approaches developed in this area
can best be described by the analysis tasks they address. One prominent task
is supervised learning from examples: based on a set of training samples,
deterministic or probabilistic algorithms are used to learn models for the
classification (or prediction) of previously unseen data samples [13]. A huge number
of algorithms have been developed to this end, such as Decision Trees, Support
Vector Machines, Neural Networks, and so on. A second prominent analysis task is
that of cluster analysis [18,19], which aims to extract structure from data with-
out prior knowledge being available. Solutions in this class are employed to au-
tomatically group data instances into classes based on mutual similarity, and to
identify outliers in noisy data during data preprocessing for subsequent analysis
steps. Further data analysis tasks include association rule mining
(analysis of co-occurrence of data items) and dimensionality reduction. While
data analysis was initially developed for structured data, recent research also
aims at analyzing semi-structured and complex data types such as web documents
or multimedia data [34].
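As an illustration of supervised learning from examples, the following is a minimal sketch of a one-level decision tree (a "decision stump") on a single numeric feature. It is not taken from the cited literature; the function names, the training data and the labels "small" and "large" are assumptions chosen only for this example.

```python
from collections import Counter


def majority(labels):
    """Return the most common label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]


def train_stump(samples, labels):
    """Learn a one-level decision tree on one numeric feature.

    Every midpoint between consecutive distinct feature values is tried
    as a threshold; the threshold with the fewest training errors wins.
    """
    values = sorted(set(samples))
    best = None
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in zip(samples, labels) if x <= threshold]
        right = [y for x, y in zip(samples, labels) if x > threshold]
        left_label, right_label = majority(left), majority(right)
        errors = (sum(y != left_label for y in left)
                  + sum(y != right_label for y in right))
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    if best is None:
        raise ValueError("need at least two distinct feature values")
    return best[1], best[2], best[3]


def predict(model, x):
    """Classify a previously unseen sample with the learned stump."""
    threshold, left_label, right_label = model
    return left_label if x <= threshold else right_label


# Hypothetical training samples and class labels.
train_x = [1.0, 1.5, 2.0, 6.0, 6.5, 7.0]
train_y = ["small", "small", "small", "large", "large", "large"]

model = train_stump(train_x, train_y)
print(model)                # learned (threshold, left label, right label)
print(predict(model, 3.0))  # classification of an unseen sample
```

The learned model is a single threshold with one class label on each side, which is the simplest case of the abstract model representations that practical algorithms such as full Decision Trees produce.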
It has recently been recognized that visualization and interaction are highly
beneficial in arriving at optimal analysis results [9]. In almost all data analysis
algorithms a variety of parameters needs to be specified, a problem which is
usually not trivial and often needs supervision by a human expert. Visualization
is also a suitable means for appropriately communicating the results of the
automatic analysis, which are often given in an abstract representation, e.g., a
decision tree. Visual Data Mining methods [24] try to achieve exactly this.
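To make the parameter problem concrete, here is a minimal sketch using k-means clustering, picked here purely as an example of a cluster analysis algorithm; the text does not single it out. The number of clusters k must be set by the user. The data, the function name and the deterministic initialization are assumptions made for illustration. The sketch runs the algorithm for several values of k and reports the within-cluster sum of squared errors, a quantity that an analyst would typically plot and inspect visually before settling on k.

```python
def kmeans_1d(points, k, iterations=20):
    """Run a simple 1-D k-means and return (centroids, sum of squared errors).

    Initialization is deterministic (k evenly spaced points from the sorted
    data), an assumption made here so that the example is reproducible.
    """
    data = sorted(points)
    centroids = [data[(2 * i + 1) * len(data) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in data:
            nearest = min(range(k), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster; keep it if empty.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    sse = sum(min((p - c) ** 2 for c in centroids) for p in data)
    return centroids, sse


# Hypothetical data with three visually obvious groups.
points = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8, 9.0, 9.2, 8.8]

sse_by_k = {}
for k in (1, 2, 3):
    _, sse_by_k[k] = kmeans_1d(points, k)
    print(f"k={k}: SSE={sse_by_k[k]:.2f}")
```

The error keeps shrinking as k grows, so the number alone does not say which k is appropriate. Judging where the decrease levels off is the kind of decision that usually falls to a human expert looking at the results.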
3.4 Perception and Cognition
Effective utilization of the powerful human perception system for visual analysis
tasks requires the careful design of appropriate human-computer interfaces. Psy-
chology, Sociology, Neurosciences and Design each contribute valuable results to