data mining: A step of the knowledge discovery process that analyzes large amounts of data
to identify unexpected or unknown patterns that might be of value to an application.
data postprocessing: A step of the knowledge discovery process that is applied after patterns
are extracted by the data mining algorithms. This step typically includes pattern evaluation,
interpretation, and visualization.
data preprocessing: A step of the knowledge discovery process where data are prepared before
data mining algorithms can be applied. This step usually includes data cleaning,
where noise in data is reduced, and data preparation, where data are formatted to be
mined.
data warehouse: A data repository specifically designed to support the decision-making
process. In a data warehouse the information is conceptually represented as a cube containing
facts and measures organized according to dimensions and hierarchies.
density map: A map that shows the distribution of a phenomenon within the space covered
by the map. For example, the distribution of moving objects in a given area may be
represented as the number of objects per area unit (i.e., density). Densities are often
represented by color-coding, where brighter colors correspond to higher densities.
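The objects-per-area-unit computation described above can be sketched as follows. This is a minimal illustration, assuming planar coordinates and a square grid; the function names `density_grid` and `density_per_area` and the sample points are hypothetical, not taken from the source.

```python
from collections import Counter

def density_grid(points, cell_size):
    """Count points per square grid cell; keys are (column, row) cell indices."""
    counts = Counter()
    for x, y in points:
        counts[(int(x // cell_size), int(y // cell_size))] += 1
    return counts

def density_per_area(counts, cell_size):
    """Convert per-cell counts into objects per area unit."""
    area = cell_size * cell_size
    return {cell: n / area for cell, n in counts.items()}

# Hypothetical positions of four moving objects.
points = [(0.5, 0.5), (1.2, 0.3), (1.8, 0.9), (1.1, 1.7)]
counts = density_grid(points, cell_size=1.0)
densities = density_per_area(counts, cell_size=1.0)
```

A rendering step would then map each density value to a color, with brighter colors for larger values.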
dimension: In data warehouses, a dimension materializes a specific viewpoint for analyzing
the facts. For example, space, time, and product are frequently used dimensions. Dimensions
may be composed of hierarchies of levels. For example, a time dimension may be composed of
levels hour, day, week, month, and year.
discrete model: A model for representing time-dependent data types in a finite representation.
For example, a time-dependent point value can be represented as a polyline in the (x, y, t)
space. This is to be contrasted with abstract model.
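As an illustration of the polyline representation, the sketch below stores a time-dependent point as a finite list of (t, x, y) samples and recovers intermediate positions by linear interpolation. The function name `position_at`, the sample values, and the choice of linear interpolation between samples are assumptions made for this example.

```python
from bisect import bisect_right

def position_at(samples, t):
    """Return the (x, y) position at time t, or None outside the sampled period.

    samples: list of (t, x, y) tuples sorted by strictly increasing t.
    """
    if not samples:
        return None
    times = [s[0] for s in samples]
    if t < times[0] or t > times[-1]:
        return None
    i = bisect_right(times, t)
    if i == len(samples):          # t equals the last sample time
        return samples[-1][1:]
    t0, x0, y0 = samples[i - 1]
    t1, x1, y1 = samples[i]
    f = (t - t0) / (t1 - t0)       # fraction of the segment already travelled
    return (x0 + f * (x1 - x0), y0 + f * (y1 - y0))

# Hypothetical polyline with three vertices in (x, y, t) space.
samples = [(0, 0, 0), (10, 10, 0), (20, 10, 20)]
```

Here `position_at(samples, 5)` falls halfway along the first segment, while a time outside the interval from 0 to 20 yields `None`.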
episode: A maximal subsequence of a trajectory such that all its spatio-temporal positions
comply with a given predicate. Examples include stop and move episodes and transportation
means (walk, bus, metro, train, car) episodes.
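The "maximal subsequence complying with a predicate" idea can be sketched by grouping consecutive positions on the predicate's truth value. The `episodes` function, the (t, x, y, speed) tuple layout, and the 0.5 speed threshold for stops are hypothetical choices for illustration only.

```python
from itertools import groupby

def episodes(trajectory, predicate):
    """Return the maximal runs of consecutive positions that satisfy predicate."""
    result = []
    # groupby splits the sequence wherever the predicate's truth value changes,
    # so each group is maximal by construction.
    for holds, group in groupby(trajectory, key=lambda p: bool(predicate(p))):
        if holds:
            result.append(list(group))
    return result

# Hypothetical positions as (t, x, y, speed).
trajectory = [
    (0, 0.0, 0.0, 0.1),
    (1, 0.0, 0.0, 0.2),
    (2, 1.0, 0.0, 3.0),
    (3, 2.0, 0.0, 2.5),
    (4, 2.0, 0.0, 0.0),
]

def is_stop(position):
    return position[3] < 0.5

stops = episodes(trajectory, is_stop)
moves = episodes(trajectory, lambda p: not is_stop(p))
```

With this data the trajectory decomposes into two stop episodes separated by one move episode.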
extraction-transformation-loading (ETL): The process that populates a data warehouse
from one or several data sources. It is a three-step process that extracts data from the
data sources, transforms the data, and loads the data into a data warehouse. An ETL
process also refreshes the data warehouse at a specified frequency in order to keep it up to
date.
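The three steps can be sketched with Python's standard library, using an in-memory SQLite database as a stand-in for the warehouse. The `sales` table, the column names, and the cleaning rules are assumptions for this example; periodic refreshing is not handled here.

```python
import csv
import io
import sqlite3

def extract(source_text):
    """Extract: read rows from a CSV data source."""
    return list(csv.DictReader(io.StringIO(source_text)))

def transform(rows):
    """Transform: drop rows without an amount and normalize city names."""
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue
        cleaned.append((row["city"].strip().title(), float(row["amount"])))
    return cleaned

def load(conn, rows):
    """Load: write the transformed rows into the warehouse table."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (city TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    conn.commit()

# Hypothetical source data with inconsistent formatting and a missing value.
source = "city,amount\n brussels ,10.5\nPISA,\npisa,4\n"
conn = sqlite3.connect(":memory:")
load(conn, transform(extract(source)))
```

Keeping the three steps as separate functions makes it possible to swap the source or the target without touching the transformation logic.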
flow: An aggregate of multiple movements all starting from the same location and ending at
the same location. Examples include the count of commuting people or the amount of transported
goods. A flow can be seen as a vector connecting two locations and associated with
one or more aggregate attributes derived from the individual movements that have been
aggregated.
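A flow can be derived from individual movements by grouping them on their (start, end) pair and computing aggregate attributes per group. In the sketch below, the `aggregate_flows` function, the attribute names `count` and `goods`, and the sample movements are hypothetical.

```python
from collections import defaultdict

def aggregate_flows(movements):
    """Group movements by (origin, destination) and aggregate their attributes."""
    flows = defaultdict(lambda: {"count": 0, "goods": 0.0})
    for origin, destination, goods in movements:
        flow = flows[(origin, destination)]
        flow["count"] += 1        # number of individual movements
        flow["goods"] += goods    # total amount transported
    return dict(flows)

# Hypothetical movements as (origin, destination, amount of goods).
movements = [("A", "B", 2.0), ("A", "B", 3.0), ("B", "A", 1.0)]
flows = aggregate_flows(movements)
```

Note that the key is directed: movements from A to B and from B to A form two distinct flows.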
flow map: A cartographic representation of flows shown in a geographic space. Typically,
flows are represented by straight or curved lines connecting the start and end locations
with the thickness proportional to the value of the aggregate attributes. Alternatively, the
attribute values can be represented by varying levels of transparency or by color-coding.
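The proportional-thickness rule can be sketched as a simple linear scaling. The `line_widths` function and the `max_width` parameter are assumptions for this example; the actual drawing of lines on a map is left to whichever plotting tool is used.

```python
def line_widths(flow_values, max_width=10.0):
    """Scale each flow's aggregate value to a line width proportional to it.

    The largest value is drawn at max_width; all others scale linearly.
    """
    largest = max(flow_values.values(), default=0)
    if largest <= 0:
        return {key: 0.0 for key in flow_values}
    return {key: max_width * value / largest for key, value in flow_values.items()}

# Hypothetical aggregate counts per (start, end) pair.
widths = line_widths({("A", "B"): 50, ("B", "C"): 25})
```

The same scaled value could instead drive transparency or a color ramp, as the definition notes.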
frequent pattern: In data mining, a pattern that occurs frequently in a data set.
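One common way to make "occurs frequently" concrete is a minimum support threshold: a pattern is frequent if it appears in at least that many records. The sketch below counts item pairs by brute force, which is only practical for small data sets; the `frequent_itemsets` function, the threshold, and the sample transactions are assumptions for this example.

```python
from collections import Counter
from itertools import combinations

def frequent_itemsets(transactions, min_support, size=2):
    """Return itemsets of the given size that occur in at least min_support transactions."""
    counts = Counter()
    for transaction in transactions:
        # Sorting gives each itemset one canonical tuple form.
        for itemset in combinations(sorted(set(transaction)), size):
            counts[itemset] += 1
    return {itemset: n for itemset, n in counts.items() if n >= min_support}

# Hypothetical sets of places visited per day.
transactions = [
    ["home", "work", "gym"],
    ["home", "work"],
    ["work", "gym"],
    ["home", "work", "shop"],
]
frequent = frequent_itemsets(transactions, min_support=2)
```

With a threshold of 2, only the pairs that occur in at least two of the four transactions are retained.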
fuzzy spatial object: A spatial object whose spatial extent is represented by a membership
function indicating the membership degree of each point in the extent of the object. The
uncertainty is due to imprecision of the borderline of the spatial object. For example, it is
not possible to define with certainty the line separating a mountain from the valley beneath.
This is to be contrasted with probabilistic and vague spatial objects.
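A membership function can be sketched for a simple circular object: points within a core radius belong fully to the object, points beyond an outer radius do not belong at all, and membership falls off linearly in between. The circular shape, the linear fall-off, and the name `membership` are assumptions chosen for illustration; the code assumes `core_radius < outer_radius`.

```python
import math

def membership(point, center, core_radius, outer_radius):
    """Return the membership degree in [0, 1] of point in a fuzzy circular object."""
    d = math.dist(point, center)
    if d <= core_radius:
        return 1.0     # certainly inside the object
    if d >= outer_radius:
        return 0.0     # certainly outside the object
    # Inside the imprecise borderline zone: linear fall-off from 1 to 0.
    return (outer_radius - d) / (outer_radius - core_radius)

# Hypothetical "mountain" centered at the origin.
inside = membership((0, 0), (0, 0), core_radius=1, outer_radius=3)
border = membership((2, 0), (0, 0), core_radius=1, outer_radius=3)
outside = membership((5, 0), (0, 0), core_radius=1, outer_radius=3)
```

The intermediate value for the border point reflects the imprecision of the borderline rather than a probability of the object existing there.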
hierarchy: In data warehouses, a set of hierarchically correlated levels of a dimension that
define the desired aggregation paths for the measures.
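An aggregation path along a time hierarchy can be sketched as a roll-up of a measure from the day level to coarser levels. The `LEVELS` mapping, the `roll_up` function, the choice of sum as the aggregate, and the sample facts are hypothetical; only three levels are shown.

```python
from collections import defaultdict
from datetime import date

# Each level maps a day to the member of that level it belongs to.
LEVELS = {
    "day": lambda d: (d.year, d.month, d.day),
    "month": lambda d: (d.year, d.month),
    "year": lambda d: (d.year,),
}

def roll_up(facts, level):
    """Sum the measure of each fact per member of the requested level."""
    key = LEVELS[level]
    totals = defaultdict(int)
    for day, measure in facts:
        totals[key(day)] += measure
    return dict(totals)

# Hypothetical facts as (day, measure) pairs.
facts = [
    (date(2013, 1, 5), 10),
    (date(2013, 1, 20), 5),
    (date(2013, 2, 1), 7),
    (date(2014, 1, 1), 3),
]
by_month = roll_up(facts, "month")
by_year = roll_up(facts, "year")
```

Because every day belongs to exactly one month and every month to exactly one year, the totals at each level add up to the same grand total.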