6.3.4 Trajectory Outliers
The general objective of clustering is to fit each object in data into some category
(and discovering the categories is part of the problem). However, sometimes the
analyst is exactly interested in those objects that deviate from the rest of the data
set, and therefore cannot really fit any category. Such objects are called outliers.
Finding an outlier object means to discover some feature or pattern that holds
for the object, and yet is anomalous or at least very rare in the data set. In
this sense, the problem can be properly seen as an (infrequent) pattern discovery
task. The reason for discussing it now is that most outlier detection methods
in the literature actually adopt some clustering procedure, and identify outliers as
those objects that are (or would be) left out of any cluster. Here we provide two
examples.
A basic method for discovering trajectory outliers consists in adopting a
density-based clustering perspective, and therefore computing the number of
neighbors of each trajectory over a reasonably large neighborhood. Then, the
trajectories that have too few neighbors are classified as outliers. As with
density-based clustering, the method is parametric in the distance measure adopted,
and therefore, in principle, any distance between trajectories can be applied.
Alternatively, a set of predefined representative features, such as average speed
and initial position, can be extracted from each trajectory, and any standard
distance over vector data can then be applied.
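The neighbor-counting idea can be illustrated with a minimal Python sketch. It is not taken from any specific published method: the function names, the two example distances, the unit time step assumed when computing speed, and the values of eps and min_neighbors are all assumptions chosen for illustration. Trajectories are represented as lists of (x, y) points.

```python
import math

def mean_point_distance(a, b):
    """Hypothetical trajectory distance: mean gap between same-index points."""
    n = min(len(a), len(b))
    return sum(math.dist(p, q) for p, q in zip(a, b)) / n

def features(t):
    """Hypothetical features: (average speed, initial x, initial y).
    Speed assumes one time unit between consecutive points."""
    steps = [math.dist(p, q) for p, q in zip(t, t[1:])]
    return (sum(steps) / len(steps), t[0][0], t[0][1])

def feature_distance(a, b):
    """Euclidean distance between the feature vectors of two trajectories."""
    return math.dist(features(a), features(b))

def neighbor_count_outliers(trajectories, distance, eps, min_neighbors):
    """Indices of trajectories with fewer than min_neighbors within eps."""
    outliers = []
    for i, t in enumerate(trajectories):
        count = sum(
            1
            for j, u in enumerate(trajectories)
            if j != i and distance(t, u) <= eps
        )
        if count < min_neighbors:
            outliers.append(i)
    return outliers

# Three nearly parallel trajectories and one that is far from the rest.
trajectories = [
    [(0, 0), (1, 0), (2, 0)],
    [(0, 0.1), (1, 0.1), (2, 0.1)],
    [(0, -0.1), (1, -0.1), (2, -0.1)],
    [(0, 5), (1, 6), (2, 7)],
]
by_shape = neighbor_count_outliers(trajectories, mean_point_distance, 0.5, 2)
by_features = neighbor_count_outliers(trajectories, feature_distance, 0.5, 2)
```

Because the distance is passed in as a parameter, the same counting procedure works with either a distance between whole trajectories or a distance between extracted feature vectors. The pairwise loop is quadratic in the number of trajectories, which is acceptable for a sketch but would likely need a spatial index on larger data sets.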
In Section 6.3.2 the TraClass trajectory classification method was presented,
which has the characteristic of working over trajectory segments (obtained by
properly cutting original trajectories) rather than whole trajectories. By cluster-
ing such segments, relevant subtrajectory patterns were extracted and later used
for classification purposes. Following the same idea, outliers can be found within
trajectory segments, therefore focusing on single parts of a trajectory that behave
in an anomalous way. In particular, each trajectory segment is compared against
the representative segment of each cluster, and if no representative segment fits
well enough, the input trajectory segment is classified as an outlier.
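The comparison against representative segments can be sketched as follows. This is only an illustration under stated assumptions: the cut into consecutive point pairs stands in for the proper partitioning step mentioned above, segment_distance is a simple stand-in rather than the measure used by the actual method, and all names and the max_distance threshold are hypothetical.

```python
import math

def split_into_segments(trajectory):
    """Naive cut (an assumption): one segment per pair of consecutive points."""
    return list(zip(trajectory, trajectory[1:]))

def segment_distance(s, r):
    """Stand-in distance: mean gap between matching segment endpoints."""
    return (math.dist(s[0], r[0]) + math.dist(s[1], r[1])) / 2

def outlying_segments(trajectory, representatives, max_distance):
    """Segments whose closest representative is farther than max_distance.
    Expects at least one representative segment."""
    return [
        s
        for s in split_into_segments(trajectory)
        if min(segment_distance(s, r) for r in representatives) > max_distance
    ]

# Hypothetical representative segments, one per cluster.
representatives = [((0, 0), (1, 0)), ((1, 0), (2, 0))]

# The first part of this trajectory follows a cluster, the second part deviates.
trajectory = [(0, 0.1), (1, 0.1), (2, 3)]
flagged = outlying_segments(trajectory, representatives, max_distance=0.5)
```

Here only the second segment is flagged, which shows the point of working at segment level: a trajectory can be ordinary for most of its length and still contain one anomalous part.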
6.4 Conclusions
We conclude this chapter with a few notes on the topics presented and some of
the open questions in mobility data mining research.
Mobility data mining, like many other instantiations of the general data mining
paradigm in specific contexts, brings with it the general categorization of
problems and methods inherited from standard data mining. In particular, the
three main categories - frequent patterns, clustering, and classification - appear
again. However, some specificities of trajectory data emerged and stimulated
the development of new approaches. In particular, the complexity of the data,