Mobility Data Mining - Mobility Data


6.3.4 Trajectory Outliers

The general objective of clustering is to fit each object in data into some category

(and discovering the categories is part of the problem). However, sometimes the

analyst is exactly interested in those objects that deviate from the rest of the data

set, and therefore cannot really fit any category. Such objects are called outliers.

Finding an outlier object means to discover some feature or pattern that holds

for the object, and yet is anomalous or at least very rare in the data set. In

this sense, the problem can be properly seen as an (infrequent) pattern discovery

task. The reason for discussing it now is that most outlier detection methods

in the literature actually adopt some clustering procedure, and identify outliers as

those objects that are (or would be) left out of any cluster. Here we provide two

examples.

A basic method for discovering trajectory outliers consists in adopting a

density-based clustering perspective, and therefore computing the number of

neighbors of each trajectory over a reasonably large neighborhood. Then, the

trajectories that have too few neighbors are classified as outliers. As with

density-based clustering, the method is parametric in the distance measure adopted,

and therefore, in principle, any distance between trajectories can be applied.

Alternatively, a set of predefined representative features, such as average speed

and initial position, can be extracted from each trajectory, and any standard

distance over vector data can then be applied.
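
The neighbor-counting idea above can be sketched in a few lines of Python. This is only an illustrative sketch, not the method of any specific paper: the function names, the (t, x, y) sample format, and the parameters eps and min_neighbors are assumptions introduced here. It uses the feature-vector variant (average speed and initial position) with plain Euclidean distance; in practice the features would probably need rescaling so that no single one dominates the distance.

```python
import math

def feature_vector(traj):
    """Map a trajectory to (average speed, initial x, initial y).

    `traj` is assumed to be a list of (t, x, y) samples sorted by time.
    """
    # Travelled length: sum of distances between consecutive positions.
    length = sum(math.dist(a[1:], b[1:]) for a, b in zip(traj, traj[1:]))
    duration = traj[-1][0] - traj[0][0]
    avg_speed = length / duration if duration > 0 else 0.0
    return (avg_speed, traj[0][1], traj[0][2])

def density_outliers(items, distance, eps, min_neighbors):
    """Return the indices of items with fewer than `min_neighbors`
    other items within distance `eps` (hypothetical parameter names)."""
    outliers = []
    for i, a in enumerate(items):
        neighbors = sum(
            1 for j, b in enumerate(items)
            if j != i and distance(a, b) <= eps
        )
        if neighbors < min_neighbors:
            outliers.append(i)
    return outliers

# Four similar trajectories (speed about 1, starting near the origin)
# and one that starts far away and moves ten times faster.
trajectories = [
    [(0, 0.1 * k, 0.0), (1, 0.1 * k + 1, 0.0), (2, 0.1 * k + 2, 0.0)]
    for k in range(4)
]
trajectories.append([(0, 50.0, 50.0), (1, 60.0, 50.0), (2, 70.0, 50.0)])

features = [feature_vector(t) for t in trajectories]
outlier_ids = density_outliers(features, math.dist, eps=1.0, min_neighbors=2)
```

Because the distance is passed in as a function, the same density_outliers routine could be called directly on whole trajectories with any trajectory distance in place of math.dist. The quadratic scan over all pairs is kept for clarity; a spatial index would be the usual choice for large data sets.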

In Section 6.3.2 the TraClass trajectory classification method was presented,

which has the characteristic of working over trajectory segments (obtained by

properly cutting original trajectories) rather than whole trajectories. By

clustering such segments, relevant subtrajectory patterns were extracted and later used

for classification purposes. Following the same idea, outliers can be found within

trajectory segments, therefore focusing on single parts of a trajectory that behave

in an anomalous way. In particular, each trajectory segment is compared against

the representative segment of each cluster, and if no representative segment fits

well enough, the input trajectory segment is classified as an outlier.
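
The comparison against representative segments can be sketched as follows. This is a minimal sketch under stated assumptions: segments are taken to be pairs of 2D endpoints, and segment_distance is a simple stand-in (the mean distance between corresponding endpoints), not the distance used by the original method. The threshold max_dist and all function names are hypothetical.

```python
import math

def segment_distance(s, r):
    """Stand-in segment distance: mean of the distances between the
    corresponding start points and end points of two segments."""
    return (math.dist(s[0], r[0]) + math.dist(s[1], r[1])) / 2

def segment_outliers(segments, representatives, max_dist,
                     distance=segment_distance):
    """Return the segments for which no representative segment lies
    within `max_dist`, i.e. no cluster fits well enough."""
    return [
        s for s in segments
        if all(distance(s, r) > max_dist for r in representatives)
    ]

# Two cluster representatives: one heading east, one heading north.
representatives = [((0, 0), (10, 0)), ((0, 0), (0, 10))]

segments = [
    ((0, 1), (10, 1)),      # close to the eastbound representative
    ((1, 0), (1, 10)),      # close to the northbound representative
    ((20, 20), (30, 30)),   # far from both representatives
]

anomalous = segment_outliers(segments, representatives, max_dist=2.0)
```

With these example values only the third segment is returned. Note that with an empty list of representatives every segment is reported as an outlier, which is consistent with the rule "no representative fits".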

6.4 Conclusions

We conclude this chapter with a few notes on the topics presented and some of

the open questions in mobility data mining research.

Mobility data mining, like many other instantiations of the general data mining

paradigm in specific contexts, brings with it the general categorization of

problems and methods it inherited from standard data mining. In particular, the

three main categories - frequent patterns, clustering, and classification - appear

again. However, some specificities of trajectory data emerged and stimulated

the development of new approaches. In particular, the complexity of the data,
