Figure 6.7 Three-dimensional depiction of a sample result obtained with time-focused trajectory clustering on a data set of synthetic trajectories.
The simplest solution to the problem is the so-called k-nearest neighbors
(kNN) approach: instead of inferring any classification rule, it directly compares
each new trajectory t against the training set and finds the k labeled trajectories
that are closest to t. The most popular label among these neighbors is then
assigned to t. The assumption is that the more similar two trajectories are, the
more likely they belong to the same class. Clearly, everything revolves around
a proper choice of similarity measure, which should be as coherent
as possible with the classification problem at hand. As an example, we can
expect that a similarity function that takes into consideration the acceleration of
objects will discriminate well between vehicle types: the lighter the vehicle, the
more easily it reaches high accelerations. In contrast, a measure based only on
the places visited would likely perform more poorly.
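The kNN scheme above can be sketched in a few lines of Python. This is a minimal illustration, not the book's implementation: the point-wise Euclidean distance assumes equal-length, time-aligned trajectories, whereas real trajectory similarity measures (e.g., DTW or EDR) handle unequal lengths and time shifts; the function and variable names are all illustrative.

```python
import math
from collections import Counter

def euclidean_dist(t1, t2):
    # Toy point-wise distance between two equal-length trajectories,
    # each a list of (x, y) points. Stands in for a proper trajectory
    # similarity measure.
    return math.sqrt(sum((x1 - x2) ** 2 + (y1 - y2) ** 2
                         for (x1, y1), (x2, y2) in zip(t1, t2)))

def knn_classify(t, training_set, k=3, dist=euclidean_dist):
    # Find the k labeled trajectories closest to t and return the
    # most popular label among them.
    neighbors = sorted(training_set, key=lambda item: dist(t, item[0]))[:k]
    labels = [label for _, label in neighbors]
    return Counter(labels).most_common(1)[0][0]

# Toy training set: (trajectory, label) pairs.
training = [
    ([(0, 0), (1, 0), (2, 0)], "car"),
    ([(0, 0), (1, 1), (2, 2)], "bike"),
    ([(0, 1), (1, 1), (2, 1)], "car"),
]
print(knn_classify([(0, 0), (1, 0), (2, 1)], training, k=1))  # → car
```

Note that no model is built: every classification scans the whole training set, which is what makes kNN simple but potentially expensive on large trajectory collections.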
The same idea is also applied in several sampling-based solutions to the
clustering problem: when the data set is too large to process, one approach
consists of randomly sampling a small subset of trajectories and computing
clusters on them. Then, every remaining trajectory is assigned to a cluster (i.e.,
classified) with a kNN approach or by comparing it against the centroid of
each cluster.
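The centroid-based variant of this sampling scheme can be sketched as follows. Everything here is a hypothetical illustration: `cluster_fn` stands in for any clustering algorithm run on the sample, the centroid is a point-wise mean (which assumes equal-length trajectories), and the toy data set and split rule exist only to make the example runnable.

```python
import random

def centroid(cluster):
    # Point-wise mean of equal-length trajectories (a simplifying assumption).
    n = len(cluster)
    return [(sum(t[i][0] for t in cluster) / n,
             sum(t[i][1] for t in cluster) / n)
            for i in range(len(cluster[0]))]

def dist(t1, t2):
    # Toy point-wise distance; stands in for a real similarity measure.
    return sum(((x1 - x2) ** 2 + (y1 - y2) ** 2) ** 0.5
               for (x1, y1), (x2, y2) in zip(t1, t2))

def sample_and_assign(dataset, cluster_fn, sample_size, seed=0):
    # Cluster a random sample, then assign each remaining trajectory
    # to the cluster whose centroid is nearest.
    sample = random.Random(seed).sample(dataset, sample_size)
    clusters = [c for c in cluster_fn(sample) if c]  # drop empty clusters
    centroids = [centroid(c) for c in clusters]
    result = [list(c) for c in clusters]
    for t in dataset:
        if t in sample:
            continue
        best = min(range(len(centroids)), key=lambda i: dist(t, centroids[i]))
        result[best].append(t)
    return result

# Toy data: flat vs. diagonal trajectories; the hypothetical cluster_fn
# simply splits the sample by the final y coordinate.
flat = [[(0, 0.1 * i), (1, 0.1 * i), (2, 0.1 * i)] for i in range(4)]
diag = [[(0, 0.1 * i), (1, 1 + 0.1 * i), (2, 2 + 0.1 * i)] for i in range(4)]
dataset = flat + diag
split = lambda s: [[t for t in s if t[-1][1] < 1],
                   [t for t in s if t[-1][1] >= 1]]
clusters = sample_and_assign(dataset, split, sample_size=4)
```

The expensive clustering step runs only on the sample; the remaining trajectories cost one distance computation per centroid, which is what makes the approach viable on large data sets.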
Approaching the problem from a different viewpoint, each class involved in
the classification problem could be modeled through a probabilistic model that
is fitted to the available trajectories in the class. Then, each new trajectory can be