Information Technology Reference
In-Depth Information
3 Results and Discussion
3.1 Clustering Performance
First of all we would need to verify if the GNG method fulfills the requirements
of grouping sub-trajectories. To this end we have build an artificial dataset of
segments shown in Figure (2) and each one was assigned to a group. We define
the accuracy of a clustering algorithm as:
k
n
1
n
I i ( j )
1+ O i ( j )
ρ =
.
(9)
i =1
j =1
where n is the number of clusters, in our case nodes, k is the number of classes,
I i ( j ) is the number of samples of class i that are associated to cluster j and
O i ( j ) is the number of elements that do not belong to class i but are present
in cluster j . This criteria is maximized in the case we have the minimal number
of clusters that only contain samples of one class. One sample is said to belong
to a cluster j if the euclidean distance with this cluster is the minimal among
all the distances with clusters and its value is not higher than a threshold. This
definition is necessary in the presence of outliers, such as those that can be
observed in our example. In our case we have set the threshold value of 30.
0
10
20
30
40
50
60
70
80
90
100
0
20
40
60
80
100
Fig. 2. Artificial problem with different segments distributed in groups and noise
We want to check the influence of two important parameters maxiter and λ .
Results of the proposed example are shown in Table (1), where the algorithm
was executed without the final cluster aggregation proposed in the last step of
the algorithm.
From these results, we can extract important conclusions that must be kept
in mind when the GNG algorithm is executed. When the number of iterations is
very high and λ is very low the algorithm degenerates to every single sample is
Search WWH ::




Custom Search