6.4.4.3 The Clustering Process
With the LEGClust dissimilarity matrix, one can use any adequate algorithm
to cluster the data. The original LEGClust algorithm, described in [204],
uses a hierarchical, agglomerative approach based on layered entropic
(unweighted) subgraphs built with the information given by the entropic
proximity matrix (EPM). Examples of such subgraphs were shown in Figs. 6.26b
and 6.27. Each layer subgraph is built by connecting each object with the
corresponding object of that layer (column) of the EPM. Using these
subgraphs, one can build the clusters hierarchically by joining together
those clusters, corresponding to the layer subgraphs, that have a predefined
number of connections between them.
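The construction of a layer subgraph can be sketched in a few lines of Python. This is only an illustration, not the implementation of [204]: the function name `layer_edges`, the 0-based indexing, and the toy EPM are assumptions made for the example, and the toy values are not taken from any table in the text.

```python
def layer_edges(epm, layer):
    """Return the oriented edges (i, j) of one layer subgraph.

    Assumption: epm[i][layer] holds the 0-based index of the object
    that object i is connected to in that layer (column) of the EPM.
    """
    return [(i, row[layer]) for i, row in enumerate(epm)]


# Hypothetical 4-object EPM with 2 layers (illustrative values only).
toy_epm = [
    [1, 2],
    [0, 2],
    [3, 1],
    [2, 1],
]

first_layer = layer_edges(toy_epm, 0)   # one edge per object, layer 1
second_layer = layer_edges(toy_epm, 1)  # one edge per object, layer 2
```

Each layer contributes exactly one outgoing edge per object, which is why the subgraphs can be kept unweighted: only the presence of a connection matters, not its dissimilarity value.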
Example 6.9. To illustrate the clustering procedure we use the simple
two-dimensional dataset presented in Fig. 6.29.
This dataset consists of 15 points apparently constituting 2 clusters, with
10 and 5 points respectively.
Fig. 6.29 A simple two-dimensional dataset to illustrate the clustering
procedure.
Table 6.19 presents the EPM built from the entropic dissimilarity matrix
of Table 6.18.
The EPM defines the connections between each point and those points in
each layer: point 1 is connected with point 2 in the first layer, with point 5 in
the second layer, with point 4 in the third layer and so on (see Table 6.19).
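As a rough sketch of how a proximity matrix of this kind could be derived, the code below ranks, for every point, all other points by increasing dissimilarity. That ranking rule is an assumption of this example (the passage only says the EPM is built from the entropic dissimilarity matrix), and the function name `proximity_matrix` and the toy matrix are hypothetical, not the values of Tables 6.18 or 6.19.

```python
def proximity_matrix(dissim):
    """Rank, for each point, the other points by increasing dissimilarity.

    Assumption: column k of the result (layer k+1) holds the index of
    the (k+1)-th least dissimilar point. Ties keep index order because
    Python's sort is stable.
    """
    n = len(dissim)
    epm = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        others.sort(key=lambda j: dissim[i][j])
        epm.append(others)
    return epm


# Hypothetical 4-point dissimilarity matrix (need not be symmetric).
toy_dissim = [
    [0.0, 0.2, 0.9, 0.5],
    [0.3, 0.0, 0.8, 0.6],
    [0.9, 0.7, 0.0, 0.1],
    [0.6, 0.8, 0.2, 0.0],
]

toy_epm = proximity_matrix(toy_dissim)
```

Reading a row of the result from left to right gives the same kind of statement as in the text: the first entry is the first-layer connection of that point, the second entry its second-layer connection, and so on.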
The clustering process starts by defining the elementary clusters obtained
by connecting, with an oriented edge, each point with the corresponding point
of the first layer (Fig. 6.30a). There are 4 elementary clusters in our example.
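One way to read the first step is that the elementary clusters are the groups of points linked together by the first-layer edges, with edge orientation ignored when grouping. The sketch below follows that reading using a small union-find; the function name `elementary_clusters` and the toy first-layer column are hypothetical and do not reproduce the 15-point example.

```python
def elementary_clusters(first_layer):
    """Group points linked by first-layer edges, ignoring orientation.

    Assumption: first_layer[i] is the 0-based index of the point that
    point i is connected to in the first layer of the EPM.
    """
    parent = list(range(len(first_layer)))

    def find(x):
        # Follow parent links to the root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Merge the two groups joined by each edge i -> j.
    for i, j in enumerate(first_layer):
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_j] = root_i

    groups = {}
    for i in range(len(first_layer)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical first-layer column for 6 points (illustrative only).
toy_first_layer = [1, 0, 1, 4, 3, 4]
clusters = elementary_clusters(toy_first_layer)
```

Because every point has exactly one outgoing first-layer edge, no point is left isolated: each elementary cluster contains at least two points.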
The second step of the algorithm connects, with an oriented edge, each
point with the corresponding point of the second layer (Fig. 6.30b). In order
to build the second step clusters we apply a rule based on the number of
connections to join each pair of clusters. We can use the simple rules of, a),
 