Fig. 16.3. The three-layers context space of the new risky cluster candidate $C_p$
Fig. 16.4. The three-layers context space of the new non-risky cluster candidate $C_q$
Inter layer. Clusters within this layer also reduce CR, as they should not
lead to a quick drop in VI. For this, they have to be distant enough from $C_p$
and therefore unlikely to be merged with $C_p$ in the next iterations. Further,
keeping them outside would help to improve (or at least not to deteriorate)
the global inter-cluster separation. This layer is delimited by a first threshold
$t_2 = \Delta(K\text{-}NN(C_p))$. We define $\Delta(K\text{-}NN(C_p))$ as the average pairwise
inter-cluster distance between the K-NN of $C_p$, reduced by the standard deviation
of its homologous values obtained following the previous mergings. The average
separation between the clusters surrounding $C_p$ gives a hint on the minimum
inter-cluster distance required to improve the local inter-cluster separation around
$C_p$, which will most likely improve the global inter-cluster separation.

$$\Delta(K\text{-}NN(C_p)) = \mathrm{AvgInter}(C_p) - \mathrm{Std}(\mathrm{AvgInter}(C_n .. C_{k-1}))$$

$$\mathrm{AvgInter}(C_p) = \frac{\sum_{i=1}^{K} \sum_{j=1,\, i \neq j}^{K} \mathrm{dist}(C_i, C_j)}{K \cdot (K-1)/2}$$
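The threshold computation above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names `avg_inter` and `delta_knn` are hypothetical, each unordered pair of neighbours is counted once (which matches the $K(K-1)/2$ denominator), the distance function is supplied by the caller, and the population standard deviation is assumed because the text does not say which variant is used.

```python
from itertools import combinations
from statistics import pstdev


def avg_inter(neighbours, dist):
    """Average pairwise distance between the K nearest neighbour clusters.

    Assumption: each unordered pair (C_i, C_j) is counted once, so the
    number of terms equals K * (K - 1) / 2.
    """
    pairs = list(combinations(neighbours, 2))
    if not pairs:
        raise ValueError("at least two neighbour clusters are required")
    return sum(dist(a, b) for a, b in pairs) / len(pairs)


def delta_knn(neighbours, dist, previous_avg_inter):
    """Threshold t2: AvgInter reduced by the spread of its earlier values.

    previous_avg_inter holds the AvgInter values recorded after the
    previous mergings. Assumption: population standard deviation, and a
    spread of 0 when fewer than two earlier values exist.
    """
    spread = pstdev(previous_avg_inter) if len(previous_avg_inter) > 1 else 0.0
    return avg_inter(neighbours, dist) - spread


if __name__ == "__main__":
    # Toy data: clusters reduced to 1-D centroids, distance = absolute gap.
    centroids = [0.0, 2.0, 6.0]
    gap = lambda a, b: abs(a - b)
    print(avg_inter(centroids, gap))
    print(delta_knn(centroids, gap, [3.0, 5.0]))
```

Passing the distance as a function keeps the sketch independent of how clusters are represented (centroids, linkage criteria, and so on), which the excerpt does not specify.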
We decided to set the same margin for the intra and inter layers in order to
have balanced scores in both layers. Subsequently, we define the other inter-layer
threshold $t_3 = t_2 + t_1$.
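As a rough illustration of how the two inter-layer thresholds could be used, the sketch below tests whether a cluster at distance `d` from $C_p$ falls between $t_2$ and $t_3 = t_2 + t_1$. The function names are hypothetical, and treating both bounds as inclusive is an assumption, since the text does not state how boundary cases are handled.

```python
def inter_layer_bounds(t1, t2):
    """Return (t2, t3), where t3 = t2 + t1 reuses the intra-layer margin t1."""
    return t2, t2 + t1


def in_inter_layer(d, t1, t2):
    """True if a cluster at distance d from C_p lies in the inter layer.

    Assumption: the layer is the closed interval [t2, t3].
    """
    lower, upper = inter_layer_bounds(t1, t2)
    return lower <= d <= upper


if __name__ == "__main__":
    # With t1 = 1.5 and t2 = 3.0, the inter layer spans [3.0, 4.5].
    print(in_inter_layer(4.0, t1=1.5, t2=3.0))
    print(in_inter_layer(5.0, t1=1.5, t2=3.0))
```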
Risk layer. Clusters within this layer increase CR because we consider that
they could lead to a fast drop in the global clustering quality, whether at the
inter-cluster or intra-cluster level. Indeed, these clusters, if merged with $C_p$ in
the next iterations, would contribute to a significant degradation in the intra-cluster
compactness since they are not close enough to $C_p$. Further, clusters within this
layer, if not merged with $C_p$ in the next iterations, would not contribute to any