Fig. 16.3. The three-layer context space of the new risky cluster candidate $C_p$
Fig. 16.4. The three-layer context space of the new non-risky cluster candidate $C_q$
Inter layer. Clusters within this layer also reduce $CR$, as they should not lead to a quick drop in $VI$. For this, they have to be distant enough from $C_p$ and therefore unlikely to be merged with $C_p$ in the next iterations. Further, keeping them outside would help improve (or at least not deteriorate) the global inter-cluster separation. This layer is delimited by a first threshold $t_2 = \Delta(K\text{-}NN(C_p))$. We define $\Delta(K\text{-}NN(C_p))$ as the average pairwise inter-cluster distance between the $K$-NN of $C_p$, reduced by the standard deviation of its homologous values obtained following the previous merges. The average separation between the clusters surrounding $C_p$ gives a hint of the minimum inter-distance required to improve the local inter-cluster separation around $C_p$, which will most likely improve the global inter-cluster separation.
\[
\Delta(K\text{-}NN(C_p)) = \mathrm{AvgInter}(C_p) - \mathrm{Std}\big(\mathrm{AvgInter}(C_n .. C_{k-1})\big)
\]
\[
\mathrm{AvgInter}(C_p) = \frac{\sum_{i=1}^{K} \sum_{j=1,\, i \neq j}^{K} \mathrm{dist}(C_i, C_j)}{K\,(K-1)/2}
\]
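The threshold defined above can be sketched in a few lines of Python. This is only an illustrative reading of the two formulas, not the authors' implementation. It assumes that each cluster is represented by a centroid tuple, that `dist` is the Euclidean distance, that every unordered pair of neighbours is counted once (which matches the $K(K-1)/2$ denominator), and that Std is the population standard deviation. The names `avg_inter` and `inter_threshold` are hypothetical.

```python
import math
from itertools import combinations
from statistics import pstdev


def avg_inter(neighbors, dist=math.dist):
    """Average pairwise distance among the K nearest neighbours of C_p.

    Assumption: `neighbors` holds one centroid (a tuple of floats) per
    cluster, and each unordered pair (C_i, C_j) is counted once.
    """
    k = len(neighbors)
    if k < 2:
        raise ValueError("need at least two neighbouring clusters")
    total = sum(dist(a, b) for a, b in combinations(neighbors, 2))
    return total / (k * (k - 1) / 2)


def inter_threshold(neighbors, previous_avg_inter, dist=math.dist):
    """t2 = AvgInter(C_p) - Std(AvgInter values from previous merges).

    Assumption: Std is the population standard deviation, taken as 0
    when no earlier values are available.
    """
    spread = pstdev(previous_avg_inter) if previous_avg_inter else 0.0
    return avg_inter(neighbors, dist) - spread


if __name__ == "__main__":
    nbrs = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
    print(avg_inter(nbrs))
    print(inter_threshold(nbrs, [3.0, 5.0]))
```

Subtracting the spread of the earlier averages makes the threshold slightly more permissive than the raw average, so a cluster does not need to exceed the full local separation to fall in the inter layer.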
We decided to set the same margin for the intra and inter layers in order to have balanced scores in both layers. Subsequently, we define the other inter-layer threshold $t_3 = t_2 + t_1$.
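As a small sketch of how the two inter-layer thresholds fit together, the function below derives $t_3$ from $t_1$ and $t_2$ and tests whether a cluster falls in the inter layer. It assumes that $t_1$ is the intra-layer threshold defined earlier, that the quantity compared with the thresholds is the cluster's distance to $C_p$, and that both bounds are inclusive. The function names are hypothetical.

```python
def inter_layer_bounds(t1, t2):
    """Return (t2, t3) with t3 = t2 + t1, so the inter layer has width t1."""
    return t2, t2 + t1


def in_inter_layer(distance_to_cp, t1, t2):
    """True if a cluster at this distance from C_p lies in the inter layer.

    Assumption: both bounds are inclusive.
    """
    lower, upper = inter_layer_bounds(t1, t2)
    return lower <= distance_to_cp <= upper


if __name__ == "__main__":
    print(inter_layer_bounds(1.0, 3.0))
    print(in_inter_layer(3.5, 1.0, 3.0))
```

With this reading, the inter layer has the same width $t_1$ as the intra layer, which is the balance the text asks for.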
Risk layer. Clusters within this layer increase $CR$ because we consider that they could lead to a fast drop in the global clustering quality, at either the inter-cluster or the intra-cluster level. These clusters, if merged with $C_p$ in the next iterations, would contribute to a significant degradation of the intra-cluster compactness, since they are not close enough to $C_p$. Further, clusters within this layer, if not merged with $C_p$ in the next iterations, would not contribute to any