The Smoothing Parameter
The smoothing parameter $h$ is very important when computing the entropy.
In other works [117,84] that use Renyi's quadratic entropy to perform
clustering, it is assumed that the smoothing parameter is selected
experimentally and that it must be fine-tuned to achieve acceptable results.
Formula (6.8), $h_{fat} = 25\sqrt{c/n}$, was proposed in [203] and was shown
to produce good results in neural network classification using error entropy
minimization, as mentioned in Sect. 6.1.1.1. For the LEGClust algorithm we
need a formula that reflects the standard deviation of the data. Following
the approach described in Sect. 6.1.1.1, a new formula, inspired by (6.7),
was proposed in [198]:
$$h_{op} = 2\,s\left(\frac{4}{(d+2)\,n}\right)^{\frac{1}{d+4}}, \qquad (6.55)$$
where $s$ is the mean value of the sample standard deviations for all $d$
dimensions. All experiments with the entropic clustering algorithm reported
in [204] were performed using formula (6.55).
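
As a minimal sketch, formula (6.55) can be evaluated directly from a data set. The function name `smoothing_parameter_op` and the list-of-tuples data layout are assumptions made for this illustration, and the sample standard deviation is taken to be the usual $n-1$ estimate:

```python
import statistics


def smoothing_parameter_op(data):
    """Evaluate formula (6.55) for a list of n points with d coordinates each."""
    n = len(data)
    d = len(data[0])
    # Sample standard deviation (n - 1 denominator) of each dimension.
    per_dimension = [statistics.stdev(column) for column in zip(*data)]
    # s is the mean of those d standard deviations.
    s = statistics.mean(per_dimension)
    return 2.0 * s * (4.0 / ((d + 2) * n)) ** (1.0 / (d + 4))


# Illustrative data only: four points on the corners of a square.
points = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
print(smoothing_parameter_op(points))
```

Because $n$ appears in the denominator, the value shrinks slowly as more points with the same spread are added.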
Although the value of the smoothing parameter is important, its exact value
is not crucial for obtaining good results. As we increase the $h$ value, the
kernel becomes smoother and the entropic proximity matrix becomes similar to
the Euclidean distance proximity matrix. Extremely small values of $h$ will
produce undesirable behaviors because the entropy will have high variability.
Using $h$ values in a small interval, near the $h_{fat}$ value, does not
affect the final clustering results.
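
The sensitivity to $h$ can be explored with a small experiment. The sketch below uses the standard Parzen-window estimate of Renyi's quadratic entropy with a Gaussian kernel for one-dimensional data. This section does not restate the estimator, so its exact form here is an assumption, as are the function name and the sample values; it is not the entropic dissimilarity computation of LEGClust itself.

```python
import math


def renyi_quadratic_entropy(xs, h):
    """Parzen-window estimate of Renyi's quadratic entropy for 1-D data."""
    n = len(xs)
    # Two Gaussian kernels of width h convolve into a Gaussian of variance 2*h^2.
    var = 2.0 * h * h
    norm = 1.0 / math.sqrt(2.0 * math.pi * var)
    # Average the kernel over all ordered pairs, including i == j.
    potential = sum(
        norm * math.exp(-((xi - xj) ** 2) / (2.0 * var))
        for xi in xs
        for xj in xs
    ) / (n * n)
    return -math.log(potential)


# Illustrative sample: two tight groups of three points.
sample = [0.0, 0.1, 0.2, 1.0, 1.1, 1.2]
for h in (0.01, 0.1, 1.0):
    print(h, renyi_quadratic_entropy(sample, h))
```

With a very small $h$ the sum is dominated by each point paired with itself, so the estimate says little about the structure of the data. With a large $h$ all pairs contribute almost equally and the estimate varies smoothly.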
Minimum Number of Connections
The minimum number of connections, $k$, to join clusters in consecutive steps
of the algorithm is the third parameter that must be chosen. To limit the
influence of outliers and noise, especially if they are located between
clusters, one should not use $k = 1$. If the elementary clusters have a small
number of points, high values of $k$ are also not recommended, because it may
then become impossible to join clusters for lack of a sufficient number of
connections. Experimental evidence provided in [204] shows that good results
are obtained when using either $k = 2$ or $k = 3$.
An alternative is simply to join at each step the two clusters with the
highest number of connections between them.
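
Both joining rules can be sketched as follows. The representation is an assumption made for illustration: connections are given as pairs of point indices, and `label` maps each point to its current cluster. How the connections themselves are obtained is part of the algorithm's description and is not reproduced here, and all function names are hypothetical.

```python
from collections import Counter


def count_connections(edges, label):
    """Count the connections that link points of two different clusters."""
    counts = Counter()
    for a, b in edges:
        ca, cb = label[a], label[b]
        if ca != cb:
            counts[frozenset((ca, cb))] += 1
    return counts


def pairs_to_join(counts, k):
    """Cluster pairs linked by at least k connections."""
    return sorted(tuple(sorted(pair)) for pair, c in counts.items() if c >= k)


def best_pair(counts):
    """Alternative rule: the single pair with the most connections."""
    pair, _ = max(counts.items(), key=lambda item: item[1])
    return tuple(sorted(pair))


# Illustrative data: point 4 is an outlier tied to cluster B by one connection.
label = {0: "A", 1: "A", 2: "B", 3: "B", 4: "C"}
edges = [(0, 2), (1, 2), (1, 3), (3, 4)]
counts = count_connections(edges, label)
print(pairs_to_join(counts, 1))  # [('A', 'B'), ('B', 'C')]
print(pairs_to_join(counts, 2))  # [('A', 'B')]
print(best_pair(counts))         # ('A', 'B')
```

In this toy case $k = 1$ lets the single connection to the outlier trigger a merge, while $k = 2$ does not.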