the process by giving examples of the different classes, hoping that the classifier
will learn to generalize from them; in clustering, such supervision is not required.
Example 3 (Clustering) Consider the set of all web-pages returned by a keyword
search “bush” on the Web. The resulting set of documents will contain documents
about the former president Bush Sr., about the former president George W. Bush,
about a grunge band named Bush, about the brand of beer with the same name, and
perhaps also documents about vegetation. A clustering algorithm would, without
interaction from the user or a pre-defined taxonomy, group similar documents
together. Partitional clustering methods would do so by dividing the data into
disjoint groups, whereas hierarchical clustering algorithms produce a complete
taxonomy.
Closely related to clustering is outlier detection. In outlier detection, one tries to
identify those objects that are unlike most other objects. Such outliers could
indicate, for instance, errors in the data (e.g., outside temperatures of over 60
degrees Celsius), or potentially interesting exceptional cases. Conceptually,
outliers could be considered points not belonging to a large cluster, or forming a
cluster by themselves.
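One simple way to flag such errors is a z-score cutoff: points lying many standard deviations from the sample mean are reported as outliers. The sketch below is illustrative, not from the text; the temperature data, function name, and cutoff are assumptions (a modest cutoff is used because a single extreme value in a small sample inflates the standard deviation, capping attainable z-scores).

```python
# A minimal sketch of outlier detection via z-scores (illustrative data).
from statistics import mean, stdev

def zscore_outliers(values, threshold=2.0):
    """Return values lying more than `threshold` standard deviations from the mean."""
    mu = mean(values)
    sigma = stdev(values)
    return [v for v in values if abs(v - mu) / sigma > threshold]

# 64.0 plays the role of the impossible outside temperature from the text.
temperatures = [18.2, 19.5, 20.1, 21.0, 19.8, 20.4, 64.0]
print(zscore_outliers(temperatures))  # flags the erroneous reading
```

Note that a single extreme value also distorts the mean and standard deviation it is judged against; robust variants replace them with the median and median absolute deviation.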
An important factor in clustering is the order in which data points are compared
with each other. Some important methods for determining this are hierarchical
clustering, k-means clustering, and neural network clustering. 14 Hierarchical
clustering starts by combining cases and clusters that are similar to each other, one
pair at a time. In each step, the closest pair of cases/clusters is merged. This is
repeated until the distance between the closest remaining clusters exceeds a
predetermined threshold.
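The merge-closest-pair loop just described can be sketched for one-dimensional data. The single-linkage distance, the sample points, and the threshold below are illustrative assumptions, not taken from the text:

```python
# A minimal sketch of agglomerative (hierarchical) clustering on 1-D points:
# repeatedly merge the closest pair of clusters until the smallest
# inter-cluster distance exceeds a threshold.
def single_linkage(a, b):
    """Distance between two clusters: closest pair of member points."""
    return min(abs(x - y) for x in a for y in b)

def hierarchical(points, threshold):
    clusters = [[p] for p in points]       # every point starts as its own cluster
    while len(clusters) > 1:
        # find the closest pair of clusters
        (i, j), d = min(
            (((i, j), single_linkage(clusters[i], clusters[j]))
             for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda t: t[1])
        if d > threshold:                  # stop: remaining clusters are too far apart
            break
        clusters[i].extend(clusters.pop(j))  # merge the pair
    return clusters

print(hierarchical([1.0, 1.2, 5.0, 5.1, 9.0], threshold=1.0))
# → [[1.0, 1.2], [5.0, 5.1], [9.0]]
```

Recording the sequence of merges, rather than only the final partition, yields the complete taxonomy (dendrogram) mentioned in Example 3.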
In k-means clustering, it is assumed that the data falls into a known number (k) of
clusters. First, a random profile is defined for each cluster. These profiles are
called cluster centres. Next, each data point is assigned to the cluster centre to
which it is most similar; the centres are then recomputed from their assigned
points, and the assignment is repeated until it stabilizes. Neural network
clustering starts from so-called nodes that
work similarly to the neurons in the human brain. Each node computes the
weighted sum of its inputs (e.g., the distance of other nodes) and after a certain
threshold is subtracted, the result is passed to a non-linear function, e.g., a sigmoid
function. 15 The result of this function determines the importance of the node as a
clustering centre. Neural networks are constructed by connecting the output of a
node to the input of one or more other nodes. 16 It is important to select appropriate
weights and thresholds. The network can also 'learn', i.e., the weights and
thresholds may be adjusted after the network's output on several examples is
compared with the desired output. In this way, strong connections are kept and
weak connections are discarded.
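The assign-then-recompute cycle of k-means can be sketched on one-dimensional data. The sample points, the random initialization, and the fixed iteration budget below are illustrative assumptions; practical implementations stop when the assignments no longer change:

```python
# A minimal sketch of k-means clustering on 1-D data.
import random

def kmeans(points, k, iterations=10, seed=0):
    rng = random.Random(seed)
    centres = rng.sample(points, k)          # random initial cluster profiles
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # assignment step: each point goes to its most similar (nearest) centre
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centres[i]))
            clusters[nearest].append(p)
        # update step: recompute each centre as the mean of its cluster
        centres = [sum(c) / len(c) if c else centres[i]
                   for i, c in enumerate(clusters)]
    return centres, clusters

centres, clusters = kmeans([1.0, 1.2, 5.0, 5.1, 9.0], k=2)
```

Because the initial centres are random, different seeds can yield different partitions; k-means is therefore usually restarted several times and the best result kept.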
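The node computation described above (a weighted sum of inputs, minus a threshold, passed through a sigmoid) can be sketched directly; the particular weights, inputs, and threshold are illustrative assumptions:

```python
# A minimal sketch of a single node's computation: weighted input sum,
# threshold subtraction, then a sigmoid non-linearity.
from math import exp

def sigmoid(x):
    return 1.0 / (1.0 + exp(-x))

def node_output(inputs, weights, threshold):
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return sigmoid(weighted_sum - threshold)

# Illustrative inputs, weights, and threshold.
print(node_output([0.5, 0.8], weights=[1.2, -0.4], threshold=0.1))
```

The output always lies between 0 and 1, which is what allows it to be read as the node's importance as a clustering centre; connecting such outputs to the inputs of other nodes builds up the network.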
14 SPSS Inc. (1999).
15 Hence, each node computes a function y = f(∑_i w_i x_i − θ), where f(x) = 1/(1 + e^(−x)) is the sigmoid function and w_i are the weights.
16 Holsheimer, M., and Siebes, A. (1991).