Database Reference
In-Depth Information
algorithms for WSN. These are data and node clustering, data classifi-
cation and outlier detection in sensor networks.
The rest of this chapter is organized as follows. Section 2 discusses
several WSN clustering strategies including both node and data cluster-
ing. In the next section (Section 3) we discuss classification techniques
followed by several outlier detection techniques in Section 4. We con-
clude the chapter in Section 5.
2. Clustering in Wireless Sensor Networks
Cluster analysis is the unsupervised learning task of dividing a set
of objects into groups (
clusters
), such that a given quality criterion is
maximized. More formally, given a set of objects
X
=
{
x
1
,...,x
n
}
,we
search for a
clustering
C
=
{
C
1
,...,C
k
}
with
C
i
⊆
X
for
i
=1
,...,k
0
. Partitional clustering
that maximizes a quality function
q
:2
X
→
R
algorithms yield disjunct clusters,
i.e.
C
i
∩
=
j
, while
the clusters produced by non-partitional algorithms may overlap. More-
over, hierarchical clustering algorithms can subdivide clusters further,
resulting in increasingly more detailed groupings of the given objects.
Instead of creating new quality functions and algorithms for each par-
ticular application, clustering algorithms usually try to achieve more
general parameterized objectives. For example, an often stated objec-
tive is that objects in the same cluster should be more similar to each
other than objects from different clusters. With
d
:
X
C
j
=
∅
for
i
+
0
being
a given dissimilarity function between objects, this criterion can be for-
malized as minimizing the
intra-cluster variance
, also called the
sum of
squares within
(SSW), over all possible clusterings for a fixed number of
clusters
k
[32]:
×
X
→
R
k
C
∗
)=
1
2
d
(
x
1
,x
2
)w th
C
i
∈C
∗
min
C
∗
SSW
(
i
=1
x
1
∈C
i
x
2
∈C
i
For algorithms that try to minimize this criterion, like the well-known
k-Means algorithm [39], an application specific clustering can then be
obtained by choosing an appropriate dissimilarity measure. Density-
based algorithms, like DBSCAN [24], form clusters of points that have a
high spatial density. Subspace clustering algorithms specialize on finding
clusters in lower dimensional subspaces of the whole data space. For
a good overview of different types of clustering algorithms and their
objectives, see Han [30] and Kriegel
et al.
[38].
The aforementioned algorithms assume the whole data set to be in
main memory or at least they ignore the time needed for accessing the