compared to a pre-selected threshold. The accuracy of these outlier
detection techniques is relatively low because they ignore the
temporal correlation of sensor readings.
Bettencourt et al. [8] present a technique for outlier detection in
WSNs for ecosystem monitoring applications. The method exploits the
spatio-temporal distribution of the data to find outliers. The basic
idea is to compare the measurement of one sensor with those in its
spatial vicinity and also with its own earlier measurements. If the
deviation of these values is greater than a user-defined threshold
(based on a statistical significance test), the sensor reports an
outlier. The obvious drawback of this method is the difficulty of
choosing an appropriate threshold.
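The spatio-temporal comparison can be illustrated with a minimal sketch. It is not the procedure of [8]: the function names, the use of a z-score as the deviation measure, the default threshold of 3.0, and the rule of flagging a reading when either comparison fails are all assumptions made for illustration.

```python
from statistics import mean, stdev

def z_score(value, sample):
    """Deviation of value from the sample mean, in sample standard deviations."""
    spread = stdev(sample)  # needs at least two values in sample
    if spread == 0:
        return 0.0 if value == mean(sample) else float("inf")
    return abs(value - mean(sample)) / spread

def is_outlier(reading, neighbor_readings, history, threshold=3.0):
    """Flag a reading that deviates from its neighbors or from its own past.

    Assumption: a fixed z-score threshold stands in for the statistical
    significance test mentioned in the text.
    """
    spatial = z_score(reading, neighbor_readings)   # compare with spatial vicinity
    temporal = z_score(reading, history)            # compare with earlier readings
    return spatial > threshold or temporal > threshold

neighbors = [20.1, 19.8, 20.3, 20.0]   # current readings of nearby sensors
history = [20.0, 20.2, 19.9, 20.1]     # this sensor's earlier readings
print(is_outlier(20.1, neighbors, history))
print(is_outlier(35.0, neighbors, history))
```

The sketch also makes the drawback concrete: the result depends entirely on the value passed as `threshold`.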
In a different set of approaches, researchers have proposed
non-parametric methods for anomaly detection. Two such approaches are
histogram computation and kernel density estimation (KDE). Sheng et al.
[52] present a histogram-based technique to identify global outliers in
WSNs. Instead of transmitting raw data back to the base station for
processing, this technique first builds data histograms at the local
nodes and then ships these statistics to the base station (sink). The
sink uses this histogram information to extract the data distribution
of the network and filters out the non-outliers. Outliers are identified
either by a fixed threshold distance or by their rank among all
outliers. One of the major drawbacks of this technique is that it can
process only one-dimensional data. Subramaniam et al. [53] and Palpanas
et al. [45] present techniques for outlier detection using kernel
density estimation. Instead of comparing all the raw observations, the
technique fits a kernel density at each observation point, which
considerably smooths the values. User-defined thresholds are then
applied to identify outliers. Experimental results show that these
techniques estimate the data distribution accurately and achieve a high
detection rate while keeping memory usage and message transmission low.
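The kernel density idea can be sketched as follows. This is an illustrative, centralized version and not the distributed algorithms of [53] or [45]: the Gaussian kernel, the fixed bandwidth, the `min_density` threshold, and all function names are assumptions.

```python
import math

def gaussian_kde(samples, bandwidth):
    """Return a density function built from one Gaussian kernel per sample."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density

def kde_outliers(samples, bandwidth, min_density):
    """Report samples whose estimated density falls below min_density.

    Assumption: a single user-defined density threshold decides what
    counts as an outlier.
    """
    density = gaussian_kde(samples, bandwidth)
    return [x for x in samples if density(x) < min_density]

readings = [20.0, 20.1, 19.9, 20.2, 19.8, 30.0]
print(kde_outliers(readings, bandwidth=0.5, min_density=0.2))
```

Because each kernel spreads its sample over a neighborhood, closely spaced readings reinforce one another, while an isolated reading contributes only its own kernel and ends up with a low density.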
4.2 Nearest neighbor based approaches
Nearest neighbor approaches use the distance to other points to
identify outliers. One widely used definition, based on the original
idea of Knorr et al. [35], is that outliers are those points which are
very far from their nearest neighbors. Many variants of this definition
have been proposed, differing in the definition of distance and in the
threshold for deciding how “far” a point must be. One practical
definition uses Euclidean distance together with either a user-defined
threshold or the number of desired outliers.
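A small sketch of this practical definition is given below. Scoring each point by the Euclidean distance to its k-th nearest neighbor is one common variant and is assumed here for illustration; the function names and the parameters `k`, `threshold`, and `n` are likewise hypothetical.

```python
import math

def knn_distance(point, others, k):
    """Euclidean distance from point to its k-th nearest neighbor in others."""
    dists = sorted(math.dist(point, q) for q in others)
    return dists[k - 1]

def scores(points, k):
    """Pair every point with its k-th nearest neighbor distance."""
    return [
        (knn_distance(p, points[:i] + points[i + 1:], k), p)
        for i, p in enumerate(points)
    ]

def outliers_by_threshold(points, k, threshold):
    """Variant 1: a user-defined distance threshold."""
    return [p for score, p in scores(points, k) if score > threshold]

def top_n_outliers(points, k, n):
    """Variant 2: the n points with the largest scores."""
    ranked = sorted(scores(points, k), key=lambda t: t[0], reverse=True)
    return [p for _, p in ranked[:n]]

points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
print(outliers_by_threshold(points, k=2, threshold=3.0))
print(top_n_outliers(points, k=2, n=1))
```

The threshold variant may return any number of points, including none, whereas the top-n variant always returns exactly n points even when the data contain no real anomalies.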
Such a definition has been used by Branch et al. [10] to find global
outliers in WSNs. The basic idea is to use a set of local rules by which a