of outliers mentioned above), we need to count the number of sensed
values that fall in different regions of the data space. This operation
can be efficiently supported by the framework outlined in Section 3.1.1,
and the overall task can be distributed across the entire WSN. For
distance-based outliers in particular, the following observation holds [93]. In
a (conceptual) hierarchical organization of the sensor network, a parent
node combines in a single pool all the data that its children process.
Consequently, outliers have to be identified with respect to this new pool
of data. Nevertheless, the parent node need not read in all the data
from its children's input data streams and determine, for each data
value, whether it is an outlier or not. It suffices for the parent
node to examine only the values that have been marked as outliers by
its children. All the other data values can be safely ignored, since they
cannot possibly be outliers. The above approach allows for the effective
distribution of the outlier detection task to the entire WSN, resulting
in significant savings in terms of communication messages.
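The observation above can be illustrated with a minimal sketch. It assumes one-dimensional readings and one common definition of a distance-based outlier: a value is an outlier if fewer than k values of the pool (itself included) lie within distance r. The names local_outliers and parent_outliers, and the parameters r and k, are hypothetical and do not come from [93]. The sketch also counts neighbors exactly over the raw values purely for illustration, whereas the text relies on the framework of Section 3.1.1 to supply such counts.

```python
def neighbor_count(value, pool, r):
    """Number of values in pool within distance r of value (itself included)."""
    return sum(abs(value - other) <= r for other in pool)


def local_outliers(pool, r, k):
    """Values of pool that have fewer than k values within distance r."""
    return [v for v in pool if neighbor_count(v, pool, r) < k]


def parent_outliers(child_pools, r, k):
    """Outliers of the combined pool, re-checking only child-flagged values.

    Adding data can only increase a neighbor count, so a value that is not
    an outlier in its child's pool cannot become one in the combined pool.
    """
    pooled = [v for pool in child_pools for v in pool]
    candidates = [v for pool in child_pools for v in local_outliers(pool, r, k)]
    return [v for v in candidates if neighbor_count(v, pooled, r) < k]


children = [[1.0, 1.1, 1.2, 9.0], [1.05, 1.15, 9.1, 20.0]]
result = parent_outliers(children, r=0.5, k=2)
print(result)
```

In this example, 9.0 and 9.1 are each flagged by their own child but support one another once pooled, so the parent only retains 20.0. The result matches a brute-force check over the whole pool, although the parent re-examined three candidates instead of eight values.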
A recent study [64] proposes the use of the hyperellipsoidal model in
order to model the normal behavior of sensor nodes. Sensor readings
that deviate significantly from this model are then declared outliers. The
focus of this study is on devising an iterative approach for building and
maintaining hyperellipsoidal models, which makes them suitable for non-
stationary data distributions.
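As a rough illustration of the idea, the sketch below fits a hyperellipsoid (here an ellipse, since it assumes two-dimensional readings) from the sample mean and covariance, and flags a reading whose squared Mahalanobis distance from the center exceeds a threshold. This is a plain batch fit, not the iterative building and maintenance scheme studied in [64]. The function names and the threshold t_sq are assumptions made for the example.

```python
def fit_ellipsoid(points):
    """Return (center, inverse covariance) for a list of 2-D readings."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    det = sxx * syy - sxy * sxy
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    # Closed-form inverse of the 2x2 covariance matrix.
    inv = ((syy / det, -sxy / det), (-sxy / det, sxx / det))
    return (mx, my), inv


def mahalanobis_sq(point, center, inv):
    """Squared Mahalanobis distance of point from the ellipsoid center."""
    dx = point[0] - center[0]
    dy = point[1] - center[1]
    return (dx * (inv[0][0] * dx + inv[0][1] * dy)
            + dy * (inv[1][0] * dx + inv[1][1] * dy))


def is_outlier(point, center, inv, t_sq):
    """A reading is an outlier if it falls outside the ellipsoid of size t_sq."""
    return mahalanobis_sq(point, center, inv) > t_sq


normal = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0), (1.0, 1.0)]
center, inv = fit_ellipsoid(normal)
print(is_outlier((4.0, 1.0), center, inv, t_sq=6.0))
print(is_outlier((2.0, 2.0), center, inv, t_sq=6.0))
```

Handling non-stationary data would require refreshing the center and covariance as new readings arrive, which this sketch leaves out.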
Node Similarity-based
Zhuang et al. [115] describe an approach for identifying (and cleaning)
outliers in a sensor network. They focus on two kinds of outliers: short
simple outliers, which usually appear as an abnormal, sudden burst or
depression; and long segmental outliers, which represent erroneous sensor
readings that last for a certain time period. Their approach works as
follows. The Discrete Wavelet Transform (DWT) is applied to the series
of sensor readings. The high-frequency coefficients are omitted from
the resulting DWT representation, which is subsequently compared to
the original data series. Data points that are further away than a distance
threshold, d1, from their DWT representation are deemed short
outliers. Then, the data series is compared to the series obtained from
other sensors that are geographically close. If no other series is within
some distance threshold, d2, then this data series is deemed a long outlier
(similarity between data series is measured using the dynamic time
warping distance [11]).
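The two steps described above can be sketched as follows. The sketch assumes a Haar wavelet, a fixed number of decomposition levels, and an absolute-difference cost inside the dynamic time warping recursion; none of these choices is stated in the text, and the function names are hypothetical. It also assumes the series length is a multiple of 2**levels.

```python
def haar_smooth(series, levels=2):
    """Haar DWT with every detail (high-frequency) coefficient dropped."""
    block = 2 ** levels
    if len(series) % block != 0:
        raise ValueError("series length must be a multiple of 2**levels")
    approx = list(series)
    for _ in range(levels):
        # One Haar level: keep pairwise averages, discard pairwise differences.
        approx = [(approx[i] + approx[i + 1]) / 2
                  for i in range(0, len(approx), 2)]
    # Inverting with zero details spreads each coefficient over its block.
    return [a for a in approx for _ in range(block)]


def short_outliers(series, d1, levels=2):
    """Indices of points farther than d1 from the smoothed representation."""
    smooth = haar_smooth(series, levels)
    return [i for i, (x, s) in enumerate(zip(series, smooth)) if abs(x - s) > d1]


def dtw_distance(a, b):
    """Dynamic time warping distance between two non-empty series."""
    inf = float("inf")
    prev = [0.0] + [inf] * len(b)
    for x in a:
        cur = [inf]
        for j, y in enumerate(b, start=1):
            cur.append(abs(x - y) + min(prev[j], prev[j - 1], cur[j - 1]))
        prev = cur
    return prev[-1]


def is_long_outlier(series, neighbor_series, d2):
    """True if no series from a nearby sensor is within DTW distance d2."""
    return all(dtw_distance(series, other) > d2 for other in neighbor_series)


readings = [10, 10, 50, 10, 10, 10, 10, 10]
print(short_outliers(readings, d1=15))
print(is_long_outlier([0, 0, 0], [[5, 5, 5], [6, 6, 6]], d2=10))
```

With two Haar levels the smoothed series consists of averages over blocks of four readings, so the spike at index 2 deviates by 30 from its block average while its neighbors deviate by only 10. Note that is_long_outlier returns True for a sensor with no nearby series at all, which a real deployment would probably want to treat separately.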
A similar problem is addressed by a subsequent study [102], which
targets the identification of outlying sensors. The main observation is
that sensors observing the same phenomenon are spatially correlated,