In particular, some of the outlier detection methods focus on sensor
data [59, 71, 15]. Zhang et al. [71] offer an overview of such outlier detec-
tion techniques for sensor network applications. Deligiannakis et al. [15]
consider correlation, extended Jaccard coefficients, and regression-based
approximation for model-based data cleaning. Shen et al. [59] propose
to use a histogram-based method to capture outliers. Subramaniam et
al. [62] introduce distance- and density-based metrics that can identify
outliers. In addition, the ORDEN system [23] detects polygonal outliers
using the triangulated wireframe surface model.
3.3 Declarative Data Cleaning Approaches
From the perspective of using a data cleaning system, supporting a
declarative interface is important since it allows users to easily control
the system. This idea is reflected in a wide range of prior work that pro-
poses SQL-like interfaces for data cleaning [32, 46, 54]. These proposals
hide complicated mechanisms of data processing or model utilization
from the users, and facilitate data cleaning in sensor network applica-
tions.
More specifically, Jeffery et al. [31, 32] divide the data cleaning
process into five tasks: Point, Smooth, Merge, Arbitrate, and Virtualize.
These tasks are then supported within a database system. For example,
the SQL statement in Query 2.2 performs anomaly detection within
a spatial granule: it computes the average of the values reported by
different sensors in the same proximity group, and then rejects any
individual sensor value that lies more than one standard deviation from
that mean.
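The one-standard-deviation filter described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the query from Jeffery et al.: the function name reject_outliers, the (group_id, sensor_id, value) tuple layout, and the use of the population standard deviation are all assumptions made for this example.

```python
from collections import defaultdict
from statistics import mean, pstdev


def reject_outliers(readings):
    """Keep readings within one standard deviation of their group's mean.

    `readings` is an iterable of (group_id, sensor_id, value) tuples, where
    group_id stands in for the proximity group (a hypothetical layout).
    """
    by_group = defaultdict(list)
    for group_id, sensor_id, value in readings:
        by_group[group_id].append((sensor_id, value))

    kept = []
    for group_id, rows in by_group.items():
        values = [value for _, value in rows]
        mu = mean(values)
        # Assumption: population standard deviation; the source does not
        # say whether the sample or population form is intended.
        sigma = pstdev(values)
        for sensor_id, value in rows:
            if abs(value - mu) <= sigma:
                kept.append((group_id, sensor_id, value))
    return kept


readings = [
    ("g1", "s1", 20.0),
    ("g1", "s2", 21.0),
    ("g1", "s3", 22.0),
    ("g1", "s4", 40.0),  # far from the other three readings
]
cleaned = reject_outliers(readings)
```

In this small example the reading from s4 is dropped, because it is the only value further than one standard deviation from the group mean.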
As another approach, Rao et al. [54] focus on a systemic solution,
based on rewriting queries using a set of cleansing rules. Specifically,
the system offers the rule grammar shown in Figure 2.8 to define and
execute various data cleaning tasks. Unlike the prior relational database
approaches, Mayfield et al. [46] model data as a graph consisting of
nodes and links. They then provide an SQL-based, declarative frame-
DEFINE [rule name]
ON [table name]
FROM [table name]
CLUSTER BY [cluster key]
SEQUENCE BY [sequence key]
AS [pattern]
WHERE [condition]
ACTION [DELETE | MODIFY | KEEP]
Figure 2.8. An example of anomaly detection using a SQL statement.
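To make the clauses of the rule grammar more concrete, the following Python sketch shows one hypothetical reading of them. It is not the system of Rao et al.: the Rule class, the apply_rule function, and the interpretation of the pattern and condition as a single predicate over the previous and current row are assumptions made for this example, and the MODIFY action is left out for brevity.

```python
from dataclasses import dataclass
from itertools import groupby
from typing import Callable, Optional


@dataclass
class Rule:
    """Hypothetical in-memory counterpart of a DEFINE ... ACTION rule."""
    name: str
    cluster_key: str      # CLUSTER BY
    sequence_key: str     # SEQUENCE BY
    # Assumption: AS [pattern] and WHERE [condition] are folded into one
    # predicate over (previous row in the cluster, current row).
    condition: Callable[[Optional[dict], dict], bool]
    action: str           # "DELETE" or "KEEP"


def apply_rule(rule, rows):
    """Apply the rule to a list of dict rows and return the surviving rows."""
    ordered = sorted(rows, key=lambda r: (r[rule.cluster_key], r[rule.sequence_key]))
    result = []
    for _, cluster in groupby(ordered, key=lambda r: r[rule.cluster_key]):
        prev = None
        for row in cluster:
            matched = rule.condition(prev, row)
            # DELETE drops matching rows; KEEP retains only matching rows.
            if (rule.action == "DELETE" and not matched) or (
                rule.action == "KEEP" and matched
            ):
                result.append(row)
            prev = row  # the previous input row, even if it was dropped
    return result


# Example rule: delete a reading that repeats the previous reader
# for the same tag within 5 time units.
dedupe = Rule(
    name="drop_duplicates",
    cluster_key="tag",
    sequence_key="time",
    condition=lambda prev, row: prev is not None
    and row["reader"] == prev["reader"]
    and row["time"] - prev["time"] < 5,
    action="DELETE",
)

rows = [
    {"tag": "a", "time": 0, "reader": "r1"},
    {"tag": "a", "time": 2, "reader": "r1"},
    {"tag": "a", "time": 10, "reader": "r1"},
    {"tag": "b", "time": 1, "reader": "r1"},
]
cleaned_rows = apply_rule(dedupe, rows)
```

Here the second reading of tag a is removed as a near-duplicate, while the later reading of tag a and the single reading of tag b are kept.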
 