Database Reference
Data Preprocessing Stage
Query Processing Stage
At the data preprocessing stage, the collected
raw data were first integrated into a single sensor
stream ordered by arrival timestamp. The stream
was then cleaned by rounding each detected air
temperature to its nearest integer value, which
represents the temperature at that sensor location
during the reported time interval. The cleaned data
were then enriched with domain information;
in this case, each sensor identifier was associated
with the sensor values reported during different
time intervals. The metadata are used to generate
indicator flags that reflect the data's validity,
i.e., whether a reading is missing, corrupted, or
neither. Here we define a reading with no reported
value as missing, and a reading whose reported
value lies outside the specified temperature range
as corrupted. After preprocessing, each sensor
reading is associated with a particular sensor
identifier and with a sequence number that
represents the time interval in which the reading
was reported, and the metadata contain the location
of each sensor, keyed by sensor identifier.
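The preprocessing steps described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the tuple layout, the `Reading` type, the flag names, and the temperature bounds `TEMP_MIN` and `TEMP_MAX` are hypothetical, since the text does not give the actual record format or the specified range.

```python
import math
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple

# Assumed valid range; the text only says "the specified temperature range".
TEMP_MIN, TEMP_MAX = -40.0, 60.0

# Hypothetical raw record: (arrival_time, sensor_id, sequence_number, temperature)
Raw = Tuple[float, str, int, Optional[float]]


@dataclass
class Reading:
    sensor_id: str
    seq: int               # index of the reporting time interval
    value: Optional[int]   # rounded temperature, None if unusable
    flag: str              # "ok", "missing", or "corrupted"


def preprocess(streams: Iterable[Iterable[Raw]]) -> List[Reading]:
    """Merge raw streams by arrival time, round values, and set indicator flags."""
    merged = sorted((rec for s in streams for rec in s), key=lambda rec: rec[0])
    cleaned = []
    for _, sensor_id, seq, temp in merged:
        if temp is None:
            # No reported value: flagged as missing.
            cleaned.append(Reading(sensor_id, seq, None, "missing"))
        elif not (TEMP_MIN <= temp <= TEMP_MAX):
            # Value outside the assumed range: flagged as corrupted.
            cleaned.append(Reading(sensor_id, seq, None, "corrupted"))
        else:
            # Round half up to the nearest integer (Python's round() rounds half to even).
            cleaned.append(Reading(sensor_id, seq, math.floor(temp + 0.5), "ok"))
    return cleaned
```

Discarding the value of a corrupted reading, rather than keeping it alongside the flag, is a design choice of this sketch and not something the text prescribes.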
At the query processing stage, which is on the top
level of the domain-driven framework, queries from
different users can be answered at the same time,
each according to that user's specified criteria.
In this case, the queries ask for the temperatures
at different locations during a specified time
interval. If the queried information is not missing,
it can be retrieved directly from the sensor
network database. Otherwise, the request is sent
through the data estimation component in the data
warehousing and mining level, and the estimated
results are returned to the end users.
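The lookup-then-estimate behaviour described above might look like the following minimal sketch. The dictionary standing in for the sensor network database, the `(sensor_id, sequence_number)` key, and the `estimate` callback are all assumptions made for illustration; the text does not describe the actual interfaces.

```python
from typing import Callable, Dict, Optional, Tuple

Key = Tuple[str, int]  # hypothetical key: (sensor_id, sequence_number)


def answer_query(
    db: Dict[Key, Optional[int]],
    key: Key,
    estimate: Callable[[Key], float],
) -> Tuple[float, str]:
    """Return the stored reading if present, otherwise an estimated one."""
    value = db.get(key)
    if value is not None:
        # Reading is available: serve it directly from the database.
        return float(value), "stored"
    # Reading is missing: fall back to the data estimation component.
    return estimate(key), "estimated"
```

Returning the source of the value along with the value lets a caller distinguish measured readings from estimates; that tag is an addition of this sketch.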
Performance Study
Several different data mining techniques are used
to evaluate the proposed framework: the Average
Window Size (AWS) approach, the linear
interpolation approach, the linear trend approach,
and the CARM approach (Jiang, 2007). Each of
these methods is applied within our proposed
framework to answer users' requests for missing
sensor air temperature values. We compared the
estimation accuracy, running time, and memory
usage of each method when applied within the
framework.
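As a rough illustration of two of the simpler baselines, the sketch below estimates a missing value in one sensor's series. The linear interpolation follows the standard formula between the nearest reported neighbours. The `window_average` function is only an assumed reading of a window-based average (a mean over the preceding readings); the exact definition of AWS used in the study is not given here and may differ.

```python
from typing import List, Optional

Series = List[Optional[float]]  # one sensor's readings, None where missing


def linear_interpolate(series: Series, i: int) -> Optional[float]:
    """Estimate series[i] from the nearest reported value on each side."""
    left = next((j for j in range(i - 1, -1, -1) if series[j] is not None), None)
    right = next((j for j in range(i + 1, len(series)) if series[j] is not None), None)
    if left is None or right is None:
        return None  # cannot interpolate at the edges of the series
    frac = (i - left) / (right - left)
    return series[left] + frac * (series[right] - series[left])


def window_average(series: Series, i: int, window: int) -> Optional[float]:
    """Assumed window baseline: mean of the reported values in the preceding window."""
    past = [v for v in series[max(0, i - window):i] if v is not None]
    return sum(past) / len(past) if past else None
```

For example, with the series `[20, None, 24, 25]`, interpolating index 1 gives the midpoint of 20 and 24.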
Data Warehousing and Data Mining Stage
At the data warehousing and data mining stage,
which is on the third level of the domain-driven
framework, different data warehousing and data
mining tasks can be performed on the preprocessed
data enriched with domain information. In this
case, we perform an association mining task in
each sensor cluster of the Huntington Botanical
Garden sensor network application to discover
interesting patterns and associations among
the sensor readings. We then use the discovered
relationships to estimate a missing sensor reading
from the sensor readings related to it.
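The general idea of estimating a missing reading from associated readings can be illustrated with a deliberately simplified stand-in. The sketch below is not CARM: it only counts pairwise co-occurrences between another sensor's value and the target sensor's value within a cluster, then lets the currently reported values vote. The `Snapshot` layout and both function names are hypothetical.

```python
from collections import Counter, defaultdict
from typing import DefaultDict, Dict, List, Optional, Tuple

Snapshot = Dict[str, int]  # sensor_id -> rounded temperature in one time interval
Rules = DefaultDict[Tuple[str, int], Counter]


def mine_pair_rules(history: List[Snapshot], target: str) -> Rules:
    """Count how often (other sensor, its value) co-occurs with each target value."""
    rules: Rules = defaultdict(Counter)
    for snap in history:
        if target not in snap:
            continue  # intervals where the target is missing teach us nothing
        for sensor_id, value in snap.items():
            if sensor_id != target:
                rules[(sensor_id, value)][snap[target]] += 1
    return rules


def estimate_from_rules(rules: Rules, current: Snapshot) -> Optional[int]:
    """Vote for the target value using the readings reported in the current interval."""
    votes: Counter = Counter()
    for sensor_id, value in current.items():
        votes.update(rules.get((sensor_id, value), Counter()))
    return votes.most_common(1)[0][0] if votes else None
```

A real association mining approach would also apply support and confidence thresholds and could relate more than two sensors at once; this sketch omits both for brevity.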
Performance Study of Estimation Accuracy
The estimation accuracy for the missing values
is evaluated using the average Root Mean Square
Error (RMSE).
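For reference, RMSE is the square root of the mean squared difference between actual and estimated values. The sketch below computes it and averages it over several runs; treating "average RMSE" as a plain mean over per-run RMSE values is an assumption, since the text does not say what the average is taken over.

```python
import math
from typing import List, Sequence, Tuple


def rmse(actual: Sequence[float], estimated: Sequence[float]) -> float:
    """Root mean square error between actual and estimated readings."""
    if len(actual) != len(estimated) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - e) ** 2 for a, e in zip(actual, estimated)]
    return math.sqrt(sum(squared) / len(squared))


def average_rmse(runs: List[Tuple[Sequence[float], Sequence[float]]]) -> float:
    """Assumed aggregation: mean of the RMSE values of individual runs."""
    return sum(rmse(a, e) for a, e in runs) / len(runs)
```

A lower value indicates that the estimates lie closer to the actual readings.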
From Figure 8, we can see that CARM gives
the best estimation accuracy of the above
approaches. The AWS and linear series approaches
perform no better than the CARM approach. The
main reason might be that they consider only
the relationships between neighboring nodes,
while CARM finds all of the relationships