Specifically, assuming that sensor data values change monotonically with
distance from the sensed event, violations of this monotonicity property
can be indicative of incorrect data. This insight was used to detect faulty
nodes.
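The monotonicity check described above can be illustrated with a minimal
sketch. It is not the cited authors' algorithm. It assumes one reading per
node, a known distance from each node to the event, and values that should
not increase with distance. The names monotonicity_violations,
suspect_nodes, and min_fraction are hypothetical.

```python
from typing import Dict, List, Tuple

# (node_id, distance_from_event, sensed_value); one reading per node is assumed.
Reading = Tuple[str, float, float]


def monotonicity_violations(readings: List[Reading]) -> Dict[str, int]:
    """Count, for each node, the pairs in which it breaks monotonicity.

    Assumption: the sensed value should not increase with distance from
    the event, so a pair is inconsistent when the farther node reports
    the larger value.
    """
    counts = {node: 0 for node, _, _ in readings}
    for i, (node_a, dist_a, val_a) in enumerate(readings):
        for node_b, dist_b, val_b in readings[i + 1:]:
            # Same sign of both differences means value grows with distance.
            if (dist_a - dist_b) * (val_a - val_b) > 0:
                counts[node_a] += 1
                counts[node_b] += 1
    return counts


def suspect_nodes(readings: List[Reading], min_fraction: float = 0.5) -> List[str]:
    """Return nodes that disagree with more than min_fraction of the others."""
    counts = monotonicity_violations(readings)
    others = len(readings) - 1
    if others <= 0:
        return []
    return sorted(n for n, c in counts.items() if c / others > min_fraction)


if __name__ == "__main__":
    sample = [
        ("a", 1.0, 10.0),
        ("b", 2.0, 8.0),
        ("c", 3.0, 6.0),
        ("d", 4.0, 4.0),
        ("e", 5.0, 20.0),  # farthest node reports the largest value
    ]
    print(suspect_nodes(sample))
```

A single faulty node takes part in many inconsistent pairs, while each
correct node takes part in few, which is why the per-node count rather
than the raw number of violations is thresholded here.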
Much work has been done on memory safety in TinyOS, including Safe
TinyOS [18], Deputy [17], and Neutron [14]. Neutron helps nodes survive
memory safety violations by defining recovery units that can be
individually restarted, and by allowing the programmer to specify precious
state that must be kept across restarts whenever possible. Extensions of
that work to more general model-checking systems ensued [11, 100, 71],
coming close to verifying entire distributed programs written in common
sensor network programming languages such as NesC [31].
Finally, outside sensor networks, using machine learning techniques
to diagnose failures is not new [10, 3, 63, 27, 64, 61]. Examples include
discriminative pattern analysis [27, 66], software behavior graph analy-
sis [63], mining message sequence graphs [57], a Bayesian analysis based
approach [62], and control-flow analysis to identify logical errors [64],
just to name a few. A recent book describes methodologies and appli-
cations of mining software specifications [67].
5. Future Challenges
While a significant amount of work has been done on sensor network
debugging to date, several interesting opportunities remain. In general,
there is room for improving the scalability of current approaches. Bet-
ter data mining tools can be brought to bear to address more subtle
bug patterns. For example, graphs that describe bug triggers should be
weighted to reflect the fact that certain events (vertices in the graph)
cause multiple instances of other events (other vertices), which can be
compactly represented as an edge weight. Hence, discriminative mining
algorithms are needed for weighted graphs.
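The weighted representation suggested above can be sketched as follows.
This is only an illustration of collapsing repeated cause-effect
observations into edge weights, not a discriminative mining algorithm.
The input format (a list of cause/effect event pairs taken from a log)
and the names build_weighted_graph and weight_difference are assumptions
made for the example.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# A directed edge (cause_event, effect_event) between two event vertices.
Edge = Tuple[str, str]


def build_weighted_graph(causal_pairs: Iterable[Edge]) -> Dict[Edge, int]:
    """Collapse repeated (cause, effect) observations into edge weights.

    An event that triggers several instances of another event yields one
    edge whose weight is the number of instances, not several parallel
    edges.
    """
    return dict(Counter(causal_pairs))


def weight_difference(good: Dict[Edge, int], bad: Dict[Edge, int]) -> Dict[Edge, int]:
    """Per-edge weight in a failing run minus weight in a passing run.

    A naive comparison used only to show why weights matter: an edge can
    be present in both runs and still differ sharply in weight.
    """
    edges = set(good) | set(bad)
    return {e: bad.get(e, 0) - good.get(e, 0) for e in edges}


if __name__ == "__main__":
    passing = build_weighted_graph(
        [("boot", "init"), ("timeout", "retransmit")]
    )
    failing = build_weighted_graph(
        [("boot", "init")] + [("timeout", "retransmit")] * 3
    )
    print(weight_difference(passing, failing))
```

With unweighted graphs the two runs in the example would look identical,
since both contain the same two edges. Only the weights separate them.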
The issue of concurrent bugs, where multiple different causes give
rise to common symptoms, is another challenge. It decreases the efficacy
of current discriminative techniques when no single cause has enough
support to definitively account for the observed problem. Diagnosing
rare (anomalous) events is also troublesome, since very few observations
may be available about the anomaly, while a great predominance of data
is obtained during normal operation. Asynchrony (and lack of global
time) in distributed systems further complicates reconstruction of exact
event patterns that cause an anomaly. Since both causes and symptoms
are correlated with the occurrence of problems, disambiguating the two
remains important. It requires the ability to reason about causal links