performed much better on this task and with just one invariant type, this approach
proved successful at identifying line corruptions within the files. In these experiments,
invariant induction could only identify pairs of readings containing an error; however,
the performance of this technique is expected to improve as it is extended to include
other invariants not described here (such as the sum of the real and reactive powers
being zero and topology-dependent range checks).
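The kind of invariant mentioned above can be illustrated with a minimal sketch. It
assumes that readings arrive as a name-to-value mapping and that a group of readings is
expected to sum to zero within a tolerance. The reading names, the tolerance and the
limits are hypothetical and are not taken from the experiments.

```python
from typing import Dict, List, Tuple

def zero_sum_violated(readings: Dict[str, float], group: List[str],
                      tol: float = 1e-6) -> bool:
    """Return True if the readings named in `group` do not sum to zero within `tol`."""
    return abs(sum(readings[name] for name in group)) > tol

def out_of_range(readings: Dict[str, float],
                 limits: Dict[str, Tuple[float, float]]) -> List[str]:
    """Return the names of readings that fall outside their (low, high) limits."""
    return [name for name, (low, high) in limits.items()
            if not (low <= readings[name] <= high)]

# Hypothetical readings: one inflow balanced by two outflows.
clean = {"flow_in": 10.0, "flow_out_a": -6.0, "flow_out_b": -4.0}
corrupt = dict(clean, flow_out_b=-40.0)  # simulated corruption of one reading

group = ["flow_in", "flow_out_a", "flow_out_b"]
limits = {name: (-20.0, 20.0) for name in group}  # assumed limits

print(zero_sum_violated(clean, group), zero_sum_violated(corrupt, group))
print(out_of_range(corrupt, limits))
```

A violated sum only shows that some reading in the group is wrong, which is consistent
with the limitation noted above. A range check, where one applies, can narrow this down
to a single reading.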
These results suggest that the best way to detect anomalies within electricity data is
to combine more than one anomaly-detecting technique. Whilst n-grams perform
better on the identification of corrupt files and at pinpointing small numbers of errors
within files, invariant induction has a better overall performance on the identification
of errors within files. The combined results from both methods could be used to adjust
the weighting on data going into the state estimator.
9 Future Work
The first stage of our future work will be to improve the anomaly detectors by extending
the invariant induction to include more sophisticated equations and by testing the
ability of the n-gram technique to handle encrypted data. Preliminary experiments
have suggested that although some forms of encryption reduce the ability of the
n-gram technique to provide approximate matches, they do not entirely prevent it from
recording normal sequences and identifying deviations.
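The idea of recording normal sequences and identifying deviations can be sketched as
follows. This is a simplified illustration that uses exact n-gram matching only, not
the approximate matching referred to above. The function names, the token
representation and the choice of n are assumptions.

```python
from typing import List, Sequence, Set, Tuple

def ngrams(tokens: Sequence[str], n: int) -> List[Tuple[str, ...]]:
    """Return all contiguous n-grams of `tokens`, in order."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def train(normal_sequences: List[Sequence[str]], n: int) -> Set[Tuple[str, ...]]:
    """Record every n-gram seen in the normal (uncorrupted) sequences."""
    model: Set[Tuple[str, ...]] = set()
    for sequence in normal_sequences:
        model.update(ngrams(sequence, n))
    return model

def deviations(model: Set[Tuple[str, ...]], tokens: Sequence[str], n: int) -> List[int]:
    """Return the start positions of n-grams that were never seen in training."""
    return [i for i, gram in enumerate(ngrams(tokens, n)) if gram not in model]

# Hypothetical data: a repeating normal pattern and a copy with one corrupted token.
model = train([list("abcabcabc")], n=3)
print(deviations(model, list("abcabcabc"), n=3))
print(deviations(model, list("abcxbcabc"), n=3))
```

A single corrupted token affects every n-gram that overlaps it, so the positions
reported form a short run around the error rather than a single index.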
The results so far have indicated that anomaly detection can be improved by combining
several different anomaly detectors. An effective way of doing this would be to
use a Bayesian network to correlate their outputs with other data sources. This should
reduce the false positive rate and would enable more accurate pinpointing of errors. In
Fig. 9, information from the n-gram and invariant anomaly detectors is brought together
with a range checker using the Bayesian network, which works out which
reading has been corrupted and could suggest a pseudo-measurement for that reading.
These correlation techniques can also be extended to integrate information from many
different independent sources and create higher level concepts and beliefs about them.
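As a rough illustration of how detector outputs might be correlated, the sketch below
combines three alarms with Bayes' rule under a naive independence assumption. This is
a much simpler model than a full Bayesian network and is not the network shown in
Fig. 9. The prior and the detection and false alarm rates are invented for the example.

```python
from typing import Dict, Tuple

def posterior_corrupt(prior: float,
                      detectors: Dict[str, Tuple[float, float]],
                      alarms: Dict[str, bool]) -> float:
    """Posterior probability that a reading is corrupt, given detector alarms.

    `detectors` maps a detector name to (detection rate, false alarm rate).
    Detectors are assumed conditionally independent given the true state.
    """
    p_corrupt = prior
    p_clean = 1.0 - prior
    for name, (detect_rate, false_alarm_rate) in detectors.items():
        if alarms[name]:
            p_corrupt *= detect_rate
            p_clean *= false_alarm_rate
        else:
            p_corrupt *= 1.0 - detect_rate
            p_clean *= 1.0 - false_alarm_rate
    return p_corrupt / (p_corrupt + p_clean)

# Hypothetical rates, chosen only for illustration.
detectors = {"ngram": (0.9, 0.05), "invariant": (0.8, 0.10), "range": (0.5, 0.01)}

one_alarm = posterior_corrupt(
    0.01, detectors, {"ngram": True, "invariant": False, "range": False})
all_alarms = posterior_corrupt(
    0.01, detectors, {"ngram": True, "invariant": True, "range": True})
print(one_alarm, all_alarms)
```

With these assumed rates, a single alarm leaves the posterior low, while agreement
between all three detectors raises it sharply. This is the mechanism by which
correlation could reduce false positives.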
The work on improving and correlating the anomaly detectors will be used to study
the interactions between the anomaly detectors and the state estimator. As explained
in Section 2, the state estimator offers an effective way of evaluating the state of the
network, and the purpose here has not been to duplicate its work but to improve it.
Experiments need to be carried out to evaluate the performance of the state estimator
on corrupted data, determine its limitations and then investigate the extent to which
correlated anomaly detection can support it by adjusting the weights on readings and
suggesting pseudo-measurements.
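One way in which anomaly detection might adjust the weights on readings is sketched
below for the simplest possible case: several redundant measurements of a single
quantity, combined by weighted least squares. The rule that scales each weight by one
minus an estimated corruption probability is an assumption made for this example and
is not a method taken from the experiments.

```python
from typing import List

def weighted_estimate(measurements: List[float],
                      base_weights: List[float],
                      p_corrupt: List[float]) -> float:
    """Weighted least-squares estimate of one quantity from redundant measurements.

    Each base weight is scaled by (1 - p_corrupt), so that readings flagged as
    probably corrupt contribute less to the estimate (assumed scaling rule).
    """
    weights = [w * (1.0 - p) for w, p in zip(base_weights, p_corrupt)]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all measurements have zero weight")
    return sum(w * z for w, z in zip(weights, measurements)) / total

# Hypothetical data: the third measurement has been corrupted.
measurements = [1.00, 1.02, 5.00]
base_weights = [1.0, 1.0, 1.0]

unadjusted = weighted_estimate(measurements, base_weights, [0.0, 0.0, 0.0])
adjusted = weighted_estimate(measurements, base_weights, [0.0, 0.0, 0.99])
print(unadjusted, adjusted)
```

In this example the adjusted estimate stays close to the two consistent measurements,
whereas the unadjusted estimate is pulled towards the corrupted value.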
These experiments have tested the anomaly-detecting techniques using data
from an electricity network. A logical next step would be to test them on data from
other SCADA systems, such as those controlling water systems and chemical plants.
These experiments could also be extended to include SCADA control signals.
10 Conclusions
Our results suggest that the two anomaly-detecting methods that we have described
could be used to successfully detect deliberate or accidental corruption of data within