Information Technology Reference
In-Depth Information
Fraud. In the future, the metering of electricity will be done remotely, probably
over IP. Attackers could fraudulently manipulate these readings.
Overload . In the future, electricity companies are likely to have much more control
over demand, for example switching on water heaters in homes when there is low
demand. An attacker could overload the electricity system by switching on all the
electricity devices across the country at a period of high demand.
The most dangerous scenario is a combination of these disruptions, which could
cause a similar loss of control to that experienced when the Legion of Doom took
over Southern Bell's telephone network in 1989 4 .
4
Detecting Anomalous Events in SCADA Systems
This paper will compare two approaches to modelling SCADA data from an electric-
ity network: one that treats the data as text and learns the normal patterns within this
text (n-gram), the other which treats the data as numbers and looks for invariants,
such as mathematical relationships between the numbers (invariant induction).
4.1
N-Gram
This technique was initially developed by Marc Damashek, who used it to classify
texts independently of errors and the language they were written in. N-gram scanning
works by moving a sliding window of width n along a text and recording the number
of occurrences of each sequence of characters in the window. For example, if the
system has to process “The cat sat on the mat” using a sliding window of width two,
“Th”, “he”, “e ” and so on will be read into the database until the entire document (or
string in this case) has been read in. The result is a representation of the document as
a vector containing the relative frequencies of its distinct constituent n-grams, which
can then be used to measure the similarity between documents.
To apply this technique to the data from an electricity network a number of modifi-
cations need to be made. To begin with, this approach is normally error tolerant and
here it was necessary to detect errors rather than tolerate them. In electricity meas-
urements, if a decimal point is dropped or a sign reversed, a radically different read-
ing can result. The n-gram technique is error tolerant because it is essentially a statis-
tical technique that measures the distribution of n-grams in the data. To make it more
error sensitive it was decided to start with a non-statistical n-gram model of the data,
which simply records whether a particular n-gram occurs in the training data or not.
This is very similar to Forrest's stide technique [7], which was used to model the
normal sequences of system calls within a Linux system. To reduce the size of the
normal model it was decided to work with just the first four characters of each
measurement. These included the sign of the reading, the position of the decimal point
and the most significant digits. Each movement of the sliding window was then ad-
vanced four characters along the data so that each successive n-gram covered a new
reading.
4 See Bruce Stirling, The Hacker Crackdown [18] for more on this.
Search WWH ::




Custom Search