One assumption is that anomalous situations might result in clusters of
connections among nodes. This was the initial assumption of Gudkov et al., but
there is reason to believe that a cluster is not what one would expect from an
anomaly. A worm, for instance, that was scanning IP addresses for vulnerable
computers, would be indicated not by a cluster in the matrix but by a high
density of nonzeros in the row and/or column for that node. A collection of nodes
infected with a worm would be indicated by a set of denser “lines” in a set of
rows and columns, but not a cluster. On the other hand, the entropy change
from a cluster will be greater than that from a small list of lines, so if changes
caused by clusters cannot be detected reliably, then changes caused by sets of
lines will be even harder to detect.
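The difference between a cluster and a "line" can be illustrated with a minimal sketch. It builds a small symmetric 0/1 connectivity matrix in which one node scans all the others. The network size, the background connections, the scanner index, and the helper names are all illustrative assumptions, not values from the experiment described here.

```python
def empty_matrix(n):
    """Return an n-by-n connectivity matrix of zeros."""
    return [[0] * n for _ in range(n)]


def connect(c, i, j):
    """Record an undirected connection between nodes i and j."""
    c[i][j] = 1
    c[j][i] = 1


def row_density(c, i):
    """Fraction of off-diagonal entries in row i that are nonzero."""
    n = len(c)
    return sum(c[i][j] for j in range(n) if j != i) / (n - 1)


n = 8  # hypothetical network size
c = empty_matrix(n)

# Sparse background traffic (hypothetical pairs).
for i, j in [(0, 1), (2, 3), (4, 5)]:
    connect(c, i, j)

# Node 6 scans every other node, so its row and column fill with nonzeros.
scanner = 6
for j in range(n):
    if j != scanner:
        connect(c, scanner, j)

densities = [row_density(c, i) for i in range(n)]
```

In this toy case the scanner's row is completely filled while every other row stays sparse, which is the "line" pattern rather than a block of mutually connected nodes.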
The argument of the previous paragraph can also be made regarding the
question of what kinds of attacks might be detectable by this approach. An attack
includes some set of machines involved in higher-than-normal communication
with other machines. The extreme end of higher-than-normal communication is
not just a cluster but a solid block of nonzero entries for the nodes involved in
the attack either as attacking or attacked machines.
Further concerns about the utility of this approach come from questions
about whether it would be feasible to collect the everything-to-everything
connectivity data in a real network. It would be difficult, indeed probably
impossible, to gather data from every node in a network. Further, the return
of that data to a central node for processing would in itself look very much like
an anomalous event. Also, normal traffic is almost certainly not just the random
sending of messages among nodes; there will be daily and weekly fluctuations,
bursts of events, broadcasts to all users, and such. With a very short time
window one would be hard pressed to distinguish an administrative communication
to all machines on a net from an infected machine searching all machines to find
those that might be vulnerable.
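The short-window problem can be sketched as follows. The example assumes a hypothetical event format of (time, source, destination) tuples and a hypothetical helper, window_matrix, that records which pairs communicated during a window. Neither is taken from the experiment. It suggests why a broadcast and a scan from the same node can leave identical connectivity matrices.

```python
def window_matrix(events, n, start, end):
    """Build a symmetric 0/1 matrix from (time, src, dst) events in [start, end)."""
    c = [[0] * n for _ in range(n)]
    for t, src, dst in events:
        if start <= t < end:
            c[src][dst] = 1
            c[dst][src] = 1
    return c


n = 5  # hypothetical network size

# Hypothetical administrative broadcast: node 0 contacts nodes 1..4 in order.
broadcast = [(10 + k, 0, k + 1) for k in range(n - 1)]

# Hypothetical scan: node 0 probes the same nodes in a different order.
scan = [(10 + k, 0, n - 1 - k) for k in range(n - 1)]

# Within one short window, the two matrices carry the same information.
same = window_matrix(broadcast, n, 10, 14) == window_matrix(scan, n, 10, 14)
```

Because the matrix keeps only which pairs communicated, not why or in what order, telling the two cases apart would require information from outside the matrix.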
Again, we do not attempt to address these questions. If under ideal situations
there is insufficient ability to distinguish anomalies from normal behavior with
the proposed entropy metric, then there is little reason to worry about whether
anomalies could be detected under less-than-ideal conditions.
Finally, this paper describes an experiment based on simulated data. We
are in the process of gathering real data for processing. In the event that this
approach shows promise, then it would be necessary to verify simulation results
against real data. However, in an experimental mode it is necessary to begin
with simulated data so that the input to the processing can be predictable and the
presence and severity of an anomaly can be measured.
2 Entropy Functions
Following the method by which Gudkov et al. [3] address the question of entropy
in the network, we first normalize the connectivity matrix C so that $\sum_{i,j} C_{ij} = 1$.
For convenience, we will abuse notation and also refer to this as C in this section.
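The normalization step can be sketched in a few lines. The function name normalize and the 3-node example matrix are assumptions made for illustration. Only the requirement that the entries sum to 1 comes from the text.

```python
def normalize(c):
    """Scale a nonnegative matrix so that all of its entries sum to 1."""
    total = sum(sum(row) for row in c)
    if total == 0:
        raise ValueError("matrix has no connections to normalize")
    return [[x / total for x in row] for row in c]


# Hypothetical 3-node undirected graph: node 0 is connected to nodes 1 and 2.
c = [[0, 1, 1],
     [1, 0, 0],
     [1, 0, 0]]

p = normalize(c)
total_after = sum(sum(row) for row in p)
```

Dividing every entry by the same total preserves symmetry, so the normalized matrix still describes an undirected graph.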
Although our matrix is symmetric, reflecting an undirected graph, we will
intuitively view the values $C_{ij}$ in what follows as representing connections from