used. What is of interest is the logical set of network connections, that is the
set of point-to-point connections of which use has been made. We thus let the
connections of the network be defined by the data and not by a predetermined
description of the underlying hardware.
We also note two characteristics of the matrix C that we will deal with later,
but that we mention only in passing in this introductory discussion. The first is
the fact that the dynamic connections of the network, as defined by its traffic,
are time-varying, but we cannot hope (for reasons of computational efficiency, if
for no other reason) to view them as connections that vary continuously. We will
of necessity deal with the network data in discrete, perhaps overlapping, time
intervals in order to obtain a sequence of snapshots of the network.
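
As a rough illustration of the discrete, possibly overlapping intervals described above, the following sketch slices a list of timestamped records into fixed-width windows. The record layout (timestamp, source, destination) and the names snapshots, width, and step are assumptions made for this example only; they do not come from the original discussion.

```python
from typing import Iterator, List, Tuple

# Hypothetical record layout: (timestamp_seconds, source_node, destination_node)
Record = Tuple[float, str, str]


def snapshots(
    records: List[Record], width: float, step: float
) -> Iterator[Tuple[float, List[Record]]]:
    """Yield (window_start, records_in_window) for windows [start, start + width).

    Consecutive windows overlap whenever step < width.
    """
    if width <= 0 or step <= 0:
        raise ValueError("width and step must be positive")
    if not records:
        return
    t_min = min(r[0] for r in records)
    t_max = max(r[0] for r in records)
    start = t_min
    while start <= t_max:
        # Keep the records whose timestamp falls inside the half-open window.
        yield start, [r for r in records if start <= r[0] < start + width]
        start += step


if __name__ == "__main__":
    demo = [(0.0, "a", "b"), (4.0, "b", "c"), (6.0, "a", "c"), (11.0, "c", "a")]
    for window_start, window in snapshots(demo, width=10.0, step=5.0):
        print(window_start, len(window))
```

Choosing step smaller than width gives overlapping snapshots; choosing step equal to width gives disjoint ones. Each window would then be turned into one matrix C.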
Second, the matrix C can be defined in many ways, depending on one's concept
of “network traffic.” Perhaps the simplest definition is that it is the
adjacency matrix of nodes of the network, representing an undirected graph (and
thus a symmetric matrix) in which nodes are connected if they have exchanged
a message (in either direction) during the time interval over which data was
collected, and not connected otherwise. More complicated matrices can be
constructed by weighting the adjacency matrix to reflect the number of messages
sent, the number of bytes sent, and so forth. Later in this document, when we
discuss the issues of entropy, we will normalize the entries so that the sum of all
entries is 1.
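
The definitions above can be sketched in a few lines of code. This is a minimal illustration, not the authors' implementation: the message layout (source, destination, bytes) and the names build_matrix, normalize, and weight are hypothetical, and the diagonal is simply left at zero here.

```python
from typing import List, Sequence, Tuple

# Hypothetical record layout: (source_node, destination_node, bytes_sent)
Message = Tuple[str, str, int]


def build_matrix(
    messages: Sequence[Message], weight: str = "none"
) -> Tuple[List[str], List[List[float]]]:
    """Build a symmetric matrix C from the messages seen in one time interval.

    weight="none"     -> 0/1 adjacency matrix of the undirected graph
    weight="messages" -> entries count the messages exchanged by each pair
    weight="bytes"    -> entries sum the bytes exchanged by each pair
    """
    if weight not in ("none", "messages", "bytes"):
        raise ValueError("weight must be 'none', 'messages', or 'bytes'")
    nodes = sorted({node for src, dst, _ in messages for node in (src, dst)})
    index = {node: i for i, node in enumerate(nodes)}
    c = [[0.0] * len(nodes) for _ in nodes]
    for src, dst, nbytes in messages:
        i, j = index[src], index[dst]
        if i == j:
            continue  # assumption: a node's traffic to itself is ignored
        if weight == "none":
            c[i][j] = c[j][i] = 1.0
        else:
            increment = 1.0 if weight == "messages" else float(nbytes)
            c[i][j] += increment
            c[j][i] += increment
    return nodes, c


def normalize(c: List[List[float]]) -> List[List[float]]:
    """Scale the entries so that they sum to 1 (assumes a zero diagonal)."""
    total = sum(sum(row) for row in c)
    if total == 0:
        raise ValueError("matrix has no traffic to normalize")
    return [[value / total for value in row] for row in c]


if __name__ == "__main__":
    msgs = [("a", "b", 100), ("b", "a", 50), ("a", "c", 10)]
    nodes, adjacency = build_matrix(msgs)
    print(nodes)
    print(normalize(adjacency))
```

Because every message updates both C[i][j] and C[j][i], the result is symmetric regardless of the direction in which the message was sent.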
Finally, we must deal with the diagonal entries of C. In keeping with the
proposal made by Gudkov, Johnson, Madamanchi, and Sidoran [3], we place in
the diagonal entries of the matrix the negative of the row (or column, since the
matrix is assumed symmetric) density off the diagonal. This is done by Gudkov
et al. so as to obtain a matrix that represents a Markov process and thus to be
able to argue that a deeper analysis based on the theory of Markov processes
is relevant. In what follows here we in fact never use the diagonal entries of the
matrix, so the actual values assigned to them are not relevant.
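
For concreteness, one possible reading of the diagonal convention is sketched below. It assumes that the "density off the diagonal" of a row means the sum of that row's off-diagonal entries, so that every row of the resulting matrix sums to zero, as in the generator matrix of a continuous-time Markov process. That reading is an assumption of this example, and the function name with_markov_diagonal is hypothetical.

```python
from typing import List


def with_markov_diagonal(c: List[List[float]]) -> List[List[float]]:
    """Return a copy of C whose diagonal holds the negated off-diagonal row sums.

    Assumption: "row density off the diagonal" is read as the sum of the
    off-diagonal entries of the row, so each row of the result sums to zero.
    """
    n = len(c)
    out = [row[:] for row in c]
    for i in range(n):
        out[i][i] = -sum(c[i][j] for j in range(n) if j != i)
    return out


if __name__ == "__main__":
    c = [[0.0, 1.0, 1.0], [1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
    for row in with_markov_diagonal(c):
        print(row, sum(row))
```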
The matrix C will change over time as the dynamic connections change. If
we were to view the network as a graph, and we had a sequence of matrices,
then we could (in theory) view the graphical images of the graphs over time
and detect changes in the network that would represent anomalous behavior
and/or intrusions. The proposal of Gudkov et al. is that one can apply entropy
functions to these matrices, and that the changes in the entropy functions will
reflect changes in the matrix (and by extension, the network) in a useful way.
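
To make the idea of tracking an entropy function over a sequence of matrices concrete, the sketch below computes one entropy value per snapshot. It uses the ordinary Shannon entropy of the normalized off-diagonal entries purely as a stand-in; the entropy functions actually proposed by Gudkov et al. are not specified in this passage and may differ. The names shannon_entropy and entropy_series are hypothetical.

```python
import math
from typing import List, Sequence


def shannon_entropy(c: List[List[float]]) -> float:
    """Shannon entropy (in bits) of the off-diagonal entries of C.

    The off-diagonal entries are rescaled to sum to 1 before use; the
    diagonal is ignored. This is a stand-in, not the cited authors' metric.
    """
    n = len(c)
    weights = [
        c[i][j] for i in range(n) for j in range(n) if i != j and c[i][j] > 0
    ]
    total = sum(weights)
    if total == 0:
        return 0.0  # no traffic in this snapshot
    return -sum((w / total) * math.log2(w / total) for w in weights)


def entropy_series(matrices: Sequence[List[List[float]]]) -> List[float]:
    """One entropy value per snapshot, in snapshot order."""
    return [shannon_entropy(c) for c in matrices]


if __name__ == "__main__":
    star = [[0.0, 1.0, 1.0], [1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
    single_edge = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    print(entropy_series([star, single_edge]))
```

A sharp change between consecutive values in the series is the kind of signal the proposal hopes will correspond to a meaningful change in the network.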
Caveats About the Real World
There are a number of assumptions about the real world that may or may not
be true and which would affect the ability of an entropy metric as mentioned
here to detect anomalous situations in a network. On the one hand, verifying
that these assumptions were true would be important if one were to determine
that this version of an entropy approach were viable for detecting anomalies
in a network. On the other hand, if our analysis suggests that the approach is
not viable even if the assumptions were true, then the matter of verifying the
assumptions becomes moot.