from traffic data. To use the entropy functions in a viable system for detecting anomalous behavior, one must calibrate these functions to determine their predictive capability. In an operational setting, one could imagine a constant recomputation of entropies and a comparison of the values computed against a baseline of "normal" behavior. The goal would be to know that abnormal behavior changes the values computed in a definable, measurable, predictable way, so that such changes could be used to trigger the alarm bells and the necessary responses to what would be presumed to be an attack or other anomaly.
All software was developed and run on a Red Hat Linux system with the gcc compiler. This is relevant only in that the random numbers used were generated by the built-in rand() function. We acknowledge that this sequence of pseudorandom numbers may not satisfy high-grade tests for randomness. Some of our tests were repeated with a better random number generator, and the change in the results was too small to be relevant to our basic conclusion at the end of this paper.
3.1 Assumptions
In order for the entropy functions of the previous section to be applicable, it is
necessary that the underlying input data be compatible with the computation
of these entropies. Specifically, we assume for the purposes of calibrating these
functions that we have a matrix of n rows and n columns, representing n nodes on the network, with n ≈ 10000 as a ballpark estimate. We would expect n < 5000 to be too small to be of interest and n > 50000 to be perhaps too large. The entropy measures are global measures of network behavior; absent an incremental approach or a method for rapidly determining a subset of the connectivity matrix on which to focus, we would expect an O(n^2) or worse computation for n > 50000 to be prohibitive for real time. We assume also
that there is a background density of connections between nodes, and we take
that density to be in the range of 5% to perhaps 15%. Finally, the underlying
assumption in the use of these entropies is that, when properly viewed, the
matrix will have a nonrandom structure. In Gudkov et al. and in this work
we look at clusters that could be seen (with an appropriate permutation of the
node subscripts) as denser blocks along the diagonal. Anomalies that scanned, for
example, all the nodes in a subnet local to the infected machine would result in
rows and/or columns of the matrix that were much denser than the background.
We admit that the assumptions of the previous paragraph are in fact just assumptions. In another part of the larger project of which this work is a part, we are studying real data from networks to determine whether the above assumptions are justified and how the simulated data would have to change in order to be more realistic. But these assumptions must be stated in order to understand why the parameters of our experimental data have been chosen as they have. We postulate, however, for the purpose of initial study, that we could calibrate these entropy functions by studying the following independent variables.