1. The overall size of the network. This is the number of rows (and also of columns)
in the matrix C. We will study sizes ranging from approximately 100 up
through approximately 10000.
2. The background density. This is the probability that one node will be connected
to another at random. We will assume, until shown to be in error, that
these probabilities fall in the range 0.05 to about 0.15.
3. The number of clusters in the network.
4. The sizes of the clusters.
5. The densities of the clusters.
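As a purely illustrative sketch, the five parameters above might be gathered into a single configuration object; the names and defaults below are our own choices, not taken from the software described later:

```python
from dataclasses import dataclass, field

@dataclass
class SimulationParameters:
    """Illustrative container for the five experimental variables.
    Names and defaults are ours; the paper does not specify an interface."""
    network_size: int = 1000            # rows (and columns) of C, ~100 to ~10000
    background_density: float = 0.10    # random connection probability, 0.05 to 0.15
    cluster_sizes: list = field(default_factory=lambda: [100, 50, 25])
    cluster_densities: list = field(default_factory=lambda: [0.75, 0.75, 0.75])  # 0.50 to 0.90
```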
It is the latter three variables that require justification. We assume that in a
large network, such as a university, departments, colleges, and other units
will appear in the connectivity matrix as clusters, because the nodes in these
units have reason to communicate with each other more frequently
than would be observed for the background random activity. If one had
complete information about the network traffic (which would require an NP-
complete computation), then one could, for any chosen threshold that
defines a cluster, rearrange the matrix C into a block-diagonal form. In the
absence, at present, of any real data contradicting this assumption, we will assume
that the number of clusters of a given size follows a Zipf-like distribution,
varying inversely with cluster size, and we will generate simulated
data accordingly. For our initial experiments we have chosen cluster densities in
the range 0.50 to 0.90. We have chosen initially to study two types of cluster
structure. The first is a single cluster of varying size that could in fact be the
entire network. This follows the approach of Gudkov et al. in examining the difference of
entropies for a single cluster as it grows from a small size eventually to become
the entire network. The second study is motivated by an assumption about how
C might change for a network experiencing an anomaly. We begin with a series
of clusters of decreasing size, computing the entropies as we go, to establish the
parameters for a “normal” state. We then introduce a moderately large cluster
(on the order of 10% of the entire network) that we might postulate to arise
from a newly-infected computer that has begun an attack.
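The Zipf-like assumption above can be sketched as a small sampling routine. The function name, size bounds, and use of Python's `random.choices` are our illustrative choices, not the paper's code:

```python
import random

def sample_cluster_sizes(n_clusters, max_size, min_size=5, seed=0):
    """Draw cluster sizes so that the expected number of clusters of a
    given size varies inversely with that size (a Zipf-like distribution).
    Illustrative sketch only; the paper's generator is not shown."""
    rng = random.Random(seed)
    sizes = list(range(min_size, max_size + 1))
    # weight each candidate size s by 1/s, per the inverse-size assumption
    weights = [1.0 / s for s in sizes]
    return [rng.choices(sizes, weights=weights)[0] for _ in range(n_clusters)]
```

With such a sampler, small clusters dominate the draw while large clusters remain possible, matching the stated inverse relationship.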
3.2 The Software Artifact
A brief description of the software is in order. Our program takes as input a
set of parameters that includes the matrix size, the background density, and the
number, size, and density of the clusters to be simulated. Calls to rand() are
made to fill in the background of a symmetric matrix of the appropriate density,
and the background entropy is computed. Following this, the simulated clusters
are added one at a time and the entropy recomputed. An overall outer loop
controls the number of such tests to be made. The entropy calculations
themselves are effected by a simple double loop through the rows and columns of
the matrix (which, for programming convenience, is represented in dense
form). The code was written for simplicity and flexibility, not for performance;
since even for the larger matrices the running times were at worst a few minutes,
we made no attempt to improve the efficiency of the code where that would have
added complexity or decreased the flexibility.
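A minimal sketch of the pipeline just described, namely background fill, cluster insertion, and an entropy pass over the dense matrix, might look as follows. The entropy shown here is a placeholder (Shannon entropy of the degree distribution); the paper's actual calculation, following Gudkov et al., may well differ, and all names are ours:

```python
import math
import random

def make_matrix(n, background_density, rng):
    """Symmetric 0/1 connectivity matrix in dense form, no self-loops."""
    c = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < background_density:
                c[i][j] = c[j][i] = 1
    return c

def add_cluster(c, nodes, density, rng):
    """Raise the connection density among the chosen cluster nodes."""
    for a in range(len(nodes)):
        for b in range(a + 1, len(nodes)):
            if rng.random() < density:
                i, j = nodes[a], nodes[b]
                c[i][j] = c[j][i] = 1

def entropy(c):
    """Placeholder entropy: Shannon entropy of the degree distribution,
    computed by a double loop over the dense matrix (via row sums)."""
    degrees = [sum(row) for row in c]
    total = sum(degrees)
    h = 0.0
    for d in degrees:
        if d:
            p = d / total
            h -= p * math.log(p)
    return h
```

A run in the spirit of the paper's experiments would compute the background entropy first, then add clusters one at a time and recompute, e.g. `c = make_matrix(200, 0.10, rng)`, `h0 = entropy(c)`, then `add_cluster(c, list(range(20)), 0.80, rng)` followed by a second call to `entropy`.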