Passive External Web Surveillance Technique for Private Networks - Computer Network Security


assignment would be essentially arbitrary. Fragments that are too ambiguous to

evaluate in the context of this experiment would be similarly confusing to the

analyst in practice. Measuring the quality of such fragments is pointless; they

are all bad. For this reason, the metrics are not applied to ambiguous fragments.

Instead, the fragments are counted separately, and presented as an index of

ambiguity, indicating one aspect of the performance of the LCA overall.

\[ \mathit{Ambiguity} = \frac{\mathit{AmbiguousFragments}}{\mathit{AllFragments}} \tag{4} \]
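As a minimal sketch of how the index in Equation (4) could be computed, the snippet below counts ambiguous fragments against all fragments. The function name `ambiguity_index` and the representation of fragments as one boolean flag each are assumptions made for illustration; they are not taken from the original experiment, and the sample data is invented.

```python
def ambiguity_index(ambiguous_flags):
    """Fraction of fragments flagged ambiguous, following Equation (4).

    ambiguous_flags: one boolean per fragment (hypothetical representation).
    """
    flags = list(ambiguous_flags)
    if not flags:
        raise ValueError("need at least one fragment")
    # True counts as 1, so the sum is the number of ambiguous fragments.
    return sum(flags) / len(flags)


# Illustrative data only: 3 ambiguous fragments out of 100.
flags = [True] * 3 + [False] * 97
ambiguity = ambiguity_index(flags)
```

Ambiguous fragments are only counted here; under the scheme described above they would be excluded before any quality metric is applied.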

4.5 Trivial Fragments

By definition, fragments made up of one connection element always match one

session and have unit accuracy. Their effect is to increase the aggregate accuracy

in a meaningless way. For example, if half of all fragments are trivial, the

aggregate accuracy is guaranteed to be at least 0.5. This is an artificially inflated

score that does not represent the accuracy of non-trivial fragments. To correct

this, accuracy is not measured for trivial fragments, and aggregate results are

presented with a triviality score.

\[ \mathit{Triviality} = \frac{\mathit{TrivialFragments}}{\mathit{AllFragments}} \tag{5} \]
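The inflation effect and the triviality score of Equation (5) can be illustrated with a short sketch. It assumes that aggregate accuracy is a plain mean over fragments, which the text does not state explicitly, and it uses a hypothetical layout of `(connection_count, accuracy)` pairs with invented sample values.

```python
from statistics import mean


def summarize(fragments):
    """Return (triviality, mean accuracy over all, mean accuracy over non-trivial).

    fragments: list of (connection_count, accuracy) pairs (hypothetical layout).
    A fragment with exactly one connection is treated as trivial.
    """
    if not fragments:
        raise ValueError("need at least one fragment")
    nontrivial = [acc for count, acc in fragments if count > 1]
    trivial_count = len(fragments) - len(nontrivial)
    triviality = trivial_count / len(fragments)
    # Assumption: the aggregate is an unweighted mean of fragment accuracies.
    inflated = mean(acc for _, acc in fragments)
    corrected = mean(nontrivial) if nontrivial else None
    return triviality, inflated, corrected


# Illustrative data only: two trivial fragments (accuracy 1.0 by definition)
# and two non-trivial fragments with low accuracy.
sample = [(1, 1.0), (1, 1.0), (4, 0.25), (6, 0.5)]
triviality, inflated, corrected = summarize(sample)
```

With half of the sample trivial, the mean over all fragments (0.6875) cannot fall below 0.5, while the mean over the non-trivial fragments alone is 0.375. Reporting the latter together with the triviality score keeps the two effects separate.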

5 Results

5.1 Trivial and Ambiguous Fragments

Trivial fragments accounted for 5.25% to 9.33% of all fragments in Test A and

12.81% to 16.81% in Test B. The larger number of trivial fragments in Test B is

to be expected, as the naive method of Test A chains connections into fragments

much more readily than the discerning heuristic of Test B. It is important to

mention that some fragments were small because the sessions themselves were

small. Specifically, 3.48% to 7.21% of actual user sessions were trivial.

Ambiguous fragments accounted for 2.25% to 4.41% of all fragments in Test

A and 1.14% to 4.02% in Test B. There was no statistically significant difference

in ambiguity between the two methods.

5.2 Coverage

The distributions of coverage scores for Tests A and B are shown in Figures 8

and 9. The coverage of the fragments isolated by the heuristic appears to be

exponentially distributed, with about 75% of them having session coverage less

than 25%. The naively isolated fragments are distributed quite differently, with

generalized peaks at coverages both below and above 50%.
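To give a feel for what the reported figures would imply, the sketch below takes the heuristic's coverage to be exactly exponential, so that P(X < x) = 1 - exp(-rate * x), and solves for the rate at which about 75% of fragments have coverage below 25%. This is an idealization: the text only says the distribution appears exponential, coverage is bounded at 100%, and the helper `exponential_rate` is a hypothetical name, not part of the original work.

```python
import math


def exponential_rate(threshold, probability):
    """Rate of an exponential distribution with P(X < threshold) = probability."""
    if not (0.0 < probability < 1.0) or threshold <= 0.0:
        raise ValueError("need 0 < probability < 1 and threshold > 0")
    # Invert the CDF 1 - exp(-rate * threshold) = probability.
    return -math.log(1.0 - probability) / threshold


# Coverage expressed as a fraction of the session (0.25 means 25%).
rate = exponential_rate(0.25, 0.75)
mean_coverage = 1.0 / rate                # mean of an exponential distribution
median_coverage = math.log(2.0) / rate    # median of an exponential distribution
```

Under this idealization the implied mean coverage is roughly 18% of a session and the median is 12.5%, which agrees with the description of most heuristic fragments covering only a small part of their session.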

5.3 Accuracy

The distributions of fragment accuracy for Tests A and B are shown in Figures 10

and 11. The figures show clearly that the heuristic isolates fragments that are

much more accurate than those of the naive method.
