Expert-Based Fusion Algorithm of an Ensemble of Anomaly Detection Algorithms - Technologies and Applications of Artificial Intelligence


The need for such a fusing system stems from the fact that many ADAs suffer from a certain percentage of error. By fusing and aggregating the outputs of multiple ADAs, we aim to minimize the error percentage as much as possible; in particular, we intend to maximize the recall rate of the process. An illustrative example is the case of a computer system administrator who aims to identify and block any attack or malicious program targeting the system [19-22]. Another, noncriminal example of an anomaly detection scenario is the case of countries that are highly vulnerable to typhoons, which aim to identify approaching storms and to act in a way that minimizes potential damage [23].

An ideal ADA satisfies two conditions: (i) a True Positive Rate (TPR) equal to 1, where the TPR is the fraction of all truly positive instances that were classified as positive (this measure is also known as the recall rate or the sensitivity), and (ii) a False Positive Rate (FPR) equal to 0, where the FPR is the fraction of all truly negative instances that were misclassified as positive (the FPR is also known as the false alarm rate).
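
To make the two rates concrete, the following minimal sketch computes them from binary labels. The function name `rates`, the convention that 1 marks an anomaly, and the toy label vectors are assumptions made for illustration; they do not come from the paper.

```python
def rates(y_true, y_pred):
    """Return (TPR, FPR) for binary labels, where 1 marks an anomaly (assumed convention)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # anomalies that were flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # anomalies that were missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # normal instances flagged (false alarms)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # normal instances left alone
    tpr = tp / (tp + fn) if (tp + fn) else 0.0  # recall / sensitivity
    fpr = fp / (fp + tn) if (fp + tn) else 0.0  # false alarm rate
    return tpr, fpr


# Hypothetical ground truth and ADA decisions for eight instances.
y_true = [1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 0, 1, 0]
tpr, fpr = rates(y_true, y_pred)
print(f"TPR={tpr:.3f} FPR={fpr:.3f}")
```

In this toy case two of the three anomalies are caught and one of the five normal instances raises a false alarm, so the detector is far from the ideal pair (TPR = 1, FPR = 0).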

Assuming that the ADAs in a set do not overlap and are independent, we may fuse them with a simple OR operation. Namely, it is sufficient that a single ADA decides that a certain input instance is an outlier for the final decision to be an outlier. Unfortunately, this assumption rarely holds in practice. Therefore, in this paper we propose a fusing method that will achieve a false alarm rate smaller than that of each individual ADA.
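
The OR operation described above can be sketched in a few lines. The second helper is a side calculation that is not in the paper: if the ADA decisions were truly independent, the fused rates would follow from elementary probability as 1 minus the product of the individual complement rates. All names and numbers below are illustrative assumptions.

```python
from math import prod


def or_fuse(decisions):
    """Flag an instance as an outlier if at least one ADA flags it.

    `decisions` holds one list of 0/1 decisions per ADA (hypothetical layout).
    """
    return [int(any(votes)) for votes in zip(*decisions)]


def or_rates_if_independent(tprs, fprs):
    """TPR and FPR of the OR-fused detector, assuming independent ADA decisions."""
    fused_tpr = 1 - prod(1 - t for t in tprs)
    fused_fpr = 1 - prod(1 - f for f in fprs)
    return fused_tpr, fused_fpr


# Hypothetical decisions of three ADAs on five instances.
ada_a = [1, 0, 0, 1, 0]
ada_b = [0, 0, 1, 1, 0]
ada_c = [0, 0, 0, 1, 1]
fused = or_fuse([ada_a, ada_b, ada_c])

# Hypothetical per-ADA rates.
fused_tpr, fused_fpr = or_rates_if_independent([0.7, 0.6, 0.8], [0.05, 0.10, 0.02])
print(fused, round(fused_tpr, 4), round(fused_fpr, 4))
```

Under the independence assumption the OR rule raises recall, but the fused false alarm rate also exceeds that of every individual ADA, which is one way to see why a plain OR cannot by itself reach the stated goal of a lower false alarm rate.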

Against this background, in this paper we propose a two-phase mechanism. In the first phase, an offline process classifies all the given ADAs into clusters based on their expertise. In the second phase, an online and continuous process fuses the decisions of all the ADAs in a way that promotes, for each given type of outlier, the expert ADAs that were identified in the first phase.

Next we provide a preliminary experiment that deals with the case of a single type of outlier. Here we focus on a more basic debate in the literature about how to unify the scores given by different ADAs, which are drawn from different scales and ranges; this step is termed the normalization phase. We show a way to overcome the problem of normalization by using ranking in its place, which was found to perform better than normalization.
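
The contrast between normalizing and ranking can be illustrated with a small sketch. It is not the paper's procedure: min-max scaling stands in for "normalization", average ranks stand in for "ranking", and fusion is a plain mean across ADAs. The function names and the toy scores are assumptions.

```python
def min_max(scores):
    """Rescale scores linearly to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def ranks(scores):
    """Replace each score by its rank in 1..n (higher score, higher rank).

    Tied scores share the mean of the ranks they occupy.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    result = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of instances tied with position i.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def fuse(score_lists, transform):
    """Apply `transform` to each ADA's scores, then average per instance."""
    transformed = [transform(s) for s in score_lists]
    return [sum(col) / len(transformed) for col in zip(*transformed)]


# Hypothetical outlier scores from two ADAs on five instances.
ada_a = [0.10, 0.20, 0.15, 0.90, 0.12]  # bounded scale
ada_b = [3.0, 250.0, 5.0, 40.0, 4.0]    # unbounded scale with one extreme value
by_norm = fuse([ada_a, ada_b], min_max)
by_rank = fuse([ada_a, ada_b], ranks)
print(by_norm)
print(by_rank)
```

Ranking discards the magnitude of each score and keeps only its order, so a single extreme value on one ADA's scale cannot compress that ADA's remaining scores the way min-max scaling does.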

The paper is structured as follows. In Section 2 we describe the current state of the art. In Section 3 we provide details about the general fusing structure. Then, in Section 4 we describe our proposed expertise-based fusing mechanism. In Section 5 we describe our simulation and provide initial results. Finally, we conclude in Section 6 and discuss future work.

2 Related Work

Information or data fusion has been widely researched in the last decade [3, 4].

According to Ahmed and Pottie [4], data fusion is the process by which a data
