Expert-Based Fusion Algorithm of an Ensemble of Anomaly Detection Algorithms - Technologies and Applications of Artificial Intelligence


1. Point Anomalies: If an individual data instance can be considered

anomalous with respect to the rest of the data, then the instance is termed

a point anomaly (e.g., a fraudulent credit card transaction). This is the

simplest type of anomaly and is the focus of the majority of research on

anomaly detection.

2. Contextual Anomalies: The anomalous behavior is determined using the

values for the behavioral attributes within a specific context. For example,

suppose an individual usually has a weekly shopping bill of 100 except during

the Christmas week, when it reaches 1000. A new purchase of 1000 in a week

in July will be considered a contextual anomaly, since it does not conform to

the normal behavior of the individual in the context of time even though the

same amount spent during the Christmas week would be considered normal.

3. Collective Anomalies: If a collection of related data instances is anomalous

with respect to the entire data set, it is termed a collective anomaly. Examples

include a low value that persists for an abnormally long period of time, where

the low value is not in itself anomalous, or a typical Web-based attack by a

remote machine followed by copying of data from the host computer to a remote

destination via FTP.

In our research we focus on ADAs of the first type, i.e., point anomaly

detection.
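
To make the distinction between the anomaly types concrete, the weekly shopping example above can be sketched in a few lines of Python. This is only an illustration, not part of the method described here: the context labels (`regular`, `christmas`), the extra history values, the helper names, and the `max_ratio` threshold are all assumptions chosen for the example.

```python
from statistics import median
from typing import Dict, List, Tuple

# Hypothetical weekly bills as (context, amount) pairs, loosely mirroring the
# example in the text: about 100 in a regular week, about 1000 at Christmas.
history: List[Tuple[str, float]] = [
    ("regular", 100.0), ("regular", 95.0), ("regular", 110.0), ("regular", 100.0),
    ("christmas", 1000.0), ("christmas", 950.0),
]


def typical_by_context(records: List[Tuple[str, float]]) -> Dict[str, float]:
    """Return the median amount observed in each context."""
    grouped: Dict[str, List[float]] = {}
    for context, amount in records:
        grouped.setdefault(context, []).append(amount)
    return {context: median(values) for context, values in grouped.items()}


def is_contextual_anomaly(context: str, amount: float,
                          typical: Dict[str, float],
                          max_ratio: float = 3.0) -> bool:
    """Flag an amount that exceeds max_ratio times the typical value of its context."""
    return amount > max_ratio * typical[context]


def exceeds_global_typical(amount: float, records: List[Tuple[str, float]],
                           max_ratio: float = 3.0) -> bool:
    """Context-free check: compare the amount to the median of all records."""
    return amount > max_ratio * median(a for _, a in records)


typical = typical_by_context(history)
july_flag = is_contextual_anomaly("regular", 1000.0, typical)         # a week in July
christmas_flag = is_contextual_anomaly("christmas", 1000.0, typical)  # Christmas week
global_flag = exceeds_global_typical(1000.0, history)
```

Under these assumptions the bill of 1000 is flagged in a regular week but not in the Christmas week, whereas the context-free check flags it in both cases, which is why the context has to be part of the decision for this anomaly type.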

This group of ADAs is also termed an ensemble of ADAs. From this ensemble

we aim to fuse scores/decisions to reach the final score/decision. We assume

the input data to be behavioral, i.e., temporal and

sequential. We define an outlier to be an input instance that substantially differs

from previous time series data for which no a priori knowledge exists. An example

of an outlier in the Web-based environment is a data packet in the flow between

two remote computers that contains a malicious attack. Another example, from the

weather monitoring domain, is a set of out-of-range attribute values that

may indicate an approaching storm.
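
As a rough illustration of this definition, the sketch below flags an instance when it deviates strongly from a trailing window of earlier observations, using no knowledge other than the series itself. The z-score test, the function name `trailing_outliers`, and the `window` and `threshold` parameters are assumptions made for this example; the ADAs in the ensemble may use entirely different criteria.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def trailing_outliers(series: Sequence[float], window: int = 5,
                      threshold: float = 3.0) -> List[int]:
    """Return indices whose value deviates from the preceding `window` values
    by more than `threshold` standard deviations."""
    flagged: List[int] = []
    for t in range(window, len(series)):
        past = series[t - window:t]
        mu = mean(past)
        sigma = pstdev(past)
        if sigma == 0:
            # A constant history: any different value counts as a deviation.
            is_outlier = series[t] != mu
        else:
            is_outlier = abs(series[t] - mu) / sigma > threshold
        if is_outlier:
            flagged.append(t)
    return flagged


# Hypothetical sensor readings with a single spike at index 6.
readings = [10, 11, 9, 10, 11, 10, 50, 10, 11]
flagged = trailing_outliers(readings)
```

Note that the first `window` instances are never scored, since they have no sufficient history to compare against.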

The output of each ADA may use a different scoring/ranking range. Therefore,

in order to enable a meaningful integration or fusion of these values, we must

first apply a normalization phase that brings the ADAs' scores onto a common

scale. The simplest way of bringing outlier scores onto a normalized scale is to

apply a linear transformation such that the minimum (maximum) occurring score

is mapped to 0 (1) [11]. In the experiments section we show that ranking may

replace the normalization and even outperform it.
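
The two options mentioned above can be sketched as follows. The linear transformation maps a score s to (s - min) / (max - min); the rank-based variant replaces each score by its rank scaled to [0, 1]. The function names, the handling of ties (average rank) and of constant score lists, and the sample scores are assumptions made for illustration, not details taken from [11] or from the experiments.

```python
from typing import List, Sequence


def min_max_normalize(scores: Sequence[float]) -> List[float]:
    """Linearly map scores so that the minimum becomes 0 and the maximum 1."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All scores are identical, so no instance stands out.
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def rank_normalize(scores: Sequence[float]) -> List[float]:
    """Replace each score by its rank (ties share the average rank), scaled to [0, 1]."""
    n = len(scores)
    if n == 1:
        return [0.0]
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the run of scores tied with position i.
        while j + 1 < n and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return [r / (n - 1) for r in ranks]


# Two hypothetical ADAs scoring the same four instances on different ranges.
ada_a = [1.0, 2.0, 5.0, 3.0]
ada_b = [10.0, 20.0, 90.0, 50.0]

norm_a = min_max_normalize(ada_a)   # [0.0, 0.25, 1.0, 0.5]
norm_b = min_max_normalize(ada_b)   # [0.0, 0.125, 1.0, 0.5]
rank_a = rank_normalize(ada_a)
rank_b = rank_normalize(ada_b)
```

In this toy case the two ADAs order the instances identically, so their rank-based values coincide even though their min-max values differ for the second instance. Ranking discards the magnitude of the scores and keeps only their order.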

4 Expertise Based Fusion Algorithm

In this section we present our method for fusing the scores/decisions of an

ensemble of ADAs. Our underlying assumption is that there may be multiple types

of outliers, and that some ADAs may be experts in the detection of a certain

type or of multiple types of outliers while having a very low predictive

ability regarding other types of outliers.
