Similarly, for an "x-vote union" we take the x-th smallest ranking as the score
of the ensemble, e.g., for Union-2:

A: min_2(1, 3, 1) = 1
B: min_2(2, 2, 3) = 2
C: min_2(3, 1, 2) = 2

A point with score n appeared in the top-n outlier list of at least x detectors.
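A minimal sketch of this rank aggregation in Python (the function name and array layout are our own illustration, not the paper's code):

```python
import numpy as np

def x_vote_union(rankings, x):
    """x-vote union: the ensemble score of each point is the x-th smallest
    of its per-detector ranks. `rankings` is an (n_detectors, n_points)
    array where rankings[d, i] is the rank of point i under detector d
    (rank 1 = most outlying)."""
    # Sort each point's ranks across detectors and take the x-th smallest.
    sorted_ranks = np.sort(rankings, axis=0)
    return sorted_ranks[x - 1, :]

# Toy example matching the text: three detectors ranking points A, B, C.
rankings = np.array([[1, 2, 3],
                     [3, 2, 1],
                     [1, 3, 2]])
print(x_vote_union(rankings, x=2))  # -> [1 2 2], i.e. A=1, B=2, C=2
```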
The ALOI outlier dataset was used for the experiment. Some details are given
below:
1. The ALOI [17] dataset is a set of 110250 color images of 1000 small
objects taken under varying conditions (i.e., roughly 110 pictures per
object).
2. In order to be suitable for use as an outlier dataset, each ALOI image was
converted into an RGB histogram representation with 3 bins per color channel
(a sketch of this conversion follows the list), and the number of images was
reduced to 50000, of which 1508 are outliers.
3. To create these outliers, 1-5 images were taken from the photo galleries of
each of 562 objects, giving a total of 1508 images to be used as outliers,
while the remaining image galleries were left intact to serve as non-outliers.
The result was a dataset of 50000 instances with a dimensionality of 27.
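The paper does not spell out the histogram construction; a joint 3 × 3 × 3 RGB histogram is one reading that is consistent with the stated 27 dimensions. A hedged sketch:

```python
import numpy as np

def rgb_histogram(image, bins=3):
    """Joint RGB histogram with `bins` bins per channel (3**3 = 27 features).
    `image` is an (H, W, 3) uint8 array. This is an illustrative guess at the
    preprocessing; the text only states 3 bins per channel and 27 dimensions."""
    pixels = image.reshape(-1, 3).astype(float)
    hist, _ = np.histogramdd(pixels, bins=(bins, bins, bins),
                             range=((0, 256),) * 3)
    hist = hist.flatten()
    return hist / hist.sum()  # normalize to a distribution over the 27 bins
```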
For our candidate algorithms we used KNN, Aggregated KNN, LOF [24],
LDOF [25] and LoOP [11], all of which have a single parameter k. The value of
k was varied from 3 to 30, for a total of 5 × 28 = 140 candidates.
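The exact implementations used in the paper are not reproduced here; as an illustration, a score matrix for part of the pool (KNN, Aggregated KNN and LOF; LDOF and LoOP are not available in scikit-learn and are omitted, so this sketch yields 3 × 28 = 84 candidates rather than 140) could be generated along these lines. All function names are ours:

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor, NearestNeighbors

def knn_scores(X, k, aggregate=False):
    """Outlier score = distance to the k-th neighbour (KNN), or the mean of
    the k nearest-neighbour distances (Aggregated KNN)."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)   # +1: the point itself
    dists, _ = nn.kneighbors(X)
    dists = dists[:, 1:]                              # drop the self-distance
    return dists.mean(axis=1) if aggregate else dists[:, -1]

def lof_scores(X, k):
    """LOF scores: larger means more outlying."""
    lof = LocalOutlierFactor(n_neighbors=k).fit(X)
    return -lof.negative_outlier_factor_

def build_candidates(X, k_values=range(3, 31)):
    """Return an (n_candidates, n_points) score matrix for the detector pool."""
    pool = []
    for k in k_values:
        pool.append(knn_scores(X, k, aggregate=False))
        pool.append(knn_scores(X, k, aggregate=True))
        pool.append(lof_scores(X, k))
    return np.vstack(pool)
```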
In the comparison we included our proposed "1-vote union" (at least
one ADA has marked the instance as an outlier), the "140-vote union" (all of
the 140 ADA versions agree that a certain instance is an outlier), the greedy
fusion proposed by [6], and the simple average of all ADAs, termed the "Mean
Ensemble". We also include the result of initializing the greedy ensemble
method using the labels themselves. This is obviously not possible in practice
and is done only to obtain an upper bound on performance for benchmarking.
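Continuing the sketch above, the mean ensemble and the rank matrix fed to x_vote_union could look as follows; whether scores are normalized before averaging is our assumption, and the greedy fusion of [6] is not sketched here:

```python
import numpy as np
from scipy.stats import rankdata

def mean_ensemble(scores):
    """Average of min-max normalized detector scores (the normalization step
    is our assumption; the text only calls this the 'Mean Ensemble')."""
    lo = scores.min(axis=1, keepdims=True)
    rng = np.ptp(scores, axis=1, keepdims=True) + 1e-12
    return ((scores - lo) / rng).mean(axis=0)

def to_ranks(scores):
    """Per-detector ranks (rank 1 = most outlying), as used by x_vote_union."""
    return np.vstack([rankdata(-s, method='ordinal') for s in scores])

# scores: the (140, n_points) matrix from build_candidates above.
# one_vote   = x_vote_union(to_ranks(scores), x=1)
# all_agree  = x_vote_union(to_ranks(scores), x=scores.shape[0])  # 140-vote union
# mean_score = mean_ensemble(scores)
```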
The performance of the various methods is measured by the ROC curve. The
receiver operating characteristic (ROC) curve graphically displays a classifier's
true-positive rate (TPR) against its false-positive rate (FPR) as the
discrimination threshold is varied. It is often used to compare the quality of
the rankings, scores, or probabilities produced by different classifiers.
The curve always starts at the bottom left (0,0) and ends at (1,1), representing
the two extremes: a threshold so high that no instances are considered positive,
and a threshold so low that all instances become positive.
The ROC of an ideal classifier reaches TPR=1 when FPR=0 (it hugs the
y-axis and the top-left corner), implying that there exists a decision threshold
at which the classes are split perfectly. In terms of rankings, this means all
instances of the positive class are ordered before all instances of the negative
class. The area under this ideal ROC curve equals the area of the entire plot
and is normalized to 1. A classifier that randomly labels instances as positive
or negative will have an ROC curve approaching the diagonal and an AUC of 0.5.
Since there may not be a pre-specified acceptable rate of false positives or a
decision threshold, the area under the curve (AUC) is often used as a crude way
to summarize and compare classifiers across all possible thresholds.
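For completeness, the ROC curve and AUC of an outlier ranking can be computed directly from the ensemble scores and the ground-truth outlier labels, e.g. with scikit-learn; the data below are a toy illustration, not the ALOI results:

```python
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

# y_true marks the labelled outliers (1) vs. inliers (0); y_score is any
# ensemble score where higher means "more outlying" (rank-based union
# scores should be negated first).
y_true  = np.array([0, 0, 1, 0, 1, 0])
y_score = np.array([0.1, 0.4, 0.9, 0.2, 0.7, 0.3])

fpr, tpr, thresholds = roc_curve(y_true, y_score)
auc = roc_auc_score(y_true, y_score)
print(f"AUC = {auc:.3f}")  # 1.0 here: both outliers are ranked before all inliers
```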