Table 2.7. Fitness functions considering different factors (Con: conciseness).

Factors          Examples                                  References
(DR FPR Con)
× ×              H(C_i) / H_max(C_i)                       Refs. 83, 78, 75, 74
√ √ ×            α/A − β/B                                 Refs. 60, 81, 71, 70, 30, 29, 59, 77
                 w_1 × support + w_2 × confidence          Refs. 43, 31, 50, 42, 51
                 1 − |ϕ_p − ϕ|                             Refs. 89, 49, 63, 28, 65
√ √ √            w_1 × sensitivity + w_2 × specificity     Ref. 27
                 + w_3 × length
                 (1 + Az) × e^(−w)                         Refs. 80, 32, 33

which cover most of the normal data. In this example, H(C_i) represents
the entropy of the data points that belong to cluster C_i, and H_max(C_i) is
the theoretical maximum entropy for cluster C_i.
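As a rough illustration of the entropy terms just described, the following Python sketch computes the ratio of H(C_i) to H_max(C_i) for one cluster. The passage does not say what the entropy is taken over, so the sketch assumes it is the Shannon entropy of the class labels (for example "normal" and "attack") inside the cluster, and that the theoretical maximum is log2 of the number of possible classes. The function names and the num_classes parameter are hypothetical, chosen only for this example.

```python
import math
from collections import Counter


def cluster_entropy(labels):
    """Shannon entropy (in bits) of the label distribution in one cluster.

    Assumption: entropy is measured over class labels; the source text
    does not specify this.
    """
    if not labels:
        raise ValueError("cluster must contain at least one data point")
    total = len(labels)
    counts = Counter(labels)
    # Sum of p * log2(1/p) over the labels that occur in the cluster.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def entropy_ratio(labels, num_classes):
    """Ratio H(C_i) / H_max(C_i), with H_max assumed to be log2(num_classes)."""
    if num_classes < 2:
        raise ValueError("need at least two classes for a non-zero maximum")
    return cluster_entropy(labels) / math.log2(num_classes)


# A pure cluster has zero entropy; an evenly mixed two-class cluster
# reaches the assumed maximum.
pure = ["normal", "normal", "normal", "normal"]
mixed = ["normal", "attack", "normal", "attack"]
print(entropy_ratio(pure, 2), entropy_ratio(mixed, 2))
```

Under these assumptions the ratio lies between 0 (a cluster holding a single class) and 1 (a cluster in which all classes are equally represented).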
Accuracy requires both DR and FPR, since ignoring either of them will
cause misclassification errors. A good IDS should have a high DR and a
low FPR. The first example in the second row directly expresses this
principle: α stands for the number of correctly detected attacks, A the
total number of attacks, β the number of false positives, and B the total
number of normal connections. Patterns are sometimes represented as
if-then rules, so in the second example the support-confidence framework
is borrowed from association rules to determine the fitness of a rule. By
changing the weights w_1 and w_2, the fitness measure can be used either
for simply identifying network intrusions or for precisely classifying
the type of intrusion (Ref. 31). The third example considers the absolute
difference between the prediction of EC (ϕ_p) and the actual outcome (ϕ).
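The three second-row fitness measures can be sketched in a few lines of Python. This is a minimal illustration rather than any cited author's implementation: it reads the first example as the detection rate minus the false positive rate, α/A − β/B, which is an assumption consistent with the symbol definitions above, and all function and parameter names are hypothetical.

```python
def dr_fpr_fitness(alpha, total_attacks, beta, total_normal):
    """alpha/A - beta/B: detection rate minus false positive rate.

    alpha: correctly detected attacks, total_attacks: A,
    beta: false positives, total_normal: B.
    """
    return alpha / total_attacks - beta / total_normal


def support_confidence_fitness(support, confidence, w1, w2):
    """Weighted sum w1 * support + w2 * confidence for an if-then rule."""
    return w1 * support + w2 * confidence


def prediction_fitness(phi_p, phi):
    """1 - |phi_p - phi|: one minus the absolute prediction error."""
    return 1 - abs(phi_p - phi)


# Illustrative numbers only (not taken from any cited study).
print(dr_fpr_fitness(alpha=90, total_attacks=100, beta=5, total_normal=1000))
print(support_confidence_fitness(support=0.2, confidence=0.8, w1=0.5, w2=0.5))
print(prediction_fitness(phi_p=0.75, phi=1.0))
```

In the weighted sum, shifting weight between w1 and w2 changes whether frequently matching rules or highly reliable rules are favoured, which is the tuning knob the paragraph above refers to.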
The third row considers another interesting property: conciseness. There
are two reasons for this: concise results are easier to understand, and
they help avoid misclassification errors. The second reason is less
obvious. Conciseness can be restated as the space a model, such as a rule
or a cluster, uses to cover a data set. If rule A and rule B have the same
data coverage but A is more concise than B, then A uses less space than B
does when covering the same data set. The extra space of B is therefore
more prone to cause misclassification errors. The first example of this
kind considers all three terms, where length correlates with conciseness.
The second example of this type considers the number of counterexamples
covered by a rule (w), and the ratio between the number of bits equal to
one in the chromosome and the length of the chromosome (z), which is the
conciseness of