Table 2.1. Confusion matrix.

                             Prediction Result
                             Not C                  C
Actual Result   Not C        True Negative (TN)     False Positive (FP)
                C            False Negative (FN)    True Positive (TP)
but is low between different groups; association rules reveal correlations
between different attributes in a data set; sequential rules summarize
frequent sequences or episodes in data.
Once an appropriate DM algorithm is adopted, one needs to divide the
data set being mined into two subsets, the training set and the test set.
A data mining algorithm is trained on the training set to construct
patterns; these patterns are then verified on the test set. The verified
results are summarized in a confusion matrix such as the one shown in
Table 2.1. C in the table denotes the value of a predicted attribute. Based on
the confusion matrix, the following measures are used to quantify the
performance of a data mining algorithm:
True Negative Rate (TNR): TN / (TN + FP), also known as Specificity.

True Positive Rate (TPR): TP / (TP + FN), also known as Detection Rate (DR) or Sensitivity.

False Positive Rate (FPR): FP / (TN + FP) = 1 - Specificity, also known as False Alarm Rate (FAR).

False Negative Rate (FNR): FN / (TP + FN) = 1 - Sensitivity.

Accuracy: (TN + TP) / (TN + TP + FN + FP).
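The five measures above can be sketched directly from the four confusion-matrix counts. The following is a minimal illustration; the count values passed in are made up for the example, not taken from the text.

```python
def confusion_metrics(tn, fp, fn, tp):
    """Compute the Table 2.1 performance measures from raw counts."""
    return {
        "TNR": tn / (tn + fp),                    # Specificity
        "TPR": tp / (tp + fn),                    # Detection Rate / Sensitivity
        "FPR": fp / (tn + fp),                    # = 1 - Specificity (False Alarm Rate)
        "FNR": fn / (tp + fn),                    # = 1 - Sensitivity
        "Accuracy": (tn + tp) / (tn + tp + fn + fp),
    }

# Hypothetical test-set counts: 200 records, 100 of each actual class.
metrics = confusion_metrics(tn=90, fp=10, fn=5, tp=95)
print(metrics["TPR"])       # 95 / 100 = 0.95
print(metrics["Accuracy"])  # 185 / 200 = 0.925
```

Note that FPR and TNR always sum to 1, as do FNR and TPR, so reporting one of each pair suffices.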
2.2.2. Evolutionary computation
Evolutionary computation, inspired by natural selection and variation of
Darwinian principles, is often viewed as an optimization process, as it favors
the best solutions among a set of randomly varied ones. Nevertheless, EC is
also useful for acquiring knowledge. The learning problem can be formulated
as a search problem by considering it as a search for a good model inside
the space of models. Such a space might consist of if-then rule sets or points
representing cluster centers. Compared with traditional search techniques,
evolutionary algorithms (EAs) are more efficient in that they involve search