[Figure: ROC curve l1, ROC curve l2, and their averaged ROC curve, with labeled points X1 to X3, Y1 to Y2, and Z1 to Z2; legend: point on ROC curve 1, point on ROC curve 2, point on averaged ROC curve; false-positive rates FP_rate1 and FP_rate2 are marked.]
Figure 7.5 Vertical averaging approach for multiple ROC curves.
it can be averaged with Y2 to get the corresponding Z2 point on the averaged
ROC curve.
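The vertical averaging step can be illustrated with a minimal sketch. It assumes each ROC curve is given as arrays of false-positive and true-positive rates with strictly increasing false-positive rates, and that a curve without a measured point at a chosen false-positive rate is filled in by linear interpolation. The function name `vertical_average_roc` and the sample curves are hypothetical and are not taken from the text.

```python
import numpy as np


def vertical_average_roc(curves, fp_grid):
    """Average several ROC curves at fixed false-positive rates.

    curves  : list of (fpr, tpr) pairs; each fpr is assumed strictly increasing.
    fp_grid : false-positive rates at which the curves are averaged.
    Returns (fp_grid, mean_tpr) as NumPy arrays.
    """
    fp_grid = np.asarray(fp_grid, dtype=float)
    tp_at_grid = []
    for fpr, tpr in curves:
        # Linearly interpolate this curve's TP rate at every grid FP rate.
        tp_at_grid.append(np.interp(fp_grid, np.asarray(fpr, dtype=float),
                                    np.asarray(tpr, dtype=float)))
    # The averaged curve takes the mean TP rate at each FP rate.
    return fp_grid, np.mean(tp_at_grid, axis=0)


# Two illustrative curves (made-up values) averaged at two FP rates.
curve_1 = ([0.0, 0.2, 1.0], [0.0, 0.6, 1.0])
curve_2 = ([0.0, 0.5, 1.0], [0.0, 0.5, 1.0])
fp_rates, mean_tpr = vertical_average_roc([curve_1, curve_2], [0.2, 0.5])
print(fp_rates, mean_tpr)
```

At the false-positive rate 0.5, the first sample curve has no measured point, so its value is interpolated before it is averaged with the measured point on the second curve.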
7.4.2 Results
7.4.2.1 SEA Dataset The SEA dataset [15] is a popular artificial benchmark for
assessing the performance of stream data-mining algorithms. It has three features
with random values in [0, 10], and the class label is determined by whether the sum of
the first two features surpasses a defined threshold. The third feature is irrelevant
and acts as noise to test the robustness of the algorithm under
simulation. Concept drift is introduced by changing the threshold periodically, so
that the algorithm under simulation is confronted with an abrupt change
in class concepts after it has seen a stable concept for several data chunks.
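A stream of this kind can be generated with a short sketch. The following is an illustration under stated assumptions, not the original generator: the function name `make_sea_stream`, the `block_size` and `seed` parameters, and the convention that the positive class corresponds to the sum exceeding the threshold are all hypothetical choices made for the example.

```python
import numpy as np


def make_sea_stream(thresholds, block_size, seed=0):
    """Generate an SEA-style stream with one abrupt drift per block boundary.

    thresholds : one threshold per block; the concept is fixed inside a block.
    block_size : number of examples per block (hypothetical parameter).
    Returns (X, y), where X has three features in [0, 10].
    """
    rng = np.random.default_rng(seed)
    X_blocks, y_blocks = [], []
    for theta in thresholds:
        # Three features drawn uniformly from [0, 10]; the third is irrelevant.
        X = rng.uniform(0.0, 10.0, size=(block_size, 3))
        # Assumed convention: label 1 when feature 1 + feature 2 exceeds theta.
        y = (X[:, 0] + X[:, 1] > theta).astype(int)
        X_blocks.append(X)
        y_blocks.append(y)
    return np.vstack(X_blocks), np.concatenate(y_blocks)


# Example call with four blocks; the block size of 1000 is arbitrary.
X, y = make_sea_stream(thresholds=[8, 9, 7, 9.5], block_size=1000)
print(X.shape, y.shape)
```

Because the threshold changes only between blocks, a learner sees a stable concept within each block and an abrupt change at every block boundary.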
Following the original design of the SEA dataset, the whole data stream is
sliced into four blocks. Inside each block, the threshold value is fixed,
that is, the class concepts are unchanged. At the end
of each block, the threshold value is changed and then retained until the end of the
next block. The threshold values of the four blocks are set to 8, 9, 7, and 9.5,
respectively, which again adopt the configuration of Street and Kim [15]. Each