the activation threshold and the number of active models are initialized (Steps 1-5).
Then, every time a new model is inserted in the ensemble, the procedure at Step 6
computes how many models would be activated with the current threshold value. If the
number of potentially activatable models is higher than the previous one (Step 7), the
threshold is increased. This situation occurs when the data distribution remains stable
and the newly inserted model is enabled immediately (Steps 7-8); by increasing the
threshold value, we obtain a better exploitation of the ensemble. On the contrary, if the
number of active models has decreased since the previous invocation, the threshold has
to be decreased: keeping the current value would be useless and dangerous, since a data
change might be in progress (Steps 9-10). It is worth observing that, if the number of
active models does not change between two invocations, the threshold does not change
either, since there is no evidence of model improvement or of data change.
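
The sketch below restates Steps 6-10 in Python. The fixed adjustment step delta, the
per-model weight attribute, and the function name are our assumptions, since the text
does not specify them:

def update_threshold(ensemble, theta, delta, prev_active):
    """Adjust the activation threshold theta after a new model is
    inserted (Steps 6-10); delta and the per-model 'weight' score
    are hypothetical, as the text does not specify them."""
    # Step 6: count how many models the current theta would activate.
    active = sum(1 for m in ensemble if m.weight >= theta)
    if active > prev_active:
        # Steps 7-8: the distribution is stable and the new model was
        # enabled immediately, so raise theta to exploit the ensemble.
        theta += delta
    elif active < prev_active:
        # Steps 9-10: fewer active models than before, a data change
        # may be in progress, so lower theta instead of keeping it.
        theta -= delta
    # Equal counts: no evidence of improvement or drift, keep theta.
    return theta, active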
From a computational point of view, the algorithm introduces no appreciable overhead.
Only the getActiveModel() procedure requires access to the ensemble structure. If n is
the number of classifiers storable in the ensemble, the complexity of the algorithm is
linear, i.e., O(n).
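
The getActiveModel() procedure itself can be pictured as the single linear scan below,
which is where the O(n) bound comes from (again a sketch; 'weight' is our stand-in for
whatever per-model quality score the ensemble maintains):

def get_active_models(ensemble, theta):
    # A single pass over the n stored classifiers, hence O(n);
    # 'weight' is again the hypothetical per-model quality score.
    return [m for m in ensemble if m.weight >= theta]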
The experimental section demonstrates that our system is no longer heavily influenced
by the value of θ, since θ changes automatically, adapting to the data distribution.
4 Comparative Experimental Evaluation

4.1 Data Sets
Several synthetic data sets and one real data set were used in our experiments. Synthetic
data enables an exhaustive investigation of the reliability of the systems under different
scenarios: the data behavior can be described exactly, characterizing the number of
concept drifts, the rate at which one concept changes into another, the number of
irrelevant attributes, and the percentage of noisy data.
LED24: Proposed by Breiman et al. in [6], this generator creates data for a display with
7 LEDs. In addition to the 7 necessary attributes, 17 irrelevant boolean attributes
with random values are added, and 10 percent noise is introduced to make the
problem harder. This generator produces only stable data sets.
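
A minimal generator sketch for this data set, assuming the standard seven-segment
encoding of the digits 0-9 (the segment ordering below is our choice; Breiman et al.
do not fix one):

import random

# Standard seven-segment patterns for the digits 0-9 (one flag per LED).
SEGMENTS = [
    [1, 1, 1, 0, 1, 1, 1], [0, 0, 1, 0, 0, 1, 0], [1, 0, 1, 1, 1, 0, 1],
    [1, 0, 1, 1, 0, 1, 1], [0, 1, 1, 1, 0, 1, 0], [1, 1, 0, 1, 0, 1, 1],
    [1, 1, 0, 1, 1, 1, 1], [1, 0, 1, 0, 0, 1, 0], [1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 0, 1, 1],
]

def led24_instance(noise=0.10, irrelevant=17):
    """One LED24 example: 7 noisy segment attributes, 17 random
    irrelevant booleans, and the displayed digit as class label."""
    digit = random.randrange(10)
    # Flip each relevant attribute with probability 'noise' (10%).
    attrs = [s ^ (random.random() < noise) for s in SEGMENTS[digit]]
    # The irrelevant attributes take purely random values.
    attrs += [random.randrange(2) for _ in range(irrelevant)]
    return attrs, digit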
Stagger: Introduced by Schlimmer and Granger in [28], this problem consists of three
attributes, namely colour ∈ {green, blue, red}, shape ∈ {triangle, circle,
rectangle}, and size ∈ {small, medium, large}, and a class y ∈ {0, 1}. In its
original formulation, the training set includes 120 instances and consists of three
target concepts occurring every 40 instances. The first set of data is labeled
according to the concept color = red ∧ size = small, while the others follow
color = green ∨ shape = circle and size = medium ∨ size = large, respectively.
For each training instance, a test set of 100 elements is randomly generated
according to the current concept.
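
A possible rendering of this generator, with the three concepts hard-coded in the order
given above (function and variable names are ours):

import random

COLOURS = ["green", "blue", "red"]
SHAPES = ["triangle", "circle", "rectangle"]
SIZES = ["small", "medium", "large"]

# The three target concepts, one per block of 40 training instances.
CONCEPTS = [
    lambda c, s, z: c == "red" and z == "small",
    lambda c, s, z: c == "green" or s == "circle",
    lambda c, s, z: z == "medium" or z == "large",
]

def stagger_instance(t):
    """The t-th training example (t in 0..119); the active concept
    switches every 40 instances."""
    colour, shape, size = (random.choice(COLOURS),
                           random.choice(SHAPES), random.choice(SIZES))
    y = int(CONCEPTS[t // 40](colour, shape, size))
    return (colour, shape, size), y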
cHyper: Introduced in [7], a data set is generated by using an n-dimensional unit
hypercube, and an example x is an n-dimensional vector whose components
x_i ∈ [0, 1]. The class boundary is a hyper-sphere of radius r and center c.
Concept drifting is simulated by changing the position of c by a value Δ in a
random direction.
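
A hedged sketch of such a drifting-hypersphere stream; the drift period, the initial
centre position, and the use of a Gaussian random direction are our assumptions:

import math
import random

def chyper_stream(n, r, delta, instances, drift_every):
    """Yield labelled points from the n-dimensional unit hypercube;
    the boundary is a hypersphere of radius r around a moving centre."""
    c = [0.5] * n  # assumed starting centre: the middle of the cube
    for t in range(instances):
        if t > 0 and t % drift_every == 0:
            # Drift: move c by delta along a random unit direction.
            d = [random.gauss(0.0, 1.0) for _ in range(n)]
            norm = math.sqrt(sum(v * v for v in d))
            c = [ci + delta * di / norm for ci, di in zip(c, d)]
        x = [random.random() for _ in range(n)]
        dist = math.sqrt(sum((xi - ci) ** 2 for xi, ci in zip(x, c)))
        yield x, int(dist <= r)  # class: inside vs. outside the sphere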