the activation threshold and the number of active models are initialized (Steps 1-5).
Then, every time a new model is inserted in the ensemble, the procedure at Step 6
computes how many models would be activated with the current threshold value. If the
number of potentially activatable models is higher than the previous one (Step 7), the
threshold is increased. This situation occurs when the data distribution remains stable
and the newly inserted model is enabled immediately (Steps 7-8); by increasing the
threshold value, we obtain a better exploitation of the ensemble. On the contrary, if the
number of active models has decreased since the previous invocation, the threshold has
to be decreased: keeping the current value would be useless and dangerous, since a data
change might be in progress (Steps 9-10). It is worth observing that, if the number of
active models does not change between two invocations, the threshold does not change
either, since there is no evidence of model improvement or of data change.
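
The sketch below restates Steps 6-10 in Python. The fixed adjustment step delta, the
per-model weight attribute, and the function name are our assumptions, since the text
does not specify them:

def update_threshold(ensemble, theta, delta, prev_active):
    """Adjust the activation threshold theta after a new model is
    inserted (Steps 6-10); delta and the per-model 'weight' score
    are hypothetical, as the text does not specify them."""
    # Step 6: count how many models the current theta would activate.
    active = sum(1 for m in ensemble if m.weight >= theta)
    if active > prev_active:
        # Steps 7-8: the distribution is stable and the new model was
        # enabled immediately, so raise theta to exploit the ensemble.
        theta += delta
    elif active < prev_active:
        # Steps 9-10: fewer active models than before, a data change
        # may be in progress, so lower theta instead of keeping it.
        theta -= delta
    # Equal counts: no evidence of improvement or drift, keep theta.
    return theta, active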
From a computational point of view, the algorithm introduces no appreciable overhead.
Only the getActiveModel() procedure requires access to the ensemble structure. If n is
the number of classifiers storable in the ensemble, the complexity of the algorithm is
linear, i.e., O(n).
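
The getActiveModel() procedure itself can be pictured as the single linear scan below,
which is where the O(n) bound comes from (again a sketch; 'weight' is our stand-in for
whatever per-model quality score the ensemble maintains):

def get_active_models(ensemble, theta):
    # A single pass over the n stored classifiers, hence O(n);
    # 'weight' is again the hypothetical per-model quality score.
    return [m for m in ensemble if m.weight >= theta]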
The experimental section demonstrates that our system is no longer heavily influenced
by the value of θ, since θ changes automatically, adapting to the data distribution.
4 Comparative Experimental Evaluation

4.1 Data Sets
Several synthetic data sets and one real data set were used in our experiments. Synthetic
data enables an exhaustive investigation of the reliability of the systems under different
scenarios: the data behavior can be described exactly, characterizing the number of
concept drifts, the rate at which one concept changes into another, the number of
irrelevant attributes, and the percentage of noisy data.
LED24: Proposed by Breiman et al. in [6], this generator creates data for a display with
7 LEDs. In addition to the 7 necessary attributes, 17 irrelevant boolean attributes
with random values are added, and 10 percent noise is introduced to make the
problem harder. This generator produces only stable data sets.
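
A minimal generator sketch for this data set, assuming the standard seven-segment
encoding of the digits 0-9 (the segment ordering below is our choice; Breiman et al.
do not fix one):

import random

# Standard seven-segment patterns for the digits 0-9 (one flag per LED).
SEGMENTS = [
    [1, 1, 1, 0, 1, 1, 1], [0, 0, 1, 0, 0, 1, 0], [1, 0, 1, 1, 1, 0, 1],
    [1, 0, 1, 1, 0, 1, 1], [0, 1, 1, 1, 0, 1, 0], [1, 1, 0, 1, 0, 1, 1],
    [1, 1, 0, 1, 1, 1, 1], [1, 0, 1, 0, 0, 1, 0], [1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 0, 1, 1],
]

def led24_instance(noise=0.10, irrelevant=17):
    """One LED24 example: 7 noisy segment attributes, 17 random
    irrelevant booleans, and the displayed digit as class label."""
    digit = random.randrange(10)
    # Flip each relevant attribute with probability 'noise' (10%).
    attrs = [s ^ (random.random() < noise) for s in SEGMENTS[digit]]
    # The irrelevant attributes take purely random values.
    attrs += [random.randrange(2) for _ in range(irrelevant)]
    return attrs, digit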
Stagger: Introduced by Schlimmer and Granger in [28], this problem consists of three
attributes, namely colour ∈ {green, blue, red}, shape ∈ {triangle, circle,
rectangle}, and size ∈ {small, medium, large}, and a class y ∈ {0, 1}. In its
original formulation, the training set includes 120 instances and consists of three
target concepts occurring every 40 instances. The first set of data is labeled
according to the concept color = red ∧ size = small, while the others follow
color = green ∨ shape = circle and size = medium ∨ size = large, respectively.
For each training instance, a test set of 100 elements is randomly generated
according to the current concept.
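
A possible rendering of this generator, with the three concepts hard-coded in the order
given above (function and variable names are ours):

import random

COLOURS = ["green", "blue", "red"]
SHAPES = ["triangle", "circle", "rectangle"]
SIZES = ["small", "medium", "large"]

# The three target concepts, one per block of 40 training instances.
CONCEPTS = [
    lambda c, s, z: c == "red" and z == "small",
    lambda c, s, z: c == "green" or s == "circle",
    lambda c, s, z: z == "medium" or z == "large",
]

def stagger_instance(t):
    """The t-th training example (t in 0..119); the active concept
    switches every 40 instances."""
    colour, shape, size = (random.choice(COLOURS),
                           random.choice(SHAPES), random.choice(SIZES))
    y = int(CONCEPTS[t // 40](colour, shape, size))
    return (colour, shape, size), y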
cHyper: Introduced in [7], a data set is generated by using an n-dimensional unit
hypercube, and an example x is an n-dimensional vector whose components
x_i ∈ [0, 1]. The class boundary is a hyper-sphere of radius r and center c.
Concept drifting is simulated by changing the position of c by a value Δ in a
random direction.
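
A hedged sketch of such a drifting-hypersphere stream; the drift period, the initial
centre position, and the use of a Gaussian random direction are our assumptions:

import math
import random

def chyper_stream(n, r, delta, instances, drift_every):
    """Yield labelled points from the n-dimensional unit hypercube;
    the boundary is a hypersphere of radius r around a moving centre."""
    c = [0.5] * n  # assumed starting centre: the middle of the cube
    for t in range(instances):
        if t > 0 and t % drift_every == 0:
            # Drift: move c by delta along a random unit direction.
            d = [random.gauss(0.0, 1.0) for _ in range(n)]
            norm = math.sqrt(sum(v * v for v in d))
            c = [ci + delta * di / norm for ci, di in zip(c, d)]
        x = [random.random() for _ in range(n)]
        dist = math.sqrt(sum((xi - ci) ** 2 for xi, ci in zip(x, c)))
        yield x, int(dist <= r)  # class: inside vs. outside the sphere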