Random feature subset selection, with a subset of random size, is also implemented inside
the base classifier; it takes place for each rule and for each term expansion of
that rule. The resulting base classifier has been termed 'R-PrismTCS', where the
'R' stands for the 'Random' components of the base classifier (bagging and random
feature subset selection for each rule term).
Algorithm 1 shows the steps of R-PrismTCS with the exception of J-pruning.
F denotes the total number of features, D is the original training data and
rule set is an initially empty set of classification rules. The operation
rule.addTerm(Ax) adds the attribute-value pair Ax as a rule term to rule, and the
operation rule set.add(rule) adds rule to rule set. In step 2, for each Ax the
conditional probability p(class = i | Ax) is calculated, which is the probability
with which Ax covers the target class i.
Algorithm 1: R-PrismTCS Algorithm
D' = build random sample with replacement from D;
D'' = D';
Step 1: find class i that has the fewest instances in D'';
        rule = new empty rule for target class i;
Step 2: generate a feature subset f of size m, where (F > m > 0);
        calculate for each Ax in f p(class = i | Ax);
Step 3: select the Ax with the maximum p(class = i | Ax);
        rule.addTerm(Ax);
        delete all instances in D'' that do not cover rule;
Step 4: repeat 2 to 3 for D'' until D'' only contains instances of target class i;
Step 5: rule set.add(rule);
        create a new D'' that comprises all instances of D' except those that are
        covered by all rules induced so far;
Step 6: IF (number of instances in D'' > 1) {repeat steps 1 to 6};
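
The following is a minimal Python sketch of the rule induction described by Algorithm 1, assuming nominal attributes stored as dictionaries. Clash handling and J-pruning are omitted, and all names (induce_r_prism_tcs, D, features, m) are illustrative rather than taken from the authors' implementation. The function also returns the instances that were not drawn into the bootstrap sample, since these out-of-bag instances are used later as validation data.

import random
from collections import Counter

def induce_r_prism_tcs(D, features, m):
    # D is a list of (attribute_dict, label) pairs; features is a list of
    # feature names; m is the random feature subset size, with F > m > 0.
    rule_set = []
    # D' = bootstrap sample with replacement; undrawn instances form the
    # out-of-bag set later used to estimate the classifier's weight
    drawn = [random.randrange(len(D)) for _ in D]
    D_prime = [D[i] for i in drawn]
    oob = [D[i] for i in range(len(D)) if i not in set(drawn)]
    D_curr = list(D_prime)                              # D'' = D'
    while len(D_curr) > 1:
        # Step 1: target class i = class with the fewest instances in D''
        target = min(Counter(y for _, y in D_curr).items(),
                     key=lambda kv: kv[1])[0]
        terms = []                                      # conjunction of (feature, value) rule terms
        covered = list(D_curr)
        # Steps 2-4: add terms until the rule covers only target-class instances
        while any(y != target for _, y in covered):
            f_subset = random.sample(features, m)       # Step 2: random feature subset of size m
            best_term, best_p = None, -1.0
            for f in f_subset:
                for v in {x[f] for x, _ in covered}:    # candidate attribute-value pairs Ax
                    matching = [(x, y) for x, y in covered if x[f] == v]
                    p = sum(1 for _, y in matching if y == target) / len(matching)
                    if p > best_p:                      # Step 3: keep Ax with maximum p(class = i | Ax)
                        best_term, best_p = (f, v), p
            terms.append(best_term)
            f, v = best_term
            covered = [(x, y) for x, y in covered if x[f] == v]
        rule_set.append((terms, target))                # Step 5: add finished rule to the rule set
        # Rebuild D'' from D', dropping instances covered by any rule induced so far
        D_curr = [(x, y) for x, y in D_prime
                  if not any(all(x[f] == v for f, v in t) for t, _ in rule_set)]
    return rule_set, oob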
Figure 1 shows the conceptual architecture of Random Prism. Each R-PrismTCS
base classifier is induced on a training sample of size N from the training
data, where N is also the size of the training data. This sample is drawn using
random sampling with replacement. Statistically, the resulting samples contain
63.2 % of the original instances, some of them drawn multiple times. The
remaining 36.8 % of the instances that have not been drawn are used as
validation data to estimate the individual R-PrismTCS classifier's predictive
accuracy, ranging from 0 to 1. We call this accuracy the classifier's weight. The
individual classifiers' weights are then used to perform weighted majority voting
on unlabelled data instances. The weights can also be used to filter base
classifiers, i.e., retain the classifiers with high predictive accuracy and
eliminate those with poor accuracy according to a user-defined threshold.
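
This ensemble logic can be sketched as follows, reusing the induce_r_prism_tcs function from the previous listing. The names train_random_prism, classify, predict, weight_threshold and the number of base classifiers k are illustrative assumptions, not identifiers from the Random Prism implementation; the weighting and voting scheme follows the description above.

from collections import defaultdict

def classify(rule_set, x, default=None):
    # Apply the ordered rules; the first rule whose terms all match x fires.
    for terms, target in rule_set:
        if all(x[f] == v for f, v in terms):
            return target
    return default

def train_random_prism(D, features, m, k=100, weight_threshold=0.0):
    # Induce k R-PrismTCS base classifiers, each weighted by its accuracy
    # on its own out-of-bag instances (the ~36.8 % not drawn into its sample).
    ensemble = []
    for _ in range(k):
        rule_set, oob = induce_r_prism_tcs(D, features, m)
        correct = sum(1 for x, y in oob if classify(rule_set, x) == y)
        weight = correct / len(oob) if oob else 0.0
        if weight >= weight_threshold:          # optional filtering of weak base classifiers
            ensemble.append((rule_set, weight))
    return ensemble

def predict(ensemble, x):
    # Weighted majority vote of the retained base classifiers.
    votes = defaultdict(float)
    for rule_set, weight in ensemble:
        label = classify(rule_set, x)
        if label is not None:
            votes[label] += weight
    return max(votes, key=votes.get) if votes else None

Setting weight_threshold above 0 implements the filtering step mentioned above: base classifiers whose out-of-bag accuracy falls below the threshold are simply excluded from the vote.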
Random Prism's predictive accuracy has been evaluated empirically on
several datasets of the UCI repository [3, 20]; and it has been found that Random
 