Dealing with Noisy Data - Data Preprocessing in Data Mining


5.4.2 Addressing Multi-class Classification Problems by Decomposition

Usually, the more classes a problem has, the more complex it is. In multi-class learning, the generated classifier must be able to separate the data into more than a pair of classes, which increases the chances of incorrect classifications (in a balanced two-class problem, the probability of a correct random classification is 1/2, whereas in a balanced problem with M classes it is 1/M). Furthermore, in problems affected by noise, the boundaries, the separability of the classes and, therefore, the prediction capabilities of the classifiers may be severely hindered.

When dealing with multi-class problems, several works [ 6 , 50 ] have demonstrated that decomposing the original problem into several binary subproblems is an easy yet accurate way to reduce their complexity. These techniques are referred to as binary decomposition strategies [ 55 ]. The most studied schemes in the literature are One-vs-One (OVO) [ 50 ], which trains a classifier to distinguish between each pair of classes, and One-vs-All (OVA) [ 6 ], which trains a classifier to distinguish each class from all other classes. Both strategies can be encoded within the Error Correcting Output Codes framework [ 5 , 17 ]. However, none of these works provides theoretical or empirical results supporting the common assumption that decomposition techniques behave better against noise than not using decomposition. Neither do they show what type of noise is better handled by decomposition techniques.
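To make the two schemes concrete, the following minimal sketch enumerates the binary subproblems each one produces for a set of class labels. The function names and the toy labels are illustrative assumptions, not part of the cited works.

```python
from itertools import combinations


def ovo_subproblems(labels):
    """Return one (class_a, class_b) pair per OVO binary subproblem."""
    classes = sorted(set(labels))
    return list(combinations(classes, 2))


def ova_subproblems(labels):
    """Return one (positive_class, other_classes) pair per OVA subproblem."""
    classes = sorted(set(labels))
    return [(c, [o for o in classes if o != c]) for c in classes]


def ovo_training_subset(X, y, class_a, class_b):
    """Keep only the examples that belong to the two classes of an OVO pair."""
    return [(x, label) for x, label in zip(X, y) if label in (class_a, class_b)]


def ova_relabel(y, positive_class):
    """Relabel every example as 1 (positive class) or 0 (any other class)."""
    return [1 if label == positive_class else 0 for label in y]


# Hypothetical toy labels with M = 4 classes.
labels = ["a", "b", "c", "d", "a", "c"]
print(ovo_subproblems(labels))       # M * (M - 1) / 2 pairs
print(len(ova_subproblems(labels)))  # M subproblems
```

With M classes, OVO yields M(M-1)/2 subproblems, each trained only on the examples of its two classes, whereas OVA yields M subproblems, each trained on the whole relabeled data set.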

Consequently, we can consider the usage of the OVO strategy, which generally outperforms OVA [ 21 , 37 , 76 , 83 ], and check its suitability with noisy training data. It should be mentioned that, in real situations, the existence of noise in the data sets is usually unknown; therefore, neither the type nor the quantity of noise in the data set can be known or assumed a priori. Hence, tools which are able to manage the presence of noise in the data sets, regardless of its type or quantity (or its absence), are of great interest. If the OVO strategy (which is a simple yet effective methodology when clean data sets are considered) is also able to handle the noise properly (better than the baseline non-OVO version), its usage could be recommended in spite of the presence of noise and without taking into account its type. Furthermore, this strategy can be used with any of the existing classifiers which are able to deal with two-class problems. Therefore, the problems of algorithm level modifications and preprocessing techniques could be avoided; and, if desired, they could also be combined.
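The point that OVO can wrap any two-class learner can be sketched as follows. This is an illustration under stated assumptions: `NearestCentroid2` is a hypothetical toy learner standing in for any binary classifier, and the pairwise outputs are combined by a simple majority vote, which is one common aggregation choice and is not prescribed by the text above.

```python
from collections import Counter
from itertools import combinations


class NearestCentroid2:
    """Toy two-class learner (hypothetical stand-in for any binary classifier)."""

    def fit(self, X, y):
        self.centroids = {}
        for label in sorted(set(y)):
            rows = [x for x, l in zip(X, y) if l == label]
            # Per-feature mean of the examples of this class.
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict_one(self, x):
        def sq_dist(centroid):
            return sum((a - b) ** 2 for a, b in zip(x, centroid))

        return min(self.centroids, key=lambda label: sq_dist(self.centroids[label]))


class OneVsOne:
    """OVO wrapper: one binary model per pair of classes, majority-vote output."""

    def __init__(self, make_binary_learner):
        self.make_binary_learner = make_binary_learner
        self.models = []

    def fit(self, X, y):
        self.models = []
        for a, b in combinations(sorted(set(y)), 2):
            # Train each pairwise model only on the examples of its two classes.
            pairs = [(x, l) for x, l in zip(X, y) if l in (a, b)]
            Xs = [x for x, _ in pairs]
            ys = [l for _, l in pairs]
            self.models.append(self.make_binary_learner().fit(Xs, ys))
        return self

    def predict_one(self, x):
        # Each pairwise model casts one vote (assumed aggregation rule).
        votes = Counter(m.predict_one(x) for m in self.models)
        return votes.most_common(1)[0][0]


# Hypothetical, well-separated toy data with three classes.
X = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0), (10, 1)]
y = ["a", "a", "b", "b", "c", "c"]
model = OneVsOne(NearestCentroid2).fit(X, y)
print(model.predict_one((0.2, 0.3)))
print(model.predict_one((9, 1)))
```

Because the wrapper only calls `fit` and `predict_one` on the base learner, any two-class classifier exposing those two methods could be swapped in without modifying it.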

5.4.2.1 Decomposition Strategies for Multi-class Problems

Several motivations for the usage of binary decomposition strategies in multi-class

classification problems can be found in the literature [ 20 , 21 , 37 , 76 ]:

• The separation of the classes becomes easier (less complex), since fewer classes are considered in each subproblem [ 20 , 61 ]. For example, in [ 51 ], the classes in a
