i.e. finding the classes most associated with each given sample. The second is multi-instance classification: identifying the (annotated) samples most similar to the examined sample in terms of their feature values, and extracting from the labels associated with them the relevant set of labels for the examined sample. The third is to use arbitrary patches or automatically and semi-automatically extracted segments [69, 74, 79] of the samples, find the labels relevant to each of them according to their features, and then combine these findings to describe the entire sample.
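To make the second approach concrete, it can be read as a nearest-neighbour label transfer: the label sets of the most similar annotated samples are pooled, and the labels supported by enough of those neighbours are kept. The following Python sketch is only illustrative; the Euclidean distance, the number of neighbours k and the vote threshold are assumptions rather than choices prescribed by the methods surveyed here.

    import numpy as np

    def knn_label_transfer(x, train_features, train_labels, k=5, vote_threshold=0.4):
        """Label a sample with the labels voted for by its k nearest annotated samples.

        train_features: (n, d) array of feature vectors of annotated samples
        train_labels:   list of n label sets, aligned with train_features
        """
        # Distance from the examined sample to every annotated sample
        dists = np.linalg.norm(train_features - x, axis=1)
        nearest = np.argsort(dists)[:k]

        # Count how often each label appears among the nearest neighbours
        votes = {}
        for i in nearest:
            for label in train_labels[i]:
                votes[label] = votes.get(label, 0) + 1

        # Keep labels supported by at least a fraction vote_threshold of the neighbours
        return {label for label, v in votes.items() if v / k >= vote_threshold}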
Instead of area patches (images) or temporal segments (music, video) as in the third approach, an intermediate level of abstraction can be used, in which domain parameters such as rhythm in music or shape in images are defined and classes are characterized according to them. Another approach to multi-label classification is to trade accuracy for efficiency, using various methods to reduce the number of classes and labels by removing redundant labels which rarely appear in the training set [29]. A large number of labels means both longer processing time and a more complicated taxonomy for human users to deal with.
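A rough illustration of this label-reduction idea is to drop rare labels from the training set before any classifier is trained; the minimum-frequency cut-off below is an arbitrary assumption, not a value taken from [29].

    from collections import Counter

    def prune_rare_labels(label_sets, min_count=5):
        """Remove labels occurring fewer than min_count times in the training set.

        label_sets: list of sets, one set of labels per training sample
        """
        counts = Counter(label for labels in label_sets for label in labels)
        kept = {label for label, c in counts.items() if c >= min_count}
        # Return the pruned label sets and the reduced label vocabulary
        return [labels & kept for labels in label_sets], kept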
The next paragraphs review a few of the methods published in recent years for various knowledge domains, modalities and applications. Applications of annotation and retrieval and of gene function recognition have many similarities, because a set of gene functions or a "bag of words" can be given a binary representation: each element either appears or does not. Dekel et al. [13] present four variations of a boosting-based algorithm that ranks the goodness of labels. Montejo-Raez et al. [44] suggest an Adaptive Selection of Base Classifiers approach for multi-label classification of text documents based on independent binary classifiers, which makes it possible to choose the best of a given set of binary classifiers. They assign a higher weight to a class when only a few documents are available for it; this factor over-weights positive samples in comparison to negative samples of the class. To keep the classification process fast, they add a threshold for the minimum performance allowed for a weak binary classifier; below this value both the classifier and the class are discarded, hence discarding rare classes. For the base algorithms they use either the Rocchio or the PLAUM algorithm [45].
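A minimal sketch of this kind of scheme is given below: one binary classifier per label, positive samples up-weighted for sparse classes, and labels whose classifier falls below a minimum validation score discarded together with the classifier. Logistic regression and the F1 cut-off are stand-ins chosen for brevity; they are not the Rocchio or PLAUM base learners, weighting factor or threshold actually used in [44].

    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import f1_score

    def train_per_label_classifiers(X_train, Y_train, X_val, Y_val, min_f1=0.3):
        """Train one binary classifier per label and keep only the reliable ones.

        Y_train, Y_val: dicts mapping each label to a binary indicator vector.
        """
        classifiers = {}
        for label, y in Y_train.items():
            if y.sum() == 0:
                continue  # no positive documents at all for this label
            # class_weight="balanced" over-weights positives when they are few
            clf = LogisticRegression(class_weight="balanced", max_iter=1000)
            clf.fit(X_train, y)
            score = f1_score(Y_val[label], clf.predict(X_val))
            if score >= min_f1:  # discard weak classifiers, and with them rare labels
                classifiers[label] = clf
        return classifiers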
Rak et al. [49] present an associative classification method, which they evaluate on medical documents.
Thabatah et al. [63] present MMAC, a method based on associative classification. It assumes that for each instance that occurs frequently in the training data and passes certain frequency and confidence thresholds, there is a rule associated with each of the class labels; hence, each instance is associated with a ranked list of labels. The algorithm makes a single pass over the training data in order to generate the rules, followed by a recursive learning process that generates more rules and sets the relations between them. Three evaluation methods are also presented: top-label, which examines only the label best associated with the data, as in the multi-class case; any-label, which counts how many times all the expected labels were recognized for all the instances in the test set; and label-weight, in which each label in the classification results is assigned a rank according to the number of times it is associated with the same instance in the training set.
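A much-simplified sketch of the associative-classification idea described above follows. It mines only single-item rules (feature value -> label) that pass support and confidence thresholds, and then ranks the labels whose rules fire on a test instance; this is a toy reading of the approach, not the actual MMAC algorithm, and the threshold values are arbitrary.

    from collections import defaultdict

    def mine_rules(instances, labels, min_support=2, min_confidence=0.5):
        """Mine (feature item -> label) rules passing support and confidence thresholds.

        instances: list of sets of feature items; labels: list of label sets.
        """
        pair_counts = defaultdict(int)   # (item, label) -> co-occurrence count
        item_counts = defaultdict(int)   # item -> occurrence count
        for items, labs in zip(instances, labels):
            for item in items:
                item_counts[item] += 1
                for lab in labs:
                    pair_counts[(item, lab)] += 1
        rules = {}
        for (item, lab), c in pair_counts.items():
            confidence = c / item_counts[item]
            if c >= min_support and confidence >= min_confidence:
                rules[(item, lab)] = confidence
        return rules

    def rank_labels(items, rules):
        """Rank labels by the best confidence among the rules that fire on the instance."""
        scores = defaultdict(float)
        for (item, lab), confidence in rules.items():
            if item in items:
                scores[lab] = max(scores[lab], confidence)
        return sorted(scores, key=scores.get, reverse=True)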