Chapter 9
Inference of Co-occurring Classes: Multi-class
and Multi-label Classification
Tal Sobol-Shikler
Ben-Gurion University of the Negev
Abstract. The inference of co-occurring classes, i.e. multi-class and multi-label
classification, is relevant to various aspects of human cognition, to human-machine
interactions, and to the analysis of knowledge domains and processes that have
traditionally been investigated in the social sciences, life sciences and humanities.
Human knowledge representations usually comprise multiple classes, which are rarely
mutually exclusive. Each instance (sample) can belong to one or more of these
classes. However, full labeling is not always possible, and the amount of consistently
labeled data is often limited. The level at which a class is present often varies
between instances or sub-classes. The features that distinguish the classes are not
always known, and can differ between classes. Hence, methods should be devised to
perform multi-class and multi-label classification, and to address the challenges
posed by these complex knowledge domains. This chapter surveys current approaches
to multi-class and multi-label classification in various knowledge domains, and
approaches to data annotation (labeling). In particular, it presents a classification
algorithm designed for inferring the levels of co-occurring affective states (emotions,
mental states, attitudes etc.) from their non-verbal expressions in speech.
1 Introduction
Large volumes of domain knowledge are available, but they are not always
constructed in a manner that can be processed by machines. The selection of classes
and the relations between the classes have an immense effect on the classification
goals, design and capabilities [22, 46, 57, 61]. A large number of labels often
means that each label is represented by only a small number of samples. Manual
annotation and the number of samples per label in the training data limit the
robustness and generality of the “ground truth” for the entire classification
system. The consistency of the annotation determines the reliability of the system
and its scope of application. The data samples are represented mostly by a single
modality or by multiple (synchronized or aligned) modalities, such as text, images,
audio, video, and data from multiple sensors and from various measurement equipment
[1, 3, 50, 59, 62]. The selection of the application and of the modality to be used
 