recognition as a search for structure in data [2]. The principal task of pattern
recognition that we are interested in is dividing a manifold of data into categories
that are meaningful in a given application context. We approach this task from two
perspectives: classification and clustering. The first considers a predefined set of
categories to which data items can be assigned; the second discovers significant
groups present in a data set for which there are no predefined classes and no
examples showing what relations should hold among the data. The structures
searched for consist of rules that describe relationships in the data. These structures
can be defined through a probabilistic model that provides a reasonable explanation
of the process generating the data. We assume that the set of observed variables
explicitly defined in the data is generated from a set of hidden variables of an
underlying model. Thus, the data are described by formulae or models that capture
their principal characteristics. The ratio of the complexity of the data set to the
complexity of the formulae is defined as parsimony. In order to propose a model
for the data, it is necessarily assumed that patterns or rules exist in the data. In this
case, the data are redundant, and the patterns may be used to provide a
parsimonious description that is more concise than the data themselves [3].
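A minimal sketch of this hidden-variable view, assuming a Python environment with NumPy and scikit-learn (neither of which is mentioned in the original text), fits a Gaussian mixture to synthetic data: the observed points are explained as draws from a small number of hidden components, and the fitted parameters give a description far more concise than the data themselves.

    import numpy as np
    from sklearn.mixture import GaussianMixture

    # Synthetic observations assumed to come from two hidden sources.
    rng = np.random.default_rng(0)
    X = np.vstack([
        rng.normal(loc=[0.0, 0.0], scale=0.5, size=(500, 2)),
        rng.normal(loc=[3.0, 3.0], scale=0.5, size=(500, 2)),
    ])

    # Fit a two-component Gaussian mixture: each observed point is explained
    # as a draw from one of two hidden components with unknown parameters.
    model = GaussianMixture(n_components=2, random_state=0).fit(X)

    # The fitted means, covariances and mixing weights summarize the 2,000
    # observed values far more concisely than the data themselves.
    print("component means:\n", model.means_)
    print("mixing weights:", model.weights_)
    print("hidden component for the first five points:", model.predict(X[:5]))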
Pattern recognition is frequently achieved by using features extracted from the
raw data either because the stream of the measured data is large or because
processing the raw data directly does not allow patterns to be distinguished. Thus,
the selection and estimation of features should adequately characterize the salient
properties of the data. An appropriate set of features allows the data to be separated
into groups or clusters, where the data in one group are most similar to each other
and most dissimilar to the data in other groups. The groups extracted from the
original data manifold represent particular patterns with particular meanings, to
which an explicit label can be assigned.
Once the rules for the patterns are learned from a dataset, they can be applied to
classify new datasets.
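As a rough sketch of these two perspectives, again assuming NumPy and scikit-learn purely for illustration, the fragment below clusters feature vectors without predefined classes and, separately, learns a classifier from labelled features and applies it to new items.

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(1)
    # Hypothetical feature vectors extracted from raw measurements (two features per item).
    X = np.vstack([rng.normal([0.0, 0.0], 0.4, (100, 2)),
                   rng.normal([2.0, 2.0], 0.4, (100, 2))])

    # Clustering: discover groups without predefined classes.
    clusters = KMeans(n_clusters=2, n_init=10, random_state=1).fit_predict(X)

    # Classification: learn rules from labelled examples, then classify new data.
    y = np.repeat([0, 1], 100)                   # labels known for the training items
    clf = LogisticRegression().fit(X, y)
    X_new = rng.normal([2.0, 2.0], 0.4, (5, 2))  # previously unseen items
    print("cluster assignments:", clusters[:5])
    print("predicted classes:", clf.predict(X_new))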
The labels of the dataset employed in the learning process can be known to
different degrees of completeness. The degree of knowledge available determines
the kind of learning, i.e., supervised (all the data-label pairs are known),
semi-supervised (labels are available for a subset of the data), and unsupervised (no
labels are available). The kind of learning must be matched to the complexity of the
real-world problem, which imposes a minimum level of labelling needed to learn
the geometry of the data. In many applications, increasing the number of available
labels to an adequate level is not feasible. Frequently, obtaining unlabelled data is
relatively easy, whereas obtaining labelled data is difficult and costly. However,
this can be alleviated by noting that the performance of some algorithms improves
significantly with even a small number of labelled examples [4, 5]. Therefore,
semi-supervised learning, with its capability to incorporate different proportions of
unlabelled and labelled data, has been increasingly studied as a suitable method for
many complex problems [6].
Intelligent signal processing algorithms provide an important tool to support automatic pattern recognition, to gain insight into problem-solving, and to