Table 1.1 Some kernel definitions

Kernel         $K(t)$
Gaussian       $\frac{1}{\sqrt{2\pi}}\, e^{-(1/2)t^2}$
Rectangular    $\frac{1}{2}$ for $|t| < 1$, 0 otherwise
Triangular     $1 - |t|$ for $|t| < 1$, 0 otherwise
Epanechnikov   $\frac{3}{4}\bigl(1 - \frac{1}{5}t^2\bigr)/\sqrt{5}$ for $|t| < \sqrt{5}$, 0 otherwise
Biweight       $\frac{15}{16}(1 - t^2)^2$ for $|t| < 1$, 0 otherwise
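The kernels in Table 1.1 plug into the usual kernel density estimator $\hat{P}(x) = \frac{1}{nh}\sum_{i=1}^{n} K\bigl((x - x_i)/h\bigr)$. A minimal sketch, assuming this standard estimator form (the estimator itself is not spelled out in this excerpt) and an illustrative bandwidth `h`:

```python
import math

# Two of the kernels from Table 1.1; t is the scaled distance (x - x_i)/h.
def gaussian(t):
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def epanechnikov(t):
    s = math.sqrt(5.0)
    return (3.0 / (4.0 * s)) * (1.0 - t * t / 5.0) if abs(t) < s else 0.0

def kde(x, sample, h, kernel=gaussian):
    """Kernel density estimate: (1 / (n*h)) * sum_i K((x - x_i) / h)."""
    n = len(sample)
    return sum(kernel((x - xi) / h) for xi in sample) / (n * h)
```

The bounded-support kernels (rectangular, triangular, Epanechnikov, biweight) simply return 0 outside their interval, so distant sample points contribute nothing to the estimate.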
1.1.2 Semi-Supervised Learning
Traditionally, there are two different types of tasks in machine learning: supervised and unsupervised learning. For unsupervised learning, there is a sample $\{x_i\}$ of patterns drawn independently and identically distributed (i.i.d.) from some unknown data distribution with density $P(x)$ that has to be estimated. Supervised learning consists of estimating a functional relationship $x \to y$ between a covariate $x$ and a class variable $y \in \{1, \ldots, M\}$, with the goal of minimizing a functional of the joint data distribution $P(x, y)$, such as the probability of classification error. The marginal data distribution $P(x)$ is referred to as the input distribution. Classification can be treated as a special case of estimating the joint density $P(x, y)$.
Unsupervised learning can be considered as a density estimation technique. Many techniques for density estimation propose a latent (unobserved) class variable $y$ and estimate $P(x)$ as the mixture distribution $P(x) = \sum_{i} P(x \mid y = i)\, P(y = i)$. Note that the role of $y$ in unsupervised learning is a modelling device, rather than something related to observable reality, which is the usual role of $y$ in classification.
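The mixture formula can be made concrete with a hypothetical two-component Gaussian mixture; the priors and component parameters below are invented purely for illustration:

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

# Hypothetical latent-class model: priors P(y) and class-conditionals P(x | y).
priors = {0: 0.3, 1: 0.7}
components = {0: (-2.0, 1.0), 1: (2.0, 1.0)}  # (mean, std) per latent class

def marginal(x):
    """P(x) = sum_i P(x | y = i) P(y = i)."""
    return sum(priors[y] * normal_pdf(x, *components[y]) for y in priors)
```

Here $y$ is never observed; it exists only so that $P(x)$ can be written as a weighted sum of simple class-conditional densities.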
The semi-supervised learning (SSL) problem could be considered as belonging to either of the two previous categories of learning. If the goal is to minimize the classification error, semi-supervised learning would be a supervised task; however, if the goal is to estimate $P(x)$, it would be an unsupervised task. In the latter case, the problem places more emphasis on the density estimation, and the labelled data are treated as an auxiliary resource. Thus, "semi-unsupervised learning" would be a more suitable name for this task. The difference from a standard classification setting is that, along with a labelled sample $D_l = \{(x_i, y_i) \mid i = 1, \ldots, n\}$ drawn i.i.d. from $P(x, y)$, there is also access to an additional unlabelled sample $D_u = \{x_{n+j} \mid j = 1, \ldots, m\}$ from the marginal $P(x)$. Of special interest are the cases where $m \gg n$, which may arise in situations where obtaining an unlabelled sample is cheap and easy, while labelling the sample is expensive or difficult. We denote $X_l = (x_1, \ldots, x_n)$, $Y_l = (y_1, \ldots, y_n)$ and $X_u = (x_{n+1}, \ldots, x_{n+m})$. The unobserved labels are denoted $Y_u = (y_{n+1}, \ldots, y_{n+m})$ [6].
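This labelled/unlabelled setup can be sketched in code. The two-class synthetic data below are invented for illustration; the point is only the shape of the split, with a small labelled sample and a much larger unlabelled one:

```python
import random

random.seed(0)

# Hypothetical data: two Gaussian classes. In practice the labels would come
# from an expensive annotation process, so we keep only a few of them.
data = [(random.gauss(0.0, 1.0), 0) for _ in range(500)] \
     + [(random.gauss(3.0, 1.0), 1) for _ in range(500)]
random.shuffle(data)

n = 20                              # size of the labelled sample
D_l = data[:n]                      # pairs (x_i, y_i) drawn from P(x, y)
D_u = [x for x, _ in data[n:]]      # only the inputs x_{n+j}, from the marginal P(x)
m = len(D_u)                        # here m = 980 >> n = 20
```

Discarding the labels of `data[n:]` mimics the typical SSL regime: the inputs remain usable for estimating $P(x)$ even though their labels are unavailable.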