Table 1.1 Some kernel definitions

Kernel        $K(t)$
Gaussian      $\frac{1}{\sqrt{2\pi}}\, e^{-(1/2)t^2}$
Rectangular   $\frac{1}{2}$ for $|t| < 1$, 0 otherwise
Triangular    $1 - |t|$ for $|t| < 1$, 0 otherwise
Epanechnikov  $\frac{3}{4}\left(1 - \frac{t^2}{5}\right)\big/\sqrt{5}$ for $|t| < \sqrt{5}$, 0 otherwise
Biweight      $\frac{15}{16}\left(1 - t^2\right)^2$ for $|t| < 1$, 0 otherwise
1.1.2 Semi-Supervised Learning
Traditionally, there are two different types of tasks in machine learning: supervised and unsupervised learning. In both settings, there is a sample $\{x_i\}$ of patterns that are independently and identically distributed (i.i.d.) according to some unknown data distribution with density $P(x)$. Supervised learning consists of estimating a functional relationship $x \to y$ between a covariate $x$ and a class variable $y \in \{1, \ldots, c\}$, with the goal of minimizing a functional of the joint data distribution $P(x, y)$, such as the probability of classification error. The marginal data distribution $P(x)$ is referred to as the input distribution. Classification can be treated as a special case of estimating the joint density $P(x, y)$.
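To make the last point concrete, here is a small sketch (our own example, with hypothetical parameters) of classification via the joint density: given $P(x, y) = P(y)\,P(x \mid y)$, predicting the class that maximizes the joint density minimizes the probability of classification error.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

# Hypothetical joint density P(x, y) = P(y) P(x | y), Gaussian class conditionals.
priors = {0: 0.5, 1: 0.5}                # P(y)
params = {0: (-1.0, 1.0), 1: (1.0, 1.0)}  # (mu, sigma) of P(x | y)

def joint(x, y):
    mu, sigma = params[y]
    return priors[y] * normal_pdf(x, mu, sigma)

def bayes_classify(x):
    """Predict the class maximizing the joint density P(x, y)."""
    return max(priors, key=lambda y: joint(x, y))

print(bayes_classify(-2.0))  # → 0 (closer to the class-0 mean)
```

Any estimate of $P(x, y)$ therefore induces a classifier, though estimating the full joint density is usually harder than the classification task itself.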
Unsupervised learning can be considered as a density estimation technique. Many techniques for density estimation propose a latent (unobserved) class variable $y$ and estimate $P(x)$ as the mixture distribution $P(x) = \sum_{y=1}^{c} P(x \mid y)\, P(y)$. Note that the role of $y$ in unsupervised learning is a modelling device, rather than something related to observable reality, which is the usual role of $y$ in classification.
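The mixture form of $P(x)$ can be sketched directly in code. The following is a minimal illustration (our own example, with hypothetical parameters), evaluating $P(x) = \sum_y P(x \mid y)\,P(y)$ for a two-component Gaussian mixture in which the latent class $y$ is never observed:

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

# Hypothetical latent-class model: y in {0, 1} is unobserved.
priors = [0.3, 0.7]                      # P(y)
components = [(-1.0, 0.5), (2.0, 1.0)]   # (mu, sigma) of P(x | y)

def mixture_density(x):
    """Marginal P(x) = sum over the latent y of P(x | y) P(y)."""
    return sum(p * normal_pdf(x, mu, sigma)
               for p, (mu, sigma) in zip(priors, components))
```

In practice the component parameters themselves would be estimated from the unlabelled sample, e.g. by expectation-maximization; here they are fixed only to show the role of the latent $y$.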
The semi-supervised learning (SSL) problem can be considered as belonging to either of the two previous categories of learning. If the goal is to minimize the classification error, semi-supervised learning is a supervised task; if, however, the goal is to estimate $P(x)$, it is an unsupervised task. In this latter case, the problem places more weight on the density estimation, and the labelled data are treated as an auxiliary resource; thus "semi-unsupervised learning" would be a more suitable name for this task. The difference from a standard classification setting is that along with a labelled sample $D_l = \{(x_i, y_i) \mid i = 1, \ldots, n\}$ that is drawn i.i.d. from $P(x, y)$, one also has access to an additional unlabelled sample $D_u = \{x_{n+j} \mid j = 1, \ldots, m\}$ from the marginal $P(x)$. Of special interest are the cases where $m \gg n$, which may arise in situations where obtaining an unlabelled sample is cheap and easy, while labelling the sample is expensive or difficult. We denote $X_l = (x_1, \ldots, x_n)$, $Y_l = (y_1, \ldots, y_n)$ and $X_u = (x_{n+1}, \ldots, x_{n+m})$. The unobserved labels are denoted $Y_u = (y_{n+1}, \ldots, y_{n+m})$ [6].
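The SSL data setting above can be sketched as follows (a synthetic example of ours, with a hypothetical class-conditional distribution; the variable names mirror the notation $D_l$, $D_u$, $X_l$, $Y_l$, $X_u$):

```python
import random

random.seed(0)

# Hypothetical generative model: two classes with 1-D Gaussian conditionals.
def draw(y):
    mu = -2.0 if y == 0 else 2.0
    return random.gauss(mu, 1.0)

n, m = 10, 1000  # m >> n: few labelled points, many unlabelled ones

Y_l = [random.choice([0, 1]) for _ in range(n)]   # observed labels
X_l = [draw(y) for y in Y_l]
Y_u = [random.choice([0, 1]) for _ in range(m)]   # unobserved labels Y_u
X_u = [draw(y) for y in Y_u]

D_l = list(zip(X_l, Y_l))  # D_l = {(x_i, y_i) | i = 1..n}, i.i.d. from P(x, y)
D_u = list(X_u)            # D_u = {x_{n+j} | j = 1..m}, from the marginal P(x)
```

An SSL algorithm receives $D_l$ and $D_u$ but never $Y_u$; the hope is that the shape of the marginal $P(x)$ revealed by the large unlabelled sample improves on what the small labelled sample supports alone.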