10.2 Kernel Machines in the Framework of an RKHS
The goal of this section is twofold. First, it serves as an introduction to some basic RKHS theory that is relevant to kernel machines. Second, it provides a unified framework for kernelizing some classical linear methods, such as PCA, CCA, and support vector clustering (SVC), to allow for nonlinear structure exploration. For further details, we refer the reader to Aronszajn for the theory of reproducing kernels and reproducing kernel Hilbert spaces, and to Berlinet and Thomas-Agnan for their usage in probability, statistics and machine learning. Listed below are some definitions and basic properties.
Let $\mathcal{X} \subset \mathbb{R}^p$ be the sample space of the data, which serves here as an index set. A real symmetric function $\kappa: \mathcal{X} \times \mathcal{X} \to \mathbb{R}$ is said to be positive definite if, for any positive integer $m$, any sequence of numbers $a_1, a_2, \ldots, a_m \in \mathbb{R}$, and points $x_1, x_2, \ldots, x_m \in \mathcal{X}$, we have

$$\sum_{i,j=1}^{m} a_i a_j \kappa(x_i, x_j) \ge 0 .$$

An RKHS is a Hilbert space of real-valued functions on $\mathcal{X}$ with the property that all evaluation functionals are bounded linear functionals. Note that an RKHS is a Hilbert space of pointwise-defined functions, where $\mathcal{H}$-norm convergence implies pointwise convergence.
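The positive-definiteness condition can be checked numerically on a finite sample. The following is a minimal sketch, assuming a Gaussian kernel and randomly generated points; the function name `gaussian_kernel`, the bandwidth `sigma`, and the data are illustrative choices, not part of the text. It evaluates the quadratic form $\sum_{i,j} a_i a_j \kappa(x_i, x_j)$ for random coefficient vectors and also inspects the smallest eigenvalue of the kernel matrix.

```python
import numpy as np

def gaussian_kernel(X, U, sigma=1.0):
    """Kernel matrix with entries exp(-||x - u||^2 / (2 sigma^2)) (assumed kernel)."""
    sq_dists = (
        np.sum(X**2, axis=1)[:, None]
        + np.sum(U**2, axis=1)[None, :]
        - 2.0 * X @ U.T
    )
    # Guard against tiny negative squared distances caused by round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * sigma**2))

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 3))          # m = 30 illustrative points in R^3
K = gaussian_kernel(X, X)             # K[i, j] = kappa(x_i, x_j)

# Quadratic form a' K a for 100 random coefficient vectors a.
quad_forms = np.array([a @ K @ a for a in rng.normal(size=(100, 30))])

# A symmetric matrix is positive semidefinite iff all its eigenvalues are >= 0.
min_eig = np.linalg.eigvalsh(K).min()

print("symmetric:", np.allclose(K, K.T))
print("smallest quadratic form:", quad_forms.min())
print("smallest eigenvalue:", min_eig)
```

Random coefficient vectors can only fail to find a violation; the eigenvalue check is the conclusive one for a given set of points, up to floating-point tolerance.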
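For every positive definite kernel $\kappa$ on $\mathcal{X} \times \mathcal{X}$ there is a corresponding unique RKHS, denoted by $\mathcal{H}_\kappa$, of real-valued functions on $\mathcal{X}$. Conversely, for every RKHS $\mathcal{H}$ there is a unique positive definite kernel $\kappa$ such that

$$\langle f(\cdot), \kappa(x, \cdot) \rangle_{\mathcal{H}} = f(x), \quad \forall f \in \mathcal{H}, \ \forall x \in \mathcal{X},$$

which is known as the reproducing property. We say that this RKHS admits the kernel $\kappa$. A positive definite kernel is also termed a "reproducing kernel."

A reproducing kernel $\kappa$ that satisfies the condition $\int_{\mathcal{X} \times \mathcal{X}} \kappa^2(x, u) \, dx \, du < \infty$ has a countable discrete spectrum given by

$$\kappa(x, u) = \sum_q \lambda_q \phi_q(x) \phi_q(u), \quad \text{or} \quad \kappa = \sum_q \lambda_q \, \phi_q \otimes \phi_q \ \text{for short.}$$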
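A finite-sample analogue of this spectral expansion is the eigendecomposition of the kernel matrix evaluated on a set of points. The sketch below assumes a Gaussian kernel on a one-dimensional sample (both illustrative choices) and shows that the kernel matrix is recovered as a sum of rank-one terms $\lambda_k v_k v_k'$, with the truncation error shrinking as more terms are kept. The matrix eigenpairs stand in for $(\lambda_q, \phi_q)$ here; they are not the eigenpairs of the kernel itself.

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.sort(rng.uniform(-1.0, 1.0, size=50))        # illustrative 1-D sample
K = np.exp(-(x[:, None] - x[None, :])**2 / 0.5)     # Gaussian kernel matrix (assumed kernel)

# eigh returns eigenvalues in ascending order; reverse to get lambda_1 >= lambda_2 >= ...
eigvals, eigvecs = np.linalg.eigh(K)
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]

def truncated(q):
    """Sum of the first q rank-one terms lambda_k * v_k v_k'."""
    return (eigvecs[:, :q] * eigvals[:q]) @ eigvecs[:, :q].T

# Frobenius-norm error of the truncated expansion for increasing q.
terms = (1, 2, 5, 10, 50)
errors = [np.linalg.norm(K - truncated(q)) for q in terms]
for q, err in zip(terms, errors):
    print(f"q = {q:2d}  error = {err:.3e}")
```

With all 50 terms the reconstruction is exact up to round-off, which mirrors the full expansion in the text.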
The main idea behind kernel machines is to first map the data in a Euclidean space $\mathcal{X} \subset \mathbb{R}^p$ into an infinite-dimensional Hilbert space. Next, a particular classical statistical procedure, such as PCA, is carried out in this feature Hilbert space. Such a hybrid model of a classical statistical procedure and a kernel machine is nonparametric in nature, but when fitting the data it uses the underlying parametric procedure (for example, PCA finds some of the main linear components). The extra effort involved is the preparation of the kernel data before they are fed into some classical procedures. Below we will introduce two different but isomorphic maps that are used to embed the underlying Euclidean sample space into a feature Hilbert space. Consider the transformation
$$\Phi: x \mapsto \left( \sqrt{\lambda_1}\,\phi_1(x), \sqrt{\lambda_2}\,\phi_2(x), \ldots, \sqrt{\lambda_q}\,\phi_q(x), \ldots \right)' .$$
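To make the role of this transformation concrete, the following sketch builds an empirical counterpart of $\Phi$ from the eigendecomposition of a kernel matrix and then runs ordinary PCA on the resulting feature vectors. It is a finite-sample illustration under stated assumptions, not the construction used in the text: the Gaussian kernel, the random data, and the variable names (`Phi`, `explained`) are all hypothetical, and the matrix eigenpairs stand in for $(\lambda_q, \phi_q)$.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(40, 2))                              # illustrative data in R^2
sq = np.sum((X[:, None, :] - X[None, :, :])**2, axis=2)   # pairwise squared distances
K = np.exp(-sq / 2.0)                                     # Gaussian kernel, sigma = 1 (assumed)

eigvals, eigvecs = np.linalg.eigh(K)
eigvals = np.clip(eigvals[::-1], 0.0, None)               # descending; clip round-off negatives
eigvecs = eigvecs[:, ::-1]

# Row i is the empirical feature vector (sqrt(l_1) v_1[i], sqrt(l_2) v_2[i], ...).
Phi = eigvecs * np.sqrt(eigvals)

# Inner products of the feature vectors reproduce the kernel values.
print("Phi Phi' equals K:", np.allclose(Phi @ Phi.T, K))

# Classical PCA on the feature vectors: center the columns, then take the SVD.
Phi_centered = Phi - Phi.mean(axis=0)
_, singular_values, _ = np.linalg.svd(Phi_centered, full_matrices=False)
explained = singular_values**2 / np.sum(singular_values**2)
print("variance share of first three components:", explained[:3])
```

Because the inner products of the centered feature vectors equal the doubly centered kernel matrix, the same components could be obtained from the kernel matrix alone, without forming `Phi` explicitly.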