$$\lim_{l\to\infty}\frac{H^{\Lambda}(l)}{l}=0 \qquad (8.11)$$
(2) We use the annealed VC entropy to define the following equation describing
a sufficient condition for consistency of the ERM principle.
$$\lim_{l\to\infty}\frac{H_{\mathrm{ann}}^{\Lambda}(l)}{l}=0 \qquad (8.12)$$
where the annealed VC entropy $H_{\mathrm{ann}}^{\Lambda}(l)=\ln E\,N^{\Lambda}(z_1,\ldots,z_l)$.
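As a concrete illustration of the quantity $N^{\Lambda}(z_1,\ldots,z_l)$ and of the annealed entropy, the sketch below counts dichotomies and estimates $H_{\mathrm{ann}}^{\Lambda}(l)=\ln E\,N^{\Lambda}(z_1,\ldots,z_l)$ by Monte Carlo for a hypothetical class of one-dimensional threshold classifiers on samples drawn uniformly from $[0,1]$; the function class, the sampling distribution, and all names are assumptions made for this example, not taken from the source.

```python
import math
import random

def n_dichotomies(sample):
    """N(z_1, ..., z_l): the number of distinct labelings that threshold
    classifiers f_t(z) = [z >= t] induce on the given sample."""
    labelings = set()
    # Candidate thresholds: each sample point, plus one above the maximum.
    for t in sorted(sample) + [max(sample) + 1.0]:
        labelings.add(tuple(1 if z >= t else 0 for z in sample))
    return len(labelings)

def annealed_entropy(l, trials=200, seed=0):
    """Monte Carlo estimate of H_ann(l) = ln E[N(Z_1, ..., Z_l)],
    with Z_i drawn i.i.d. uniform on [0, 1]."""
    rng = random.Random(seed)
    mean_n = sum(n_dichotomies([rng.random() for _ in range(l)])
                 for _ in range(trials)) / trials
    return math.log(mean_n)

for l in (5, 20, 80):
    print(l, annealed_entropy(l) / l)  # the ratio shrinks toward 0
```

For this class $N^{\Lambda}$ equals $l+1$ on any set of distinct points, so $H_{\mathrm{ann}}^{\Lambda}(l)=\ln(l+1)$ and the ratio $H_{\mathrm{ann}}^{\Lambda}(l)/l$ vanishes as $l\to\infty$, i.e. condition (8.12) holds for this toy class.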
(3) We use the growth function to define the following equation describing a
sufficient condition for consistency of the ERM principle.
$$\lim_{l\to\infty}\frac{G^{\Lambda}(l)}{l}=0 \qquad (8.13)$$
where the growth function $G^{\Lambda}(l)=\ln \sup_{z_1,\ldots,z_l} N^{\Lambda}(z_1,\ldots,z_l)$.
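As a concrete check of condition (8.13), the sketch below evaluates the growth function of a hypothetical class of one-dimensional threshold classifiers: any $l$ distinct points receive exactly $l+1$ distinct labelings under such classifiers, so $\sup N^{\Lambda}=l+1$ and $G^{\Lambda}(l)=\ln(l+1)$. The choice of class is an assumption made for illustration.

```python
import math

# For 1-D threshold classifiers f_t(z) = [z >= t], any l distinct points
# receive exactly l + 1 distinct labelings, so sup N = l + 1 and the
# growth function is G(l) = ln(l + 1).  (Class chosen only for illustration.)
def growth(l):
    return math.log(l + 1)

for l in (10, 100, 1000, 10**6):
    print(l, growth(l) / l)  # G(l)/l -> 0, so condition (8.13) is met
```

Because $\ln(l+1)$ grows sublinearly, $G^{\Lambda}(l)/l \to 0$, so the ERM principle is consistent for this class.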
8.3 Structural Risk Minimization Inductive Principle
Statistical learning theory systematically analyzes the relationship among the function set, the empirical risk, and the actual risk, and establishes bounds on the generalization ability of learning machines (Vapnik, 1995). Here we consider only functions corresponding to the two-class pattern recognition case. For the set of indicator functions (including the function minimizing the empirical risk), choose some $\eta$ such that $0 \le \eta \le 1$. Then for the empirical risk $R_{\mathrm{emp}}(w)$ and the actual risk $R(w)$, with probability $1-\eta$ the following bound holds (Burges, 1998):
$$R(w) \le R_{\mathrm{emp}}(w) + \sqrt{\frac{h\left(\ln(2l/h)+1\right)-\ln(\eta/4)}{l}}$$
where $h$ is a non-negative integer called the Vapnik–Chervonenkis (VC) dimension, and $l$ is the total number of samples.

It follows that the bound on the actual risk $R(w)$ has two parts: one is the empirical risk $R_{\mathrm{emp}}(w)$, defined as the measured mean error rate on the training set (for a fixed, finite number of observations); the other is the VC confidence. The confidence interval reflects the maximal difference between the actual risk $R(w)$ and the empirical risk $R_{\mathrm{emp}}(w)$.
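The bound above is straightforward to evaluate numerically. A minimal sketch, in which the empirical risk, the VC dimension, the sample count, and the confidence level are all hypothetical illustration values:

```python
import math

def vc_confidence(h, l, eta):
    """VC confidence term: sqrt((h * (ln(2l/h) + 1) - ln(eta/4)) / l)."""
    return math.sqrt((h * (math.log(2 * l / h) + 1) - math.log(eta / 4)) / l)

def risk_bound(r_emp, h, l, eta):
    """Upper bound on the actual risk R(w), holding with probability 1 - eta."""
    return r_emp + vc_confidence(h, l, eta)

# Hypothetical numbers: empirical risk 0.05, VC dimension h = 10,
# l = 10,000 samples, confidence level 1 - eta = 0.95.
print(risk_bound(0.05, h=10, l=10_000, eta=0.05))  # ≈ 0.145
```

Note how the confidence term shrinks as $l$ grows for fixed $h$, and grows with $h$ for fixed $l$, which is exactly the trade-off that structural risk minimization exploits.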