[Figure 4 appears here: four panels (a)-(d); panels (a) and (b) show classification data over $x_1$, panels (c) and (d) show regression surfaces $y$ over $(x_1, x_2)$, all on $[0, 1]$ axes]
Fig. 4. System and data for classification (a, b), regression estimation (c, d)
In the case of infinite sets with a continuum of elements, the learning machine was trained by the least-squares criterion. We remark that other learning approaches could obviously be used here, e.g. maximum likelihood or the SVM criterion [2,1,26]. If we denote the bases $\exp\left(-\frac{\|x-\mu_k\|^2}{2\sigma_k^2}\right)$ by $g_k(x)$ and calculate the matrix of bases at the data points
$$
G = \begin{pmatrix}
1 & g_1(x_1) & g_2(x_1) & \cdots & g_K(x_1) \\
1 & g_1(x_2) & g_2(x_2) & \cdots & g_K(x_2) \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & g_1(x_I) & g_2(x_I) & \cdots & g_K(x_I)
\end{pmatrix},
\qquad (51)
$$
we can find the optimal vector of coefficients $w$ by the pseudo-inverse operation as follows:
$$
(w_0, w_1, \ldots, w_K)^T = (G^T G)^{-1} G^T Y,
\qquad (52)
$$
where $Y = (y_1, y_2, \ldots, y_I)^T$ is the vector of training target values.
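As an illustration of equations (51) and (52), the sketch below builds the matrix $G$ from Gaussian bases and solves for $w$. It is a minimal NumPy rendering; the centers, widths, and toy data are hypothetical placeholders, not the settings used in this chapter.

import numpy as np

def gaussian_bases(X, mu, sigma):
    # g_k(x) = exp(-||x - mu_k||^2 / (2 sigma_k^2)) for every center k
    sq_dist = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)   # shape (I, K)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))

def least_squares_fit(X, Y, mu, sigma):
    # Build G of eq. (51): a column of ones followed by the K basis columns
    G = np.hstack([np.ones((X.shape[0], 1)), gaussian_bases(X, mu, sigma)])
    # Solve eq. (52); lstsq yields the same pseudo-inverse solution without
    # explicitly forming (G^T G)^{-1}, which can be ill-conditioned
    w, *_ = np.linalg.lstsq(G, Y, rcond=None)
    return w

# Hypothetical toy regression data on [0, 1]^2, echoing panels (c, d) of Fig. 4
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(100, 2))
Y = np.sin(2.0 * np.pi * X[:, 0]) + 0.1 * rng.normal(size=100)
mu = rng.uniform(0.0, 1.0, size=(5, 2))    # assumed basis centers (K = 5)
sigma = np.full(5, 0.3)                    # assumed common width
print(least_squares_fit(X, Y, mu, sigma))  # (w_0, w_1, ..., w_K)

Solving via np.linalg.lstsq rather than forming the normal equations directly is purely a numerical-stability choice; mathematically both give the solution of equation (52).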
4.4 Experiment Results and Comments
Experiments involved trying out different settings of all relevant constants: the number of terms in the approximating functions ($K$), the number of functions ($N$) in the case of finite sets or the VC dimension ($h$) in the case of infinite sets, the sample size ($I$), and the number of cross-validation folds ($n$). For each fixed setting of the constants, an experiment with repetitions was performed, during which we measured the cross-validation outcome $C$ after each repetition. The range of these outcomes was then compared to the interval implied by the theorems we proved.
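To make the measurement loop concrete, here is a minimal sketch of one such experiment with repetitions, assuming mean squared error as the cross-validation outcome $C$; the repetition count, sample size $I$, fold number $n$, and the plain linear stand-in model are illustrative assumptions, not the actual experimental settings.

import numpy as np

def cv_outcome(X, Y, n_folds, rng):
    # n-fold cross-validation estimate C (mean squared test error)
    folds = np.array_split(rng.permutation(len(X)), n_folds)
    errs = []
    for k in range(n_folds):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        # stand-in learning machine: affine least-squares fit
        A = np.c_[np.ones(len(train)), X[train]]
        w, *_ = np.linalg.lstsq(A, Y[train], rcond=None)
        pred = np.c_[np.ones(len(test)), X[test]] @ w
        errs.append(np.mean((pred - Y[test]) ** 2))
    return np.mean(errs)

rng = np.random.default_rng(1)
outcomes = []
for _ in range(100):                         # repetitions (assumed count)
    X = rng.uniform(0.0, 1.0, size=(50, 2))  # sample size I = 50 (assumed)
    Y = X[:, 0] + 0.1 * rng.normal(size=50)
    outcomes.append(cv_outcome(X, Y, n_folds=5, rng=rng))
# observed range of C, to be compared with the theorems' interval
print(min(outcomes), max(outcomes))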