Fig. 4.15 Separation of hyperplane in SVM modeling
As we are interested only in support vector regression (SVR), we can look into its mathematical details. Let [(x_1, y_1), (x_2, y_2), ..., (x_i, y_i)] be a given training set, where x_i denotes an input vector and y_i is its corresponding output value. SVR maps x into a feature space via a nonlinear function Φ(x) and then finds a regression function f(x) = w · Φ(x) + b which can best approximate the actual output y within an error tolerance ε, where w and b are the regression function parameters known as the weight vector and bias value, respectively, and Φ is known as a nonlinear mapping function [17]. Based on the SVM theorem, f(x) is obtained by solving the optimization problem shown below:
\[
\min_{w,\,b,\,\xi,\,\xi^{*}} \ \frac{1}{2}\lVert w\rVert^{2} + C\sum_{k=1}^{N}\bigl(\xi_{k}+\xi_{k}^{*}\bigr)
\tag{4.55}
\]

subject to

\[
\begin{cases}
y_{k} - w^{T}\Phi(x_{k}) - b \le \varepsilon + \xi_{k}\\[2pt]
w^{T}\Phi(x_{k}) + b - y_{k} \le \varepsilon + \xi_{k}^{*}\\[2pt]
\xi_{k},\ \xi_{k}^{*} \ge 0
\end{cases}
\qquad k = 1, 2, 3, \ldots, N
\tag{4.56}
\]
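To make (4.55)-(4.56) concrete, the minimal sketch below (not from the chapter; the linear feature map Φ(x) = x, the toy data, and the function name svr_primal_objective are assumptions for illustration) evaluates the primal objective for a candidate (w, b), with each slack variable taken at its smallest feasible value, i.e. the amount by which the prediction leaves the ε-tube.

import numpy as np

def svr_primal_objective(w, b, X, y, C=1.0, eps=0.1):
    """Return 0.5*||w||^2 + C * sum_k (xi_k + xi*_k) for the linear case Phi(x) = x."""
    residual = y - (X @ w + b)                   # y_k - w^T Phi(x_k) - b
    xi = np.maximum(0.0, residual - eps)         # violation above the eps-tube
    xi_star = np.maximum(0.0, -residual - eps)   # violation below the eps-tube
    return 0.5 * np.dot(w, w) + C * np.sum(xi + xi_star)

# Toy data: y = 2*x + small noise
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(50, 1))
y = 2.0 * X[:, 0] + 0.05 * rng.standard_normal(50)

print(svr_primal_objective(np.array([2.0]), 0.0, X, y, C=10.0))  # near-optimal candidate, small penalty
print(svr_primal_objective(np.array([0.0]), 0.0, X, y, C=10.0))  # poor candidate, large slack penalty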
In the above equation, C is a positive trade-off parameter which determines the degree of error penalized in the training stage, and ξ and ξ* are slack variables which penalize training errors according to Vapnik's ε-insensitive loss function. In the optimization problem, the term \(\frac{1}{2}\lVert w\rVert^{2}\) improves the generalization of the SVM by regulating the degree of model complexity, while the term \(C\sum_{k=1}^{N}(\xi_{k}+\xi_{k}^{*})\) controls the degree of empirical risk. Figure 4.16 illustrates the concept of SVR.
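As an illustration of this trade-off, the short sketch below fits scikit-learn's SVR for a few (C, ε) pairs on toy data and reports the number of support vectors and the training fit. The use of scikit-learn, the RBF kernel, and the toy data are assumptions for illustration, not the chapter's setup.

import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(1)
X = np.sort(rng.uniform(0.0, 5.0, size=(80, 1)), axis=0)
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(80)

for C, eps in [(0.1, 0.1), (100.0, 0.1), (1.0, 0.5)]:
    model = SVR(kernel="rbf", C=C, epsilon=eps).fit(X, y)
    # Small C -> stronger regularization (smoother f); large C -> lower empirical
    # risk on the training set; larger epsilon -> wider tube, fewer support vectors.
    print(f"C={C:>6}, eps={eps}: "
          f"{model.support_vectors_.shape[0]} support vectors, "
          f"train score={model.score(X, y):.3f}")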
The above problem can be reformulated into its dual form using Lagrangian multipliers (α, α*):
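In its standard textbook form, and writing K(x_k, x_l) = Φ(x_k) · Φ(x_l) for the kernel, the dual can be sketched as

\[
\max_{\alpha,\,\alpha^{*}} \ -\frac{1}{2}\sum_{k=1}^{N}\sum_{l=1}^{N}(\alpha_{k}-\alpha_{k}^{*})(\alpha_{l}-\alpha_{l}^{*})\,K(x_{k},x_{l})
-\varepsilon\sum_{k=1}^{N}(\alpha_{k}+\alpha_{k}^{*})
+\sum_{k=1}^{N} y_{k}(\alpha_{k}-\alpha_{k}^{*})
\]

subject to

\[
\sum_{k=1}^{N}(\alpha_{k}-\alpha_{k}^{*}) = 0, \qquad 0 \le \alpha_{k},\ \alpha_{k}^{*} \le C, \qquad k = 1, 2, \ldots, N,
\]

with the regression function recovered as \(f(x) = \sum_{k=1}^{N}(\alpha_{k}-\alpha_{k}^{*})\,K(x_{k}, x) + b\).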