Environmental Engineering Reference
In-Depth Information
This empirical error now only uses the values $\mathcal{M}(x^{(i)})$ that are already available from
Equation 6.30 and is thus readily computable. Note that the normalized quantity
$$ R^2 = 1 - \frac{\widehat{\mathrm{Err}}_E}{\mathrm{Var}[Y]}, \tag{6.37} $$
is the well-known coefficient of determination in regression analysis, where $\mathrm{Var}[Y]$ is the
empirical variance of the set of response quantities in Equation 6.30.
However, $\widehat{\mathrm{Err}}_E$ usually underestimates (sometimes severely) the real generalization error
$\mathrm{Err}_G$. As an example, in the limit case where an interpolating polynomial is fitted to
the ED, $\widehat{\mathrm{Err}}_E$ would be exactly zero while $\mathrm{Err}_G$ in Equation 6.34 would generally not be: this
phenomenon is known as overfitting.
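The overfitting effect can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the one-dimensional function `model` stands in for the computational model, a Legendre least-squares fit stands in for the PC expansion, the empirical error is taken as the mean squared residual over the ED points, the variance uses the 1/(n − 1) normalization, and the generalization error is approximated on a dense validation grid.

```python
import numpy as np
from numpy.polynomial import Legendre


def model(x):
    # Hypothetical stand-in for the computational model M
    return np.sin(3.0 * x) + 0.5 * x**2


def error_measures(x_ed, y_ed, x_val, y_val, degree):
    """Return (empirical error, validation error, R^2) for one polynomial degree."""
    surrogate = Legendre.fit(x_ed, y_ed, degree, domain=[-1, 1])
    err_emp = np.mean((y_ed - surrogate(x_ed)) ** 2)    # assumed form of the empirical error
    err_gen = np.mean((y_val - surrogate(x_val)) ** 2)  # proxy for the generalization error
    r2 = 1.0 - err_emp / np.var(y_ed, ddof=1)           # Equation 6.37
    return err_emp, err_gen, r2


# Experimental design (ED) of n = 8 points and a dense validation grid
x_ed = np.linspace(-1.0, 1.0, 8)
y_ed = model(x_ed)
x_val = np.linspace(-1.0, 1.0, 2001)
y_val = model(x_val)

# Degree 3: ordinary least-squares fit; degree 7: interpolates the 8 ED points
err_emp_low, err_gen_low, r2_low = error_measures(x_ed, y_ed, x_val, y_val, 3)
err_emp_interp, err_gen_interp, r2_interp = error_measures(x_ed, y_ed, x_val, y_val, 7)

print("degree 3:", err_emp_low, err_gen_low, r2_low)
print("degree 7:", err_emp_interp, err_gen_interp, r2_interp)
```

With the interpolating degree, the empirical error drops to round-off level and R^2 is numerically equal to 1, although the error measured on the validation grid stays well above round-off.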
6.3.5.2 Leave-one-out cross-validation
A compromise between fair error estimation and affordable computational cost may be
obtained by leave-one-out (LOO) cross-validation, which was originally proposed by
Allen (1971) and Geisser (1975). The idea is to use different sets of points to (i) build a PC
expansion and (ii) compute the error with respect to the original computational model. Starting
from the full ED $\mathcal{X}$, LOO cross-validation sets one point apart, say $x^{(i)}$, and builds a PC
expansion denoted by $\mathcal{M}^{PC \setminus i}(\cdot)$ from the n − 1 remaining points, that is, from the ED
$$ \mathcal{X} \setminus x^{(i)} \stackrel{\text{def}}{=} \{ x^{(1)}, \dots, x^{(i-1)}, x^{(i+1)}, \dots, x^{(n)} \}. $$
The predicted residual error at that point reads:
$$ \Delta_i \stackrel{\text{def}}{=} \mathcal{M}(x^{(i)}) - \mathcal{M}^{PC \setminus i}(x^{(i)}). \tag{6.38} $$
The PRESS coefficient (predicted residual sum of squares) and the LOO error respectively
read:
$$ \mathrm{PRESS} = \sum_{i=1}^{n} \Delta_i^2, \tag{6.39} $$
$$ \widehat{\mathrm{Err}}_{LOO} = \frac{1}{n} \sum_{i=1}^{n} \Delta_i^2. \tag{6.40} $$
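Taken literally, the definitions of the predicted residuals, PRESS, and the LOO error can be sketched in Python as follows. This is only an illustrative brute-force version under stated assumptions: the function `model`, the helpers `design_matrix` and `loo_residuals`, the one-dimensional Legendre basis, and the ordinary least-squares fit are hypothetical stand-ins for the computational model and the PC expansion.

```python
import numpy as np
from numpy.polynomial import legendre


def model(x):
    # Hypothetical stand-in for the computational model M
    return x * np.sin(3.0 * x)


def design_matrix(x, degree):
    # Columns are the Legendre polynomials P_0..P_degree evaluated at x
    return legendre.legvander(x, degree)


def loo_residuals(x, y, degree):
    """Predicted residuals Delta_i, refitting the surrogate n times."""
    n = len(x)
    deltas = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i  # reduced ED: all points except x^(i)
        coeffs, *_ = np.linalg.lstsq(
            design_matrix(x[keep], degree), y[keep], rcond=None
        )
        prediction = design_matrix(x[i:i + 1], degree) @ coeffs
        deltas[i] = y[i] - prediction[0]  # predicted residual at the left-out point
    return deltas


degree = 4
x_ed = np.linspace(-1.0, 1.0, 15)
y_ed = model(x_ed)

deltas = loo_residuals(x_ed, y_ed, degree)
press = np.sum(deltas**2)     # Equation 6.39
err_loo = press / len(x_ed)   # Equation 6.40

# Empirical error of a single fit on the full ED, for comparison
coeffs_full, *_ = np.linalg.lstsq(design_matrix(x_ed, degree), y_ed, rcond=None)
err_emp = np.mean((y_ed - design_matrix(x_ed, degree) @ coeffs_full) ** 2)

print("PRESS:", press)
print("LOO error:", err_loo, "empirical error:", err_emp)
```

The loop refits the surrogate n times, which is affordable here only because each fit is a small least-squares problem. For such a fit the LOO error is never smaller than the empirical error computed on the full ED.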
Similar to the coefficient of determination in Equation 6.37, the $Q^2$ indicator defined by
$$ Q^2 = 1 - \frac{\widehat{\mathrm{Err}}_{LOO}}{\mathrm{Var}[Y]} \tag{6.41} $$
is a normalized measure of the accuracy of the metamodel. From the above equations,
one could think that evaluating $\widehat{\mathrm{Err}}_{LOO}$ is computationally demanding since it is based
on the sum of n different predicted residuals, each of them obtained from a different PC
expansion. However, algebraic derivations may be carried out to compute $\widehat{\mathrm{Err}}_{LOO}$ from
a single PC expansion analysis using the full original ED $\mathcal{X}$ (details may be found in
 
 