Furthermore, the change in the estimate that occurs when the $i$th observation is deleted is

$$\hat{y}_i - \hat{y}_i^{(-i)} = \frac{S_{ii}}{1 - S_{ii}}\, r_i \qquad (4.9)$$

where $\hat{y}_i^{(-i)}$ is the LS estimate of $y_i$ obtained by leaving out the $i$th observation of the vector $\mathbf{y}$ and the $i$th row of the matrix $\mathbf{X}$. The method is useful to assess the quality of the analysis by using the discarded observation, but impractical for large systems. The formula shows that the impact of deleting $(y_i, \mathbf{x}_i)$ on $\hat{y}_i$ can be computed by knowing only the residual $r_i$ and the diagonal element $S_{ii}$: the nearer the self-sensitivity $S_{ii}$ is to one, the greater the impact on the estimate $\hat{y}_i$. A related result concerns the so-called cross-validation (CV) score, that is, the LS objective function obtained when each data point is in turn deleted (Wahba 1990, Theorem 4.2.1):

$$\sum_{i=1}^{m} \left( y_i - \hat{y}_i^{(-i)} \right)^2 = \sum_{i=1}^{m} \frac{\left( y_i - \hat{y}_i \right)^2}{\left( 1 - S_{ii} \right)^2} \qquad (4.10)$$

This theorem shows that the CV score can be computed by relying on the all-data estimate $\hat{\mathbf{y}}$ and the self-sensitivities, without actually performing $m$ separate LS regressions on the leave-one-out samples. In the LS case this follows directly from (4.9): adding the residual $r_i = y_i - \hat{y}_i$ to both sides gives $y_i - \hat{y}_i^{(-i)} = r_i / (1 - S_{ii})$, and squaring and summing over $i$ yields (4.10). Moreover, (4.9) shows how to compute self-sensitivities from a leave-one-out experiment.

The definitions of influence matrix (4.4) and self-sensitivity (4.5) are rather general and can also be applied to non-LS and nonparametric statistics. In spline regression, for example, the interpretation remains essentially the same as in ordinary linear regression and most of the results, like the CV theorem above, still apply. In this context, Craven and Wahba (1979) proposed the generalized-CV score, replacing in (4.10) $S_{ii}$ by the mean $\mathrm{tr}(\mathbf{S})/q$. For further applications of influence diagnostics beyond usual LS regression (and further references) see Ye (1998) and Shen et al. (2002). The notions related to the influence matrix introduced here will be derived in the following section in the context of a statistical analysis scheme used for data assimilation in numerical weather prediction (NWP).
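
The leave-one-out identities (4.9) and (4.10) can be checked numerically on a small example. The sketch below is only an illustration under stated assumptions: it uses a straight-line LS fit, made-up data, and hypothetical helper names (`fit_line`, `self_sensitivities`, `leave_one_out_estimates`), none of which come from the text. For a straight-line fit the self-sensitivities have the standard closed form $S_{ii} = 1/m + (x_i - \bar{x})^2 / \sum_j (x_j - \bar{x})^2$, which is used here in place of forming the full influence matrix.

```python
# Numerical check of the leave-one-out identities (4.9) and (4.10) for a
# straight-line LS fit y = a + b*x. Data and function names are illustrative
# assumptions, not taken from the text.

def fit_line(xs, ys):
    """Ordinary LS fit of y = a + b*x; returns (a, b)."""
    m = len(xs)
    x_mean = sum(xs) / m
    y_mean = sum(ys) / m
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    b = sxy / sxx
    return y_mean - b * x_mean, b


def self_sensitivities(xs):
    """Diagonal S_ii of the influence (hat) matrix for the straight-line fit."""
    m = len(xs)
    x_mean = sum(xs) / m
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return [1.0 / m + (x - x_mean) ** 2 / sxx for x in xs]


def leave_one_out_estimates(xs, ys):
    """Brute force: refit m times, each time predicting the deleted point."""
    out = []
    for i in range(len(xs)):
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        out.append(a + b * xs[i])
    return out


# Hypothetical data; the last point lies far from the others in x,
# so its self-sensitivity is the largest.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 10.0]
ys = [0.1, 1.3, 1.9, 3.2, 3.8, 12.0]

a, b = fit_line(xs, ys)
y_hat = [a + b * x for x in xs]
resid = [y - yh for y, yh in zip(ys, y_hat)]
s_ii = self_sensitivities(xs)
y_loo = leave_one_out_estimates(xs, ys)

# (4.9): change in the estimate when observation i is deleted.
lhs_49 = [yh - yl for yh, yl in zip(y_hat, y_loo)]
rhs_49 = [s / (1.0 - s) * r for s, r in zip(s_ii, resid)]

# (4.10): CV score from m refits versus from the all-data fit alone.
cv_brute = sum((y - yl) ** 2 for y, yl in zip(ys, y_loo))
cv_short = sum((r / (1.0 - s)) ** 2 for r, s in zip(resid, s_ii))

print(max(abs(l - r) for l, r in zip(lhs_49, rhs_49)))  # rounding error only
print(abs(cv_brute - cv_short))                          # rounding error only
```

The brute-force route needs one refit per observation, whereas the shortcut needs only the all-data residuals and the $S_{ii}$. This is why the identity matters for large systems, where repeated refits are impractical.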

4.3 Observational Influence and Self-Sensitivity for a DA Scheme

4.3.1 Linear Statistical Estimation in Numerical Weather Prediction

Data assimilation systems for NWP provide estimates of the atmospheric state $\mathbf{x}$ by combining meteorological observations $\mathbf{y}$ with prior (or background) information $\mathbf{x}_b$. A simple Bayesian Normal model provides the solution as the posterior expectation for $\mathbf{x}$, given $\mathbf{y}$ and $\mathbf{x}_b$. The same solution can be achieved from a classical