error is necessary to introduce the observation error correlation but also observation
and background error variances must be of similar size. Incorrect specifications of
background and observation error covariance matrices can be identified, interpreted
and better understood by the use of influence matrix diagnostics for the variety of
observation types and observed variables used in the data assimilation system.
4.1 Introduction
Over the years, data assimilation schemes have evolved into very complicated systems,
such as the four-dimensional variational system (4D-Var) (Rabier et al. 2000)
at the European Centre for Medium-Range Weather Forecasts (ECMWF). The
scheme handles a large variety of both space- and surface-based meteorological
observations. It combines the observations with prior (or background) information
of the atmospheric state and uses a comprehensive (linearized) forecast model to
ensure that the observations are given a dynamically realistic, as well as statistically
likely response in the analysis.
Effective monitoring of such a complex system, with the order of $10^9$ degrees of
freedom and more than $10^7$ observations per 12-h assimilation cycle, is a necessity.
The monitoring cannot be restricted to just a few indicators, but a complex set of
measures is needed to indicate how different variables and regions influence the
data assimilation (DA) scheme. Measures of the observational influence are useful
for understanding the DA scheme itself: How large is the influence of the latest data
on the analysis and how much influence is due to the background? How much would
the analysis change if one single influential observation were removed? How much
information is extracted from the available data? It is the aim of this work to provide
such analytical tools.
We turn to the diagnostic methods that have been developed for monitoring
statistical multiple regression analyses. In fact, 4D-Var is a special case of the
Generalized Least Squares (GLS) problem (Talagrand 1997) for weighted regression,
thoroughly investigated in the statistical literature.
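The link between a variational analysis and weighted regression can be illustrated with a toy scalar problem. The sketch below is not the ECMWF system: it assumes a single scalar state observed directly (observation operator equal to one), uncorrelated errors, and made-up numbers. The function names `gls_scalar_analysis` and `cost` are hypothetical. The background is treated as one extra "observation", and the weighted least-squares estimate is compared with the minimum of the corresponding quadratic cost function.

```python
def gls_scalar_analysis(y, r, xb, b):
    """Weighted least-squares estimate of a scalar state (toy example).

    y  : list of direct observations of the state
    r  : list of observation error variances (errors assumed uncorrelated)
    xb : background (prior) value
    b  : background error variance
    """
    # Treat the background as one additional observation with variance b.
    values = list(y) + [xb]
    precisions = [1.0 / ri for ri in r] + [1.0 / b]
    total = sum(precisions)
    # Each datum is weighted by its share of the total precision.
    weights = [p / total for p in precisions]
    xa = sum(w * v for w, v in zip(weights, values))
    return xa, weights


def cost(x, y, r, xb, b):
    """Quadratic cost J(x): observation term plus background term."""
    jo = sum((yi - x) ** 2 / ri for yi, ri in zip(y, r))
    jb = (x - xb) ** 2 / b
    return 0.5 * (jo + jb)


# Illustrative (invented) numbers.
y = [1.2, 0.8, 1.1]
r = [0.5, 0.5, 1.0]
xb, b = 0.0, 1.0

xa, weights = gls_scalar_analysis(y, r, xb, b)
print("analysis:", xa)
print("weights on observations:", weights[:-1])
print("weight on background:", weights[-1])

# The weighted estimate should sit at the minimum of the cost function.
eps = 1e-3
print("J is minimal at xa:",
      cost(xa, y, r, xb, b) <= min(cost(xa - eps, y, r, xb, b),
                                   cost(xa + eps, y, r, xb, b)))
```

In this toy setting the weights sum to one, so the analysis is a weighted average of the observations and the background, with more accurate data receiving larger weights.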
The structure of many regression data sets makes effective diagnosis and
fitting a delicate matter. In robust (resistant) regression, one specific issue is to
provide protection against distortion by anomalous data. In fact, a single unusual
observation can heavily distort the results of ordinary (non-robust) LS regression
(Hoaglin et al. 1982). Unusual or influential data points are not necessarily bad
data points: they may contain some of the most useful sample information. For
practical data analysis, it helps to judge such effects quantitatively. A convenient
diagnostic measures the effect of a (small) change in the observation $y_i$ on the
corresponding predicted (estimated) value $\hat{y}_i$. In LS regression this involves a
straightforward calculation: any change in $y_i$ has a proportional impact on $\hat{y}_i$. The
desired information is available in the diagonal of the hat matrix (Velleman and
Welsch 1981), which gives the estimated values $\hat{y}_i$ as a linear combination of the
observed values $y_i$. The term hat matrix was introduced by J.W. Tukey (Tukey 1972)
because the matrix maps the observation vector $y$ into $\hat{y}$, but it is also referred to as