Information Technology Reference
In-Depth Information
The prediction error is larger when the predictor data are far from their
calibration-period means, and vice versa. For simple linear regression, the
standard error of the estimate s e and standard error of prediction s y* are
related as follows:
(
) +-
n
+
1
Â
n
(
)
2
(
)
2
s
* =
s
xx
xx
-
y
e
p
i
n
i
=
1
where n is the number of observations and x i is the ith value of the predic-
tor in the calibration sample, and x p is the value of the predictor used for
the prediction.
The relation between s y* and s e is easily generalized to the multivariate
case. In matrix terms, if Y = AX + E and y * = AX p , then s y* = s e {1 +
x T p ( X T X ) -1 x p }.
This equation is only applicable if the vector of predictors lies inside the
multivariate cluster of observations on which the model was based. An
important question is how “different” can the predictor data be from
the values observed in the calibration period before the predictions are
considered invalid.
LONG-TERM STABILITY
Time is a hidden dimension in most economic models. Many an airline
has discovered to its detriment that what was an optimal price today leads
to half-filled planes and markedly reduced profits tomorrow. A careful
reading of the newspapers lets them know a competitor has slashed prices,
but more advanced algorithms are needed to detect a slow shifting in
tastes of prospective passengers. The public, tired of being treated no
better than hogs, 2 turns to trains, personal automobiles, and
teleconferencing.
An army base, used to a slow seasonal turnover in recruits, suddenly
finds that all infirmary beds are occupied and the morning lineup for sick
call stretches the length of a barracks.
To avoid a pound of cure:
Treat every model as tentative, best described, as any lawyer will
advise you, as subject to change without notice.
Monitor continuously.
2 Or somewhat worse, because hogs generally have a higher percentage of fresh air to
breathe.
Search WWH ::




Custom Search