extremely small values. One can think of many examples from physics such
as Boyle's Law, which fails at high pressures, and particle symmetries that
are broken as the temperature falls. In medicine, radioimmunoassay fails
to deliver reliable readings at very low dilutions, and for virtually every
drug the proportion of nonresponders increases as the dosage drops. In
fact, almost every measuring device (electrical, electronic, mechanical,
or biological) is reliable only in the central portion of its scale.
We need to recognize that while a regression equation may be used for
interpolation within the range of measured values, we are on shaky ground
if we try to extrapolate, to make predictions for conditions not previously
investigated. The solution is to know the range of application and to
recognize, even if we do not exactly know the range, that our equations
will be applicable to some but not all possibilities.
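As a rough illustration of this danger, the following sketch fits a straight line to simulated data over a limited range and then evaluates the line both inside and far outside that range. The data-generating curve (a response that levels off), the range limits, and all names in the code are assumptions made for this example; none of them come from the text.

```python
import numpy as np

def true_response(x):
    # Hypothetical relationship (an assumption): rises with x but levels off toward 10.
    return 10.0 * x / (x + 5.0)

# "Measured" range: x between 1 and 4 only.
x_obs = np.linspace(1.0, 4.0, 20)
y_obs = true_response(x_obs)  # noise-free, to keep the illustration deterministic

# Least-squares straight line fitted on the measured range.
# np.polyfit returns coefficients from the highest degree down: slope, then intercept.
slope, intercept = np.polyfit(x_obs, y_obs, deg=1)

def predict(x):
    return intercept + slope * x

# Interpolation: a point inside the measured range.
err_inside = abs(predict(2.5) - true_response(2.5))
# Extrapolation: a point far outside the measured range.
err_outside = abs(predict(40.0) - true_response(40.0))

print(f"absolute error at x = 2.5 (inside range):  {err_inside:.3f}")
print(f"absolute error at x = 40  (outside range): {err_outside:.3f}")
```

Under these assumptions the line tracks the curve closely where data were collected and misses badly where they were not, even though nothing in the fit itself signals the problem.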
Ambiguous Relationships
Think why rather than what.
The exact nature of the formula connecting two variables cannot be
determined by statistical methods alone. If a linear relationship exists
between two variables X and Y, then a linear relationship also exists
between Y and any monotone (nondecreasing or nonincreasing) function of X.
Assume that X can only take positive values. If we can fit
Model I: Y = a + bX + e
to the data, we also can fit
Model II: Y = a′ + b′ log[X] + e
and
Model III: Y = a″ + b″X + γX² + e.
It can be very difficult to determine which model, if any, is the
“correct” one in either a predictive or mechanistic sense.
A graph of Model I is a straight line (see Figure 9.1). Because Y
includes a stochastic or random component e, the pairs of observations
(x₁, y₁), (x₂, y₂), ... will not fall exactly on this line but above and below
it. The function log[ X ] does not increase as rapidly as X does; when we fit
Model II to these same pairs of observations, its graph rises above that of
Model I for small values of X and falls below that of Model I for large
values. Depending on the set of observations, Model II may give just as
good a fit to the data as Model I.
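The comparison can be tried numerically. The sketch below fits Models I, II, and III by ordinary least squares to one simulated data set and prints the residual sum of squares (RSS) for each. The simulated data, the noise level, the random seed, and the helper name fit_rss are all assumptions made for this illustration, and RSS is only one possible measure of fit.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data (an assumption): positive X, a linear trend plus random error e.
x = np.linspace(1.0, 10.0, 30)
y = 2.0 + 0.5 * x + rng.normal(scale=0.5, size=x.size)

def fit_rss(design, response):
    """Ordinary least squares; returns (coefficients, residual sum of squares)."""
    coef, *_ = np.linalg.lstsq(design, response, rcond=None)
    resid = response - design @ coef
    return coef, float(resid @ resid)

ones = np.ones_like(x)

# Model I:   Y = a + bX + e
coef_1, rss_1 = fit_rss(np.column_stack([ones, x]), y)
# Model II:  Y = a' + b' log[X] + e
coef_2, rss_2 = fit_rss(np.column_stack([ones, np.log(x)]), y)
# Model III: Y = a'' + b''X + gamma X^2 + e
coef_3, rss_3 = fit_rss(np.column_stack([ones, x, x**2]), y)

for name, coef, rss in [("Model I", coef_1, rss_1),
                        ("Model II", coef_2, rss_2),
                        ("Model III", coef_3, rss_3)]:
    print(f"{name}: coefficients = {np.round(coef, 3)}, RSS = {rss:.3f}")
```

Because Model I is the special case of Model III with γ = 0, the least-squares RSS for Model III can never exceed that for Model I. A smaller RSS alone is therefore not evidence that Model III is the “correct” model.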
How Model III behaves will depend upon whether b″ and a″ are both
positive or whether one is positive and the other negative. If b″ and a″
are both positive, then the graph of Model III will lie below the graph of
Model I for small positive values of X and above it for large values. If b″
is positive and a″ is negative, then Model III will behave more like Model
II. Thus Model III is more flexible than either Model I or Model II and can
usually be made to give a better fit to the data, that is, to minimize some