In this case, the correlation coefficient is very close to unity. Since the regression model is
fundamentally sound, we can say that the experimental data are very good.
If the regression is carried out by minimizing $\delta^2$ instead, the definition of R is not as
straightforward. Using Eqn (7.27) hides the fact that the data contain a relative error. Nor
should one simply replace the variance with $\delta^2$ in Eqn (7.27). An alternative definition
is to normalize the total variation with the mean value of the $y_i$'s. That is,
$$R' = \sqrt{1 - \frac{\delta^2}{\dfrac{n \sum_{i=1}^{n} y_i^2}{\left(\sum_{i=1}^{n} y_i\right)^2} - 1}} \tag{7.29}$$
The correlation coefficient defined by Eqn (7.29) should give a better estimate of the quality of
the fit for data containing a relative error.
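As a rough illustration of the idea behind Eqn (7.29), the sketch below computes a correlation coefficient in which the total variation of the data is normalized by the mean of the $y_i$'s. It rests on two assumptions that are not stated in this section: that $\delta^2$ is the mean squared relative residual, $\frac{1}{n}\sum (1 - f_i/y_i)^2$, and that the normalized total variation is $n\sum y_i^2/(\sum y_i)^2 - 1$. The function name `relative_correlation` and the sample numbers are hypothetical.

```python
import math


def relative_correlation(y, f):
    """Correlation coefficient R' for data with relative error (a sketch).

    Assumes delta^2 is the mean squared relative residual and that the
    total variation of y is normalized by the squared mean of y.
    """
    n = len(y)
    if n != len(f) or n < 2:
        raise ValueError("y and f must have the same length (at least 2)")
    if any(yi == 0 for yi in y):
        raise ValueError("relative residuals are undefined when some y_i is zero")

    # Assumed definition: mean squared relative deviation of the model from the data.
    delta_sq = sum(((yi - fi) / yi) ** 2 for yi, fi in zip(y, f)) / n

    # Total variation normalized by the mean:
    # sum((y_i - ybar)^2) / (n * ybar^2) = n * sum(y_i^2) / (sum(y_i))^2 - 1
    sum_y = sum(y)
    if sum_y == 0:
        raise ValueError("the mean of y is zero, so the normalization is undefined")
    sum_y_sq = sum(yi ** 2 for yi in y)
    normalized_variation = n * sum_y_sq / sum_y ** 2 - 1.0
    if normalized_variation <= 0.0:
        raise ValueError("y has no variation, so R' is undefined")

    arg = 1.0 - delta_sq / normalized_variation
    if arg < 0.0:
        raise ValueError("relative scatter exceeds the normalized variation of y")
    return math.sqrt(arg)


# Hypothetical data: the model overpredicts every point by 10 %.
y_data = [1.0, 2.0, 3.0, 4.0]
f_model = [1.1, 2.2, 3.3, 4.4]
r_prime = relative_correlation(y_data, f_model)
print(f"R' = {r_prime:.4f}")
```

If the text defines $\delta^2$ with a different divisor (for example, the degrees of freedom rather than n), only the `delta_sq` line would need to change.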
7.5. COMMON ABUSES OF REGRESSION
Correlation/regression analysis is widely used and frequently misused; several common
abuses of regression are briefly mentioned here. The least forgivable misuse is sloppiness
in selecting variables for correlation. Care should be taken in selecting the variables with
which to construct regression equations and in determining the form of the model. It is
possible to develop statistical relationships among variables that are completely unrelated
in a practical sense. For example, we might attempt to relate the room temperature to the
number of boxes of computer paper used in a lab. A straight line may even appear to provide
a good fit to the data, but the relationship is an unreasonable one on which to rely. A strong
observed association between variables does not necessarily imply that a causal relationship
exists between those variables. Designed experiments are the only way to determine causal
relationships. Whenever possible, fundamentally sound relationships should be sought, so that
only parametric estimation needs to be applied. Because of the unknown nature of the experimental
error, a slightly smaller variance cannot be used as a criterion either to endorse or to reject
a regression model.
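The room temperature and computer paper example can be made concrete with a small sketch. The numbers below are invented purely for illustration (ten hypothetical weekly readings that both happen to drift upward over time); they are not measurements from any source. The point is only that a high correlation coefficient comes out even though neither quantity has any sensible influence on the other.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical weekly records: both series simply increase over the same
# ten weeks, for unrelated reasons (season, workload).
room_temp_c = [18.2, 18.9, 19.4, 20.1, 20.3, 21.0, 21.6, 21.9, 22.7, 23.0]
paper_boxes = [2, 3, 3, 4, 4, 5, 5, 6, 6, 7]

r = pearson_r(room_temp_c, paper_boxes)
print(f"correlation between room temperature and paper use: r = {r:.3f}")
```

A straight line through these points would look convincing, yet it would be unreasonable to rely on it to predict paper consumption from a thermostat reading.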
The most common misuse of parametric estimation/regression is failing to minimize the error
variance that is appropriate to the data. It is common not to ask whether the experimental
data were obtained with an absolute certainty (an absolute error is present, or the data are
accurate to within $\varepsilon$) or a relative certainty (a relative error is present, or the
data are accurate to within $\delta'$%). Linearization of the regression model is commonly
applied irrespective of the error structure. The most common excuse for knowingly taking this
incorrect step is that only with linear regression could the accompanying numerical problem
be solved in a reasonable time frame. While this was a problem in the computer stone age, it
can no longer be used as an excuse today. In engineering, on the other hand, we often
consider the linear regression model because a straight line is visually pleasing when data are
scattered around it.
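To see why the error structure matters, consider a minimal sketch with a hypothetical one-parameter model, y = a x, fitted to invented data in two ways: once by minimizing the sum of squared absolute residuals, and once by minimizing the sum of squared relative residuals. Both minimizers have closed forms here, so no numerical optimizer is needed. The model, the data, and the function names are assumptions chosen for illustration only.

```python
def fit_slope_absolute(x, y):
    """Slope a of y = a*x that minimizes sum((y_i - a*x_i)^2)."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi ** 2 for xi in x)


def fit_slope_relative(x, y):
    """Slope a of y = a*x that minimizes sum(((y_i - a*x_i) / y_i)^2)."""
    # With r_i = x_i / y_i the objective is sum((1 - a*r_i)^2),
    # whose minimum is at a = sum(r_i) / sum(r_i^2).
    ratios = [xi / yi for xi, yi in zip(x, y)]
    return sum(ratios) / sum(r ** 2 for r in ratios)


def sse_absolute(a, x, y):
    """Sum of squared absolute residuals for slope a."""
    return sum((yi - a * xi) ** 2 for xi, yi in zip(x, y))


def sse_relative(a, x, y):
    """Sum of squared relative residuals for slope a."""
    return sum(((yi - a * xi) / yi) ** 2 for xi, yi in zip(x, y))


# Hypothetical measurements (invented for illustration).
x_data = [1.0, 2.0, 3.0, 4.0, 5.0]
y_data = [1.1, 1.9, 3.2, 3.9, 5.3]

a_abs = fit_slope_absolute(x_data, y_data)
a_rel = fit_slope_relative(x_data, y_data)
print(f"absolute-error fit: a = {a_abs:.4f}")
print(f"relative-error fit: a = {a_rel:.4f}")
```

The two criteria give different parameter estimates from the same data, and each estimate is optimal only under its own error assumption, which is why the error structure should be settled before the objective function is chosen.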