of the significant variables are (almost) constant. (Alternatively,
gather additional data for which the remainder of the significant
variables are almost constant.) Decide on a generalized linear
model form which best fits your knowledge of the causal relations
among the few variables on which you are now focusing. (A standard
multivariate linear regression may be viewed as just another
form, albeit a particularly straightforward one, of generalized
linear model.) Fit this model to the data.
5. Select a second subset of the existing data (or gather an additional
data set) for which the remainder of the significant variables are
(almost) equal to a second constant. For example, if only men
were considered at stage four, then you should focus on women at
this stage. Attempt to fit the model you derived at the preceding
stage to these data.
6. By comparing the results obtained at stages four and five, you can
determine whether to continue to ignore or to include variables
previously excluded from the model. Only one or two additional
variables should be added to the model at each iteration of steps 4
through 6.
7. Always validate your results as described in the next chapter.
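Steps 4 through 6 can be sketched in a few lines of Python. The sketch is illustrative only: the data are synthetic, the variable names (dose, sex) and the factor-of-two threshold used in step 6 are assumptions made for the example, and a simple least-squares line stands in for whatever generalized linear model your causal knowledge suggests.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def rmse(xs, ys, a, b):
    """Root-mean-square prediction error of the line (a, b) on (xs, ys)."""
    sq = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return (sq / len(xs)) ** 0.5

def subset(rows, sex):
    """Select the records for which the held-constant variable equals `sex`."""
    xs = [dose for s, dose, _ in rows if s == sex]
    ys = [y for s, _, y in rows if s == sex]
    return xs, ys

# Synthetic data (assumed for illustration): the response rises with dose,
# and the intercept is shifted by 3.0 for women.
rng = random.Random(0)
rows = []
for _ in range(200):
    sex = rng.choice(["M", "F"])
    dose = rng.uniform(0, 10)
    y = 2.0 + 1.5 * dose + (3.0 if sex == "F" else 0.0) + rng.gauss(0, 1)
    rows.append((sex, dose, y))

# Step 4: hold sex constant (men only) and fit the model.
xm, ym = subset(rows, "M")
a_m, b_m = fit_line(xm, ym)

# Step 5: apply the step-4 model, unchanged, to the second subset (women).
xw, yw = subset(rows, "F")
err_men = rmse(xm, ym, a_m, b_m)
err_women = rmse(xw, yw, a_m, b_m)

# Step 6: a much larger error on the second subset suggests that the
# variable held constant belongs in the model. The factor 2 is arbitrary.
needs_sex_term = err_women > 2 * err_men
print(f"RMSE men: {err_men:.2f}  RMSE women: {err_women:.2f}")
print("Add sex to the model?", needs_sex_term)
```

In practice the comparison at step 6 would rest on a formal test or on subject-matter judgment rather than on a fixed ratio of errors.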
If all this sounds like a lot of work, it is. It takes several years to develop
sound models, even with lightning-fast, multifunction statistical software
available. The most common error in statistics is to assume
that statistical procedures can take the place of sustained effort.
TO LEARN MORE
Inflation of R² as a consequence of multiple tests was also considered by
Rencher [1980].
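The inflation is easy to reproduce by simulation. In the sketch below (an illustration under assumed settings, not Rencher's procedure), the response and 100 candidate predictors are all independent noise, so the true R² is zero for every candidate. Selecting the best of the 100 nevertheless tends to report an R² well above that of a typical single candidate.

```python
import random

def r_squared(xs, ys):
    """Squared Pearson correlation: the R^2 of a simple regression of ys on xs."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy * sxy / (sxx * syy)

rng = random.Random(42)
n_obs, n_candidates = 30, 100  # assumed sizes, chosen only for illustration

# The response is pure noise, unrelated to any predictor.
y = [rng.gauss(0, 1) for _ in range(n_obs)]

# Test each noise predictor in turn and record its R^2.
r2_values = []
for _ in range(n_candidates):
    x = [rng.gauss(0, 1) for _ in range(n_obs)]
    r2_values.append(r_squared(x, y))

mean_r2 = sum(r2_values) / n_candidates  # what one honest test would show
best_r2 = max(r2_values)                 # what is reported after selection
print(f"mean R^2 over candidates: {mean_r2:.3f}")
print(f"R^2 of the selected 'best' predictor: {best_r2:.3f}")
```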
Osborne and Waters [2002] review tests of the assumptions of multivariable
regression. Harrell, Lee, and Mark [1996] review the effect of
violation of assumptions on GLMs and suggest the use of the bootstrap
for model validation. Hosmer and Lemeshow [2001] recommend the use
of the bootstrap or some other validation procedure before accepting the
results of a logistic regression.
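One common bootstrap scheme for validation estimates the "optimism" of an apparent fit statistic: refit the model on each bootstrap resample, then compare its fit on the resample with its fit on the original data. The sketch below assumes this optimism-correction scheme, synthetic data, and a simple least-squares line in place of a logistic or other generalized linear model; it is not taken from the works cited above.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def r_squared(xs, ys, a, b):
    """R^2 of the fixed line (a, b) evaluated on the data (xs, ys)."""
    my = sum(ys) / len(ys)
    ss_tot = sum((y - my) ** 2 for y in ys)
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return 1.0 - ss_res / ss_tot

# Synthetic data (assumed for illustration).
rng = random.Random(1)
n = 40
xs = [rng.uniform(0, 10) for _ in range(n)]
ys = [1.0 + 0.5 * x + rng.gauss(0, 2) for x in xs]

# Apparent performance: fit and evaluate on the same data.
a, b = fit_line(xs, ys)
apparent = r_squared(xs, ys, a, b)

# Optimism: how much better a refitted model looks on its own resample
# than on the original data, averaged over bootstrap resamples.
optimism = []
for _ in range(200):
    idx = [rng.randrange(n) for _ in range(n)]
    bx = [xs[i] for i in idx]
    by = [ys[i] for i in idx]
    a_b, b_b = fit_line(bx, by)
    optimism.append(r_squared(bx, by, a_b, b_b) - r_squared(xs, ys, a_b, b_b))

mean_optimism = sum(optimism) / len(optimism)
corrected = apparent - mean_optimism
print(f"apparent R^2: {apparent:.3f}  optimism-corrected R^2: {corrected:.3f}")
```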
Diagnostic procedures for use in determining an appropriate functional
form are described by Mosteller and Tukey [1977], Therneau and
Grambsch [2000], Hosmer and Lemeshow [2001], and Hardin and Hilbe
[2003].