hood argument, the within-subject correlation is introduced in the
estimating equation itself. The correlation parameters are then nuisance
parameters and can be estimated separately. (See also Hardin and Hilbe,
2003.)
Underlying the population-averaged GEE is the assumption that one is
able to specify the correct correlation structure. If one hypothesizes an
exchangeable correlation and the true correlation is time-dependent, the
resulting regression coefficient estimator is inefficient. The naive variance
estimates of the regression coefficients will then produce incorrect
confidence intervals. Analysts specify a correlation structure to gain
efficiency in the estimation of the regression coefficients, but typically
calculate the sandwich estimate of variance to protect against
misspecification of the correlation.³ The sandwich estimator is more variable
than the naive variance estimator, and many analysts do not pay adequate
attention to the fact that its asymptotic properties depend on the number of
subjects (not the total number of observations).
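To make the contrast between the two variance estimates concrete, here is a minimal sketch for the simplest possible case: estimating a single mean from repeated measures under an independence working correlation. The function name and the toy data are our own illustrative assumptions, not taken from Hardin and Hilbe; a real analysis would rely on a GEE routine in a statistics package.

```python
def naive_and_sandwich_variance(clusters):
    """Variance of the pooled mean under an independence working correlation.

    clusters: list of lists, one inner list of repeated measurements per subject.
    Returns (estimate, naive_variance, sandwich_variance).
    """
    values = [y for subject in clusters for y in subject]
    n_obs = len(values)
    estimate = sum(values) / n_obs

    # Naive (model-based) variance: treats all n_obs observations as independent.
    residual_ss = sum((y - estimate) ** 2 for y in values)
    naive = residual_ss / (n_obs - 1) / n_obs

    # Sandwich variance: bread = 1 / n_obs, meat = sum over subjects of the
    # squared within-subject residual total, so any correlation inside a
    # subject is picked up empirically rather than assumed away.
    meat = sum(sum(y - estimate for y in subject) ** 2 for subject in clusters)
    sandwich = meat / n_obs ** 2
    return estimate, naive, sandwich


# Hypothetical data: two subjects whose repeated measures agree perfectly.
estimate, naive, sandwich = naive_and_sandwich_variance([[1, 1, 1], [-1, -1, -1]])
print(estimate, naive, sandwich)
```

With these toy data the sandwich variance is larger than the naive one, because the six observations carry little more information than two. Note also that the "meat" is a sum over subjects: with only a handful of subjects it is itself poorly estimated, however many observations each subject contributes.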
HLM. This class includes hierarchical linear models, linear latent models,
and others. Whereas the previous models are limited for the most part to a
single random effect, HLM allows more than one. Unfortunately, most
commercially available software requires one to assume that each random
effect is Gaussian with mean zero. The variance of each random effect must
be estimated.
Mixed Models. These allow both linear and nonlinear mixed-effects
regression (with various links). They allow you to specify each level of
repeated measures. Imagine a hierarchy of districts, schools, teachers,
classes, and students. In this description, each sublevel is nested within
the previous level, and we can hypothesize a fixed or random effect for each
level. We also imagine that observations within the same level (any of these
specific levels) are correlated.
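A small sketch may help show why observations that share a level are correlated. It assumes the simplest case, an independent random intercept at every level, and the variance values are hypothetical; the function name is ours and does not correspond to any package.

```python
def implied_correlation(variance_components, shared_levels):
    """Correlation between two observations in a nested random-intercept model.

    variance_components: variances ordered from the outermost level inward,
        with the residual (student-level) variance last.
    shared_levels: how many of the outermost levels the two observations share.
    """
    if not 0 <= shared_levels < len(variance_components):
        raise ValueError("shared_levels must leave the residual level unshared")
    total = sum(variance_components)
    # Two observations covary only through the random effects they share.
    shared = sum(variance_components[:shared_levels])
    return shared / total


# Hypothetical variances for district, school, and student (residual) levels.
components = [1.0, 2.0, 1.0]
same_school = implied_correlation(components, 2)    # same district and school
same_district = implied_correlation(components, 1)  # same district, different schools
print(same_school, same_district)  # prints 0.75 0.25
```

Under these assumptions, the deeper the level two students share, the more strongly their responses are correlated; students in different districts are uncorrelated.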
The caveats revealed in this and the previous chapter apply to the
GLMs. The most common sources of error are the use of an inappropriate
or erroneous link function, the wrong choice of scale for an explanatory
variable (for example, using x rather than log[x]), neglecting important
variables, and the use of an inappropriate error distribution when
computing confidence intervals and p values. Firth [1991, pp. 74-77] should
be consulted for a more detailed analysis of potential problems.
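The cost of choosing the wrong scale for an explanatory variable is easy to see in a toy case. The sketch below uses ordinary least squares (the Gaussian, identity-link special case of a GLM) on hypothetical, noise-free data in which the response is linear in log[x]; the data and the function name are our own assumptions.

```python
import math


def least_squares_rss(x, y):
    """Fit y = a + b*x by ordinary least squares; return the residual sum of squares."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))


# Hypothetical, noise-free data in which the response is linear in log(x).
x = [1, 2, 4, 8, 16, 32, 64]
y = [2 + 3 * math.log(xi) for xi in x]

rss_raw = least_squares_rss(x, y)                           # predictor on the raw scale
rss_log = least_squares_rss([math.log(xi) for xi in x], y)  # predictor on the log scale
print(rss_raw, rss_log)
```

On the log scale the fit is exact up to rounding error, whereas on the raw scale a substantial residual sum of squares remains. With real, noisy data the difference is rarely so stark, which is why a plot of residuals against each candidate scale is worth the effort.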
REPORTING YOUR RESULTS
In reporting the results of your modeling efforts you need to be explicit
about the methods used, the assumptions made, the limitations on your
³ See Hardin and Hilbe [2003, p. 28] for a more detailed explanation.