where L_max is the likelihood for the data evaluated at the point where it is
maximized, and K is the number of estimated parameters for a model. The
AIC value is found for a range of plausible models, and the model selected
is the one for which AIC is a minimum. Models with many parameters are
penalized by the term 2K. To be selected, they must have a much higher
likelihood than models with a small number of parameters.
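The selection rule can be illustrated with a short sketch. It assumes the standard definition AIC = -2 ln(L_max) + 2K, which is consistent with the 2K penalty described above. The model names and log-likelihood values are hypothetical and chosen only for illustration.

```python
def aic(log_likelihood_max, num_params):
    """AIC = -2 ln(L_max) + 2K (standard definition, assumed here)."""
    return -2.0 * log_likelihood_max + 2.0 * num_params


def select_by_aic(candidates):
    """candidates maps a model name to (maximized log-likelihood, K).

    Returns the name of the model with the smallest AIC and all AIC values.
    """
    scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical models: the second fits slightly better but uses more parameters.
candidates = {
    "constant": (-120.0, 2),
    "time_varying": (-118.5, 6),
}
best, scores = select_by_aic(candidates)
print(best, scores)
```

In this made-up case the gain in log-likelihood for the larger model is too small to offset its penalty, so the simpler model is selected.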
8.6.6 Overdispersion
Overdispersion is a well-known phenomenon with count data. It occurs when
the variation in the data is more than can be expected on the basis of the model
being considered. With mark-recapture data, overdispersion can arise because
the recapture patterns are not independent for different animals or because
parameters that are assumed to be constant are really varying.
The simplest way to allow for overdispersion involves assuming that all
variances are increased by the same variance inflation factor (VIF) c, but
the assumed model is otherwise correct. Then, c can be estimated from the
data, and simple adjustments can be made to various types of analysis. In the
mark-recapture context, this can be done by taking the estimate
ĉ = X²/df    (8.26)
where X² is the overall test statistic obtained from the goodness-of-fit tests
TEST 2 and TEST 3 (described in the next section), with df degrees of freedom.
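As a minimal sketch of how Equation (8.26) might be applied, the following code estimates the VIF from a goodness-of-fit statistic and inflates a model-based variance by that factor. The function names and the numerical values (X² = 36 on 24 degrees of freedom, a variance of 0.0004) are hypothetical and are not taken from the text.

```python
import math


def estimate_vif(x2, df):
    """Equation (8.26): estimated VIF = X^2 / df."""
    if df <= 0:
        raise ValueError("degrees of freedom must be positive")
    return x2 / df


def inflate_variance(variance, c_hat):
    """Multiply a model-based variance by the estimated VIF."""
    return variance * c_hat


def inflate_se(se, c_hat):
    """Standard errors scale by the square root of the VIF."""
    return se * math.sqrt(c_hat)


# Hypothetical goodness-of-fit result and model-based variance.
c_hat = estimate_vif(x2=36.0, df=24)
adj_var = inflate_variance(0.0004, c_hat)
adj_se = inflate_se(0.02, c_hat)
print(c_hat, adj_var, adj_se)
```

Because every variance is multiplied by the same factor, confidence intervals widen by the square root of the estimated VIF while the point estimates are unchanged.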
An explanation for why the use of a VIF may be effective is based on the
idea of what would happen if a data set was artificially increased in size
by cloning. If the mark-recapture pattern of each animal is entered into the
data set twice instead of once, then all of the variances of parameters will be
halved and ĉ will be doubled in comparison with what is obtained for the
original data set. More generally, if each animal is entered into the data set R
times, then all of the variances will be divided by R and ĉ will be multiplied
by R in comparison with what is obtained from the original data set. The
VIF can therefore be interpreted as the factor by which the data set is
larger than it would be if all animals provided independent results. The use
of Equation (8.26) to estimate the VIF is justified by the fact that the
expected value of ĉ is approximately one if the assumptions of the fitted
model are correct and all animals provide independent data. Therefore, if the
assumptions of the model are correct but the animals do not provide
independent data, then the expected value of ĉ is approximately c.
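The cloning argument can be checked numerically with a small sketch. It uses a Pearson chi-squared statistic on category counts with fixed model probabilities as a simplified stand-in for the mark-recapture goodness-of-fit tests. The counts, probabilities, and function names are hypothetical.

```python
def pearson_x2(observed, probs):
    """Pearson statistic: sum over categories of (O - E)^2 / E, with E = n * p."""
    n = sum(observed)
    return sum((o - n * p) ** 2 / (n * p) for o, p in zip(observed, probs))


def clone(observed, r):
    """Enter every animal into the data set r times."""
    return [o * r for o in observed]


def proportion_variance(count, n):
    """Binomial variance of an estimated proportion, p(1 - p) / n."""
    p = count / n
    return p * (1.0 - p) / n


observed = [30, 50, 20]      # hypothetical counts in three categories
probs = [0.25, 0.5, 0.25]    # hypothetical model probabilities
df = len(observed) - 1       # no parameters are estimated in this toy model
R = 3

cloned = clone(observed, R)
c_hat_original = pearson_x2(observed, probs) / df
c_hat_cloned = pearson_x2(cloned, probs) / df
var_original = proportion_variance(observed[0], sum(observed))
var_cloned = proportion_variance(cloned[0], sum(cloned))

print(c_hat_cloned / c_hat_original)   # ratio of the VIF estimates
print(var_original / var_cloned)       # ratio of the nominal variances
```

Both ratios equal R: the nominal variance shrinks by the cloning factor while the estimated VIF grows by the same factor, so multiplying the nominal variance by the estimated VIF gives the same value before and after cloning.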
In reality, the reasons for overdispersion will usually be much more
complicated than just the duplication of the results for individual animals.
Nevertheless, it can be hoped that use of ĉ will be effective in allowing for
heterogeneity in data in cases for which this is needed.