The Akaike information criterion (AIC) is used as a tool for model selection. For a given data set, several competing models may be ranked according to their AIC, with the one having the lowest AIC being the best.
The general expression of AIC is

$$\mathrm{AIC} = 2k - 2\ln(L) \qquad (3.35)$$
where k is the number of parameters in the statistical model, and L is the maximized
value of the likelihood function for the estimated model.
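As a quick illustration, Eq. 3.35 can be evaluated directly once the maximized likelihood is known. The following is a minimal sketch; the log-likelihood value and parameter count are hypothetical:

def aic_from_loglik(log_likelihood, k):
    """AIC = 2k - 2 ln(L), evaluated from the maximized log-likelihood."""
    return 2 * k - 2 * log_likelihood

# Hypothetical example: a 3-parameter model whose maximized
# log-likelihood is -120.5
print(aic_from_loglik(-120.5, k=3))   # -> 247.0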
Let n be the number of observations, let RSS be the residual sum of squares, and assume the model errors are normally and independently distributed. Then

$$\mathrm{RSS} = \sum_{i=1}^{n} e_i^2 \qquad (3.36)$$
If we further assume that the variance of the model errors is unknown but common to all of them, then maximizing the likelihood with respect to this variance gives

$$\mathrm{AIC} = 2k + n\left[\ln\!\left(\frac{2\pi\,\mathrm{RSS}}{n}\right) + 1\right] \qquad (3.37)$$
where k is the number of parameters in the statistical model, n is the number of
observations, and RSS is the residual sum of squares.
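A minimal sketch of Eq. 3.37 in Python (the residuals below are hypothetical; only numpy is assumed):

import numpy as np

def aic_from_rss(rss, n, k):
    """Eq. 3.37 for normally distributed errors:
    AIC = 2k + n [ln(2*pi*RSS/n) + 1]."""
    return 2 * k + n * (np.log(2 * np.pi * rss / n) + 1)

# Hypothetical residuals e_i from a fitted 2-parameter model
residuals = np.array([0.3, -0.1, 0.4, -0.2, 0.1])
rss = np.sum(residuals ** 2)            # Eq. 3.36
print(aic_from_rss(rss, n=residuals.size, k=2))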
Increasing the number of free parameters to be estimated always improves the quality of fit, regardless of the number of free parameters in the data-generating process. Hence, AIC not only rewards quality of fit but also includes a penalty that is an increasing function of the number of estimated parameters. This penalty discourages overfitting. The preferred model is the one with the lowest AIC value. The AIC methodology attempts to find the model that best explains the data with a minimum of free parameters; by contrast, more traditional approaches to modeling start from a null hypothesis. The AIC penalizes free parameters less strongly than the Schwarz criterion does. A sketch of this selection procedure is given below.
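To make the selection procedure concrete, the sketch below fits polynomials of increasing degree to the same (hypothetical, synthetic) data set, scores each candidate with Eq. 3.37, and keeps the one with the lowest AIC. Only numpy is assumed:

import numpy as np

def aic_from_rss(rss, n, k):
    # Eq. 3.37: AIC = 2k + n [ln(2*pi*RSS/n) + 1]
    return 2 * k + n * (np.log(2 * np.pi * rss / n) + 1)

rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
y = 1.5 * x + 2.0 + rng.normal(scale=1.0, size=x.size)   # hypothetical data

aic_scores = {}
for degree in (1, 2, 3, 4):
    coeffs = np.polyfit(x, y, degree)                  # fit a candidate model
    rss = np.sum((y - np.polyval(coeffs, x)) ** 2)     # residual sum of squares
    k = degree + 1                                     # number of fitted coefficients
    aic_scores[degree] = aic_from_rss(rss, x.size, k)

best_degree = min(aic_scores, key=aic_scores.get)      # lowest AIC is preferred
print(aic_scores, "-> preferred degree:", best_degree)

Because the data are generated from a straight line, the quadratic and higher-order fits reduce the RSS only slightly, and the parameter penalty typically leaves degree 1 with the lowest AIC.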
Kennedy [44] defines AIC as

$$\mathrm{AIC} = \ln(\mathrm{SSE}/n) + 2k/n \qquad (3.38)$$
where k is the number of regressors in the model, n is the sample size (number of observations), and SSE is the sum of squared residuals.
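A minimal sketch of Kennedy's per-observation form (Eq. 3.38); the numbers are hypothetical:

import math

def aic_kennedy(sse, n, k):
    """Kennedy's form, Eq. 3.38: AIC = ln(SSE/n) + 2k/n."""
    return math.log(sse / n) + 2 * k / n

# Hypothetical regression: 3 regressors, 40 observations, SSE = 12.8
print(aic_kennedy(sse=12.8, n=40, k=3))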
The usefulness of the training data increases as the number of parameters in the model increases, but too many parameters may lead to an overtraining problem. To overcome this problem, one can use the BIC (a parametric method), which is a statistical criterion for model selection.
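For comparison, here is a minimal sketch of the BIC under the same normal-error assumption. The RSS-based form BIC = n ln(RSS/n) + k ln(n) is a standard simplification and is not taken from the text above:

import math

def bic_from_rss(rss, n, k):
    """BIC = n ln(RSS/n) + k ln(n); the ln(n) factor penalizes extra
    parameters more heavily than AIC's constant factor of 2."""
    return n * math.log(rss / n) + k * math.log(n)

# Hypothetical values: 5 observations, 2 fitted parameters, RSS = 0.31
print(bic_from_rss(rss=0.31, n=5, k=2))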