The BIC is an asymptotic result derived under the assumption that the data distribution belongs to the exponential family. It is given by

$$\mathrm{BIC} = n \ln\left(\frac{\mathrm{RSS}}{n}\right) + K \ln(n) \qquad (3.39)$$

where RSS is the residual sum of squares, n is the number of observations, and K is the number of free parameters to be estimated. Hence, a lower BIC implies either fewer explanatory variables, a better fit, or both.
The BIC penalizes free parameters more strongly than does the Akaike information criterion (AIC). AIC and BIC values are used here for model input selection, since the more inputs a model has, the more parameters it has.
Kennedy [44] defines the BIC as

$$\mathrm{BIC} = \ln\left(\frac{\mathrm{SSE}}{n}\right) + \frac{k \ln(n)}{n} \qquad (3.40)$$

where k is the number of regressors in the model, n is the sample size (number of observations), and SSE is the sum of squared residuals.
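As a rough illustration of how the two forms of the BIC quoted above can be used for input selection, the following Python sketch evaluates both and picks the candidate with the lowest value. The function names, the sample size, and the residual sums of squares are hypothetical values invented for the example. They are not results from this text.

```python
import math

def bic_total(rss, n, k):
    """BIC in the form of Eq. 3.39: n * ln(RSS / n) + K * ln(n)."""
    return n * math.log(rss / n) + k * math.log(n)

def bic_per_observation(sse, n, k):
    """BIC in the form of Eq. 3.40: ln(SSE / n) + k * ln(n) / n."""
    return math.log(sse / n) + k * math.log(n) / n

# Hypothetical candidates: (label, number of parameters, residual sum of squares).
n_obs = 100
candidates = [
    ("1 input", 2, 52.0),
    ("2 inputs", 3, 40.0),
    ("3 inputs", 4, 39.5),
]

# Score every candidate and keep the one with the lowest BIC.
scores = {label: bic_total(rss, n_obs, k) for label, k, rss in candidates}
best = min(scores, key=scores.get)
print(best, round(scores[best], 2))
```

With the same residual sum of squares and parameter count, the second form is the first divided by n, so both rank candidate models identically for a fixed sample size. In this made-up case the third input reduces the residuals only slightly, so the extra ln(n) penalty outweighs the gain and the two-input model is preferred.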
3.4 Implementation of Cluster Analysis
Cluster analysis is an investigative data analysis tool widely used for solving
classification problems in different scientific fields, including biology, statistics,
pattern recognition, astronomy, archaeology, medicine, chemistry, education, psychology,
hydrology, linguistics, sociology, machine learning, and data mining. It
works on the principle of degree of association: the degree of association is high
between members of the same cluster and low between members of different clusters.
A formal definition of a cluster, group, or class is difficult to give and often comes down to the
judgment of the user [8].
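The idea of high association within a cluster and low association between clusters can be sketched numerically. The Python example below assumes that association is measured by Euclidean distance (smaller distance meaning stronger association), which is only one of many possible choices. The two point sets and the helper names are hypothetical.

```python
import math
from itertools import combinations, product

def mean_within(points):
    """Mean pairwise Euclidean distance among members of one group."""
    pairs = list(combinations(points, 2))
    return sum(math.dist(p, q) for p, q in pairs) / len(pairs)

def mean_between(group_a, group_b):
    """Mean Euclidean distance over all pairs drawn from two different groups."""
    pairs = list(product(group_a, group_b))
    return sum(math.dist(p, q) for p, q in pairs) / len(pairs)

# Two hypothetical, well-separated groups of 2-D points.
cluster_a = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)]
cluster_b = [(5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]

within_a = mean_within(cluster_a)
within_b = mean_within(cluster_b)
between = mean_between(cluster_a, cluster_b)
print(within_a, within_b, between)
```

For a grouping that matches the principle described above, both within-group means should be clearly smaller than the between-group mean.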
Cormack [19] and Gordon [28] talk of internal cohesion and external isolation in
defining clusters. No single definition is suitable for all situations, since the nature of
the clusters can vary significantly. The term cluster analysis (CA) was coined by
Tryon [76] to cover a range of algorithms and approaches for
grouping similar objects into respective categories. The usefulness of
clustering depends on the goal of the data analysis, and several different notions of a
cluster have proved useful in practice. Cluster types can be classified
as well-separated, prototype-based, graph-based, density-based, and shared-property
based. Two-dimensional illustrations of these cluster types are shown in
Fig. 3.3. Reviews of the general categories of CA methods identify three
major kinds of clustering, namely hierarchical clustering, partitional clustering
(k-means clustering), and two-way clustering (co-clustering or biclustering).
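To make the partitional case concrete, here is a minimal Python sketch of k-means using Lloyd's iterations. It assumes Euclidean distance and initial centres supplied by the caller. The function name, the data, and the starting centres are hypothetical, and this is not presented as the implementation used in this text.

```python
import math

def kmeans(points, centres, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centre update until stable."""
    centres = [tuple(c) for c in centres]
    labels = []
    for _ in range(max_iter):
        # Assignment step: index of the nearest centre for each point.
        labels = [
            min(range(len(centres)), key=lambda j: math.dist(p, centres[j]))
            for p in points
        ]
        # Update step: move each centre to the mean of its assigned points.
        new_centres = []
        for j, old in enumerate(centres):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centres.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centres.append(old)  # an empty cluster keeps its old centre
        if new_centres == centres:
            break  # centres stopped moving, so the partition is stable
        centres = new_centres
    return labels, centres

# Hypothetical 2-D data with two obvious groups, and two starting centres.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
labels, centres = kmeans(points, centres=[(0.0, 0.0), (6.0, 5.0)])
print(labels, centres)
```

The number of clusters is fixed in advance by the number of starting centres, which is the main practical difference from hierarchical clustering, where a nested sequence of partitions is built and cut afterwards. Results of k-means generally depend on the starting centres, so several initialisations are commonly tried.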
 