not involve the possibility of misclassification of the new observation
into the 'wrong' class, whereas the probabilistic classification includes
the possibility of misclassification. This feature makes probabilistic
classification more general, less certain, and more interesting.
Let us consider the probabilistic classification approach for landmark
coordinate data. Suppose we have C different classes. The classes might
correspond, for example, to different taxonomic groups (families,
clades, species, etc.) of Early Eocene Notharctinae, each individual
represented by a fossilized first mandibular molar. Suppose further that
for each class a mean form and a variance exist representing the first
molar. Let us assume that the variance can be written in the Kronecker
product form: $V = \Sigma_K \otimes I_D$. Let $(M_i, \Sigma_{K,i})$
denote the mean and variance of the $i$-th class. Now, suppose we
discover an additional first molar in a museum drawer and suspect that
the individual represented by this tooth might belong to one of the
known classes. In order to classify this tooth, we collect landmark
coordinate data from surface features. The question is: to which of the
C classes does this individual most likely belong? Notice several
important features of this situation. First, we assume that the
individual has to come from one of the C classes and no other class.
Second, we assume that the mean and the variance for each class are
completely known a priori. The first assumption may be justified based
on the knowledge and expertise of the scientist. The second assumption
may be justified based on the estimated means and variances of the
previously classified fossil samples whose phylogenetic relationships
and class membership are confirmed. However, in practice one or both of
these assumptions may be violated.
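One way to make the question above concrete is to score the new specimen under each class's Gaussian model and pick the class with the largest likelihood. The following is a minimal sketch rather than the authors' procedure. It assumes, as a simplification not stated in the text, that the new specimen's K x D landmark matrix is already expressed in the same coordinate frame as the class mean forms, so no nuisance translation or rotation has to be handled. The function names (`log_likelihood`, `classify`) and the toy numbers are hypothetical.

```python
import numpy as np

def log_likelihood(X, M, Sigma_K):
    """Gaussian log-likelihood of a K x D landmark matrix X with mean form M,
    where the row-major flattening of X has covariance Sigma_K (x) I_D.
    Assumes X and M are already in a common coordinate frame."""
    K, D = X.shape
    R = X - M                                   # residual from the mean form
    L = np.linalg.cholesky(Sigma_K)             # Sigma_K = L L^T
    Z = np.linalg.solve(L, R)                   # sum(Z**2) = tr(R^T Sigma_K^{-1} R)
    log_det = 2.0 * np.sum(np.log(np.diag(L)))  # log|Sigma_K|
    # log|Sigma_K (x) I_D| = D * log|Sigma_K|
    return -0.5 * (K * D * np.log(2.0 * np.pi) + D * log_det + np.sum(Z ** 2))

def classify(X, classes):
    """Return the index of the class with the largest log-likelihood,
    together with the log-likelihood of every class."""
    scores = [log_likelihood(X, M, S) for M, S in classes]
    return int(np.argmax(scores)), scores

# Hypothetical toy data: K = 3 landmarks, D = 2 dimensions, C = 2 classes.
M1 = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
M2 = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]])
S1 = 0.05 * np.eye(3)
S2 = np.array([[0.05, 0.01, 0.00],
               [0.01, 0.05, 0.00],
               [0.00, 0.00, 0.08]])
classes = [(M1, S1), (M2, S2)]

# The "tooth from the museum drawer": a new specimen to be classified.
X_new = np.array([[0.1, -0.1], [1.9, 0.1], [0.0, 2.1]])
best, scores = classify(X_new, classes)
```

Here `best` is a zero-based class index. Working with log-likelihoods instead of raw densities avoids numerical underflow when the number of landmarks is large, and the Cholesky factor supplies both the determinant and the quadratic form without forming the full KD x KD covariance matrix.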
6.2 Methods of classification
1. Likelihood-based classification
Suppose we accept the two assumptions as reasonable. Suppose further
that there are two known classes. Now suppose, under the usual Gaussian
perturbation model, we calculate the probability that the new individual
belongs to the first class. Let us say, for example, this probability is
0.3. Similarly we calculate the probability that the new individual
belongs to the second class and find this probability to be 0.8.
Naturally we would assign that individual to the category that has the
higher probability; in this case we would