FORDISC 3.0 (FD3) (Jantz and Ousley, 2005). One part of properly utilizing FD3 is appreciating
what, exactly, FD3 is doing. FORDISC uses discriminant function analysis (DFA) to classify an
unknown individual into one of several reference populations. DFA is, by and large, the most
widely used classification statistic in forensic anthropology, particularly when the data are
continuous.
Giles and Elliot (1962, 1963) first applied DFA to American White, American Black, and
Amerindian¹⁸ crania to determine sex and race. Linear discriminant function analysis was
developed as a means to classify a target individual (e.g., an unknown cranium) into one of
several reference groups, using a mathematical approach similar to that of regression analysis
(Krzanowski, 2002). Whereas regression analysis uses a weighted combination of predictor
variables to estimate a value for some object (e.g., stature from measurements of the
postcranial skeleton), DFA uses a weighted combination of those predictor variables to classify
an unknown object into a reference group based on a distance statistic. The discriminant
function score is a derived variable (Krzanowski, 2002) equal to the weighted sum of the values
of each variable.
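To make the idea of a weighted sum concrete, the following is a minimal sketch of a two-group linear discriminant function for two variables. It is not FD3's implementation. The measurements are invented purely for illustration, the function names (pooled_cov, fit_discriminant, score, classify) are hypothetical, and equal group priors with a shared (pooled) covariance matrix are assumed.

```python
# Illustrative sketch only: two groups, two variables, equal priors assumed.
# The measurements below are invented (mm) and come from no real sample.
group_a = [(182, 140), (185, 143), (180, 139), (188, 145), (184, 141)]
group_b = [(172, 133), (175, 136), (170, 131), (177, 137), (174, 134)]


def centroid(rows):
    """Mean of each variable for one group."""
    n = len(rows)
    return (sum(r[0] for r in rows) / n, sum(r[1] for r in rows) / n)


def pooled_cov(groups):
    """Pooled within-group covariance, returned as (sxx, sxy, syy)."""
    sxx = sxy = syy = 0.0
    dof = 0
    for rows in groups:
        mx, my = centroid(rows)
        for x, y in rows:
            sxx += (x - mx) ** 2
            sxy += (x - mx) * (y - my)
            syy += (y - my) ** 2
        dof += len(rows) - 1
    return sxx / dof, sxy / dof, syy / dof


def fit_discriminant(a, b):
    """Return the weights w and the sectioning point c between groups a and b."""
    sxx, sxy, syy = pooled_cov([a, b])
    det = sxx * syy - sxy * sxy
    ma, mb = centroid(a), centroid(b)
    dx, dy = ma[0] - mb[0], ma[1] - mb[1]
    # w = (inverse pooled covariance) * (difference between group centroids)
    w = ((syy * dx - sxy * dy) / det, (sxx * dy - sxy * dx) / det)
    # Sectioning point: the score of the midpoint between the two centroids
    c = w[0] * (ma[0] + mb[0]) / 2 + w[1] * (ma[1] + mb[1]) / 2
    return w, c


def score(x, w):
    """Discriminant score: the weighted sum of the variable values."""
    return sum(wi * xi for wi, xi in zip(w, x))


def classify(x, w, c):
    """Assign x to group A if its score exceeds the sectioning point."""
    return "A" if score(x, w) > c else "B"


w, c = fit_discriminant(group_a, group_b)
unknown = (183, 141)
print("weights:", w, "sectioning point:", c)
print("unknown score:", score(unknown, w), "->", classify(unknown, w, c))
```

Under these assumptions, comparing the score with the sectioning point gives the same assignment as choosing the group whose centroid is nearest in Mahalanobis distance, which is the sense in which the classification rests on a distance statistic.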
The most common distance statistic employed in forensic anthropological research and
classification is Mahalanobis distance (D²), a distance measure similar in practice to
Euclidean distance (the “ordinary” distance between two points as one would measure with a
ruler), but one that is not affected by differences in scale or by correlation among variables
(Krzanowski, 2002). Unlike Euclidean distance, D² is based on the covariance between variables.
It is used to measure the similarity between unknown and known individuals, expressed as the
distance from a group centroid¹⁹. When interpreting the D² value, smaller distances equate to
more similar individuals.
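The contrast with Euclidean distance can be sketched in a few lines. The example below assumes two variables and a known covariance matrix. The centroid, covariance values, and function names are hypothetical and chosen only to show how correlation changes the result.

```python
# Illustrative sketch only: squared Mahalanobis distance for two variables.
# The centroid and covariance matrix are invented for demonstration.

def euclidean_sq(x, centre):
    """Squared ordinary (ruler) distance between x and centre."""
    return sum((xi - ci) ** 2 for xi, ci in zip(x, centre))


def mahalanobis_sq(x, centre, cov):
    """Squared Mahalanobis distance D^2 of x from centre, given a 2x2 covariance."""
    (sxx, sxy), (_, syy) = cov
    det = sxx * syy - sxy * sxy
    dx, dy = x[0] - centre[0], x[1] - centre[1]
    # d' * inverse(cov) * d, written out for the 2x2 case
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det


group_centroid = (10.0, 10.0)
cov = [[4.0, 3.0], [3.0, 4.0]]   # the two variables are positively correlated

along = (12.0, 12.0)     # deviates in the direction the variables co-vary
against = (12.0, 8.0)    # deviates against the direction of correlation

for label, point in (("along", along), ("against", against)):
    print(label,
          "Euclidean^2 =", euclidean_sq(point, group_centroid),
          "D^2 =", mahalanobis_sq(point, group_centroid, cov))
```

Both points lie the same ruler distance from the centroid, yet the point that runs against the correlation receives a D² roughly seven times larger (about 8 versus about 1.14), because such a combination of values is much less typical of the group.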
The statistical assumptions associated with DFA include multivariate normality and homogeneity
of variances/covariances. Multivariate normality is one of the most common assumptions in
statistics, as many tests and statistics are related to the normal distribution (think bell
curve here). Generally, testing for multivariate normality means testing for univariate and
bivariate normality, that is, checking that each variable is normally distributed and,
likewise, that all pairs of variables are bivariate normal, using one- and two-dimensional
plots (i.e., histograms and scatterplots). In practice, this is generally sufficient,
especially when using DFA, as that method is relatively robust against deviations from
multivariate normality. More formal tests of multivariate normality exist, but are beyond the
scope of this work (e.g., Mardia's statistic of multivariate skewness/kurtosis [Mardia, 1970]
or the Doornik-Hansen multivariate normality test [Doornik and Hansen, 2008]).
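Although formal testing is not pursued here, the quantities behind Mardia's statistic are simple to compute. The following is a minimal sketch for two variables, assuming the covariance matrix is estimated by dividing by n. The function name mardia_2d and the simulated sample are hypothetical, and no significance test is carried out. For bivariate normal data the skewness value should be near 0 and the kurtosis value near p(p + 2) = 8.

```python
# Illustrative sketch only: Mardia's multivariate skewness and kurtosis
# for two variables. The data are simulated, not real measurements.
import random


def mardia_2d(rows):
    """Return (skewness b1, kurtosis b2) for a list of (x, y) pairs."""
    n = len(rows)
    mx = sum(r[0] for r in rows) / n
    my = sum(r[1] for r in rows) / n
    dev = [(x - mx, y - my) for x, y in rows]
    # Covariance estimated by dividing by n (assumption stated above)
    sxx = sum(dx * dx for dx, _ in dev) / n
    syy = sum(dy * dy for _, dy in dev) / n
    sxy = sum(dx * dy for dx, dy in dev) / n
    det = sxx * syy - sxy * sxy
    ixx, ixy, iyy = syy / det, -sxy / det, sxx / det

    def g(a, b):
        # a' * inverse(S) * b for two centred observations
        return a[0] * (ixx * b[0] + ixy * b[1]) + a[1] * (ixy * b[0] + iyy * b[1])

    skew = sum(g(a, b) ** 3 for a in dev for b in dev) / n ** 2
    kurt = sum(g(a, a) ** 2 for a in dev) / n
    return skew, kurt


random.seed(1)
sample = []
for _ in range(300):
    x = random.gauss(0, 1)
    y = 0.6 * x + 0.8 * random.gauss(0, 1)   # correlated with x
    sample.append((x, y))

skew, kurt = mardia_2d(sample)
print("skewness:", skew, "kurtosis:", kurt, "(expect roughly 0 and 8)")
```

Values far from these benchmarks would suggest a departure from multivariate normality, although judging how far is too far requires the reference distributions given in the cited sources.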
The second assumption is homogeneity of variances/covariances, that is, that the level of
variation in each group is relatively similar. Testing for this is also relatively
straightforward, and a variety of tests are available. In FD3, homogeneity among samples is
tested using the Kullback (1959) test for homogeneity. If the level of
¹⁸ These were the terms originally used by Giles and Elliot and also are terms used by FORDISC and the
Forensic Databank. We will use the same terms when referring to FORDISC in this section to stay consistent
with its terminology.
¹⁹ The group centroid is the point that represents the mean for all variables in the multivariate space defined
by the variables in the model.