where the sums are taken over all specimens. When variables are highly correlated we can predict one from the
other (e.g. Y from X ), and the more highly correlated they are, the better our predictions will be. Uncorrelated
variables are considered independent. See also Covariance.
Covariance Like correlation, a measure of the association between variables. The sample estimate of the covariance
between X and Y is:
$$S_{XY} = \frac{1}{N - 1} \sum (X - X_{mean})(Y - Y_{mean})$$
where the summation is over all N specimens.
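The sample estimate above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation; the function name `sample_covariance` and the measurements are hypothetical.

```python
def sample_covariance(x, y):
    """Sample covariance of two equal-length sequences (divides by N - 1)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of specimens")
    n = len(x)
    if n < 2:
        raise ValueError("at least two specimens are needed")
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    # Sum of cross-products of deviations from the two means
    cross_products = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    return cross_products / (n - 1)


# Hypothetical measurements on five specimens
lengths = [2.0, 4.0, 6.0, 8.0, 10.0]
widths = [1.0, 3.0, 2.0, 5.0, 4.0]

print(sample_covariance(lengths, widths))   # covariance of X and Y
print(sample_covariance(lengths, lengths))  # covariance of X with itself is its variance
```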
Curved space A metric space in which the distance measure is not linear. The ordinary rules of Euclidean geometry
do not apply in such spaces. The consequences of the curvature depend upon the distance between points; we
can treat the surface of the earth as flat as long as the maps cover only small areas, but in long-distance navigation,
the curvature must be taken into account. Shape space is curved, so the rules of Euclidean geometry do not
apply, which is why shapes are mapped onto a Euclidean space tangent to shape space. The reference form is the
point of tangency of the linear tangent space to the underlying curved space, so using the mean of all specimens
as the reference form acts to minimize the differences between the two spaces.
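The earth analogy can be made concrete with a small sketch that compares the distance along the surface of a unit sphere (the arc) with the straight-line distance through it (the chord). It illustrates only the analogy, not shape space itself; the function name `arc_and_chord` is hypothetical.

```python
import math


def arc_and_chord(angle_radians):
    """Distances between two points on a unit sphere separated by the given angle."""
    arc = angle_radians                        # distance measured along the curved surface
    chord = 2.0 * math.sin(angle_radians / 2)  # straight-line (Euclidean) distance
    return arc, chord


def relative_difference(angle_radians):
    """Fraction by which the straight-line distance falls short of the arc."""
    arc, chord = arc_and_chord(angle_radians)
    return (arc - chord) / arc


# Nearby points: the flat approximation is almost exact
print(relative_difference(0.01))
# Points a quarter of the way around the sphere: the discrepancy is large
print(relative_difference(math.pi / 2))
```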
D A generalized statistical distance between means of two groups (X1 and X2) relative to the variance within the
groups:
$$D = \sqrt{(X_1 - X_2)^T S_p^{-1} (X_1 - X_2)}$$
where $(\,)^T$ refers to the transpose of the enclosed matrix, and $S_p^{-1}$ is the inverse of the pooled
variance-covariance matrix. This distance takes into account the correlations among variables when computing the
distance between means. The generalized distance is used in Hotelling's $T^2$-test. Also known as the Mahalanobis
distance.
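A minimal sketch of this calculation for two variables follows. It assumes the pooled variance-covariance matrix is the summed within-group sums of squares and cross-products divided by N1 + N2 - 2, which is the usual convention but is not stated in this entry. All function names and the data are hypothetical.

```python
import math


def mean_vector(group):
    """Mean of each of the two variables in a group of specimens."""
    n = len(group)
    return [sum(row[j] for row in group) / n for j in range(2)]


def scatter_matrix(group, mean):
    """2 x 2 sums of squares and cross-products of deviations from the group mean."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for row in group:
        dev = [row[0] - mean[0], row[1] - mean[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += dev[i] * dev[j]
    return s


def generalized_distance(group1, group2):
    """D between two group means, relative to the pooled within-group covariance."""
    m1, m2 = mean_vector(group1), mean_vector(group2)
    s1, s2 = scatter_matrix(group1, m1), scatter_matrix(group2, m2)
    dof = len(group1) + len(group2) - 2  # assumed pooling convention
    sp = [[(s1[i][j] + s2[i][j]) / dof for j in range(2)] for i in range(2)]
    # Invert the 2 x 2 pooled matrix directly
    det = sp[0][0] * sp[1][1] - sp[0][1] * sp[1][0]
    inv = [[sp[1][1] / det, -sp[0][1] / det],
           [-sp[1][0] / det, sp[0][0] / det]]
    diff = [m1[0] - m2[0], m1[1] - m2[1]]
    # Quadratic form (X1 - X2)^T Sp^-1 (X1 - X2)
    d_squared = sum(diff[i] * inv[i][j] * diff[j] for i in range(2) for j in range(2))
    return math.sqrt(d_squared)


# Hypothetical groups: the second is the first shifted by (3, 4)
group_a = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
group_b = [(3.0, 4.0), (5.0, 4.0), (3.0, 6.0), (5.0, 6.0)]
print(generalized_distance(group_a, group_b))
```

In this example the two variables are uncorrelated with equal variance, so D reduces to the Euclidean distance between the means (5) divided by the pooled standard deviation.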
$D^2$ The squared generalized distance, D. See D.
Deformation A smooth, continuous mapping or transformation; in morphometrics, it is usually the transformation
of one shape into another. The deformation refers not only to the change in positions of landmarks, but also
to the interpolated changes in locations of unanalyzed points between landmarks (Chapter 5).
Degrees of freedom In general, the number of independent pieces of information. In statistical analyses, the total
degrees of freedom are approximately the product of the number of variables and the number of individuals (the
total may be partitioned into separate components for some tests). If every measurement on every individual
were completely independent, the degrees of freedom would be the product of the number of variables and the
number of individuals, but if one statistic is known (or estimated), the number of degrees of freedom that remain
to estimate a second statistic will be reduced. For example, the estimate of the mean height of N individuals in a
sample will have N × 1 = N degrees of freedom, because all N measurements are needed and there is only one
measured variable. In contrast, the estimate of the variance in height will have N - 1 degrees of freedom because
only N - 1 deviations from mean height are independent (the deviation of the Nth individual can be calculated
from the mean and the other N - 1 observed heights). In geometric morphometrics, when configurations of landmarks
are superimposed, degrees of freedom are lost for a different reason; namely, information that is not relevant
to comparison of shapes (location, scale and rotation) is removed from the coordinates.
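The height example can be checked numerically. The sketch below, using hypothetical heights, recovers the deviation of the Nth individual from the mean and the other N - 1 heights, which is why only N - 1 deviations are independent.

```python
# Hypothetical heights of N = 4 individuals
heights = [150.0, 160.0, 170.0, 180.0]
n = len(heights)
mean_height = sum(heights) / n

# Deviations of every individual from the mean height
deviations = [h - mean_height for h in heights]

# Given the mean and the first N - 1 heights, the Nth height is determined,
# and so is its deviation from the mean
recovered_last_height = n * mean_height - sum(heights[:-1])
recovered_last_deviation = recovered_last_height - mean_height

# The variance therefore divides by N - 1 rather than N
variance = sum(d * d for d in deviations) / (n - 1)

print(recovered_last_deviation, deviations[-1])
print(variance)
```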
Dilation Opposite of Contraction.
Dimensionality reduction There is a common need to reduce the dimensionality of a data set, both for display
and to reduce the number of variables used to less than the degrees of freedom in the data, thus allowing inversion
of a variance-covariance matrix. If a PCA is performed on the data, and the scores corresponding to all PCs
with non-zero eigenvalues are retained, and the rest discarded, the degrees of freedom in the remaining scores
will equal the degrees of freedom in the data.
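The need for reduction can be illustrated without running a full PCA. For a symmetric matrix the number of non-zero eigenvalues equals its rank, so the rank of the variance-covariance matrix tells how many PC scores would be retained. The sketch below uses exact fractions and hypothetical data with more variables (4) than specimens (3); the function names are assumptions made for illustration.

```python
from fractions import Fraction


def covariance_matrix(data):
    """Variance-covariance matrix (divisor N - 1) computed with exact fractions."""
    n, p = len(data), len(data[0])
    means = [sum(Fraction(row[j]) for row in data) / n for j in range(p)]
    return [[sum((Fraction(r[i]) - means[i]) * (Fraction(r[j]) - means[j])
                 for r in data) / (n - 1)
             for j in range(p)]
            for i in range(p)]


def rank(matrix):
    """Rank by Gaussian elimination; exact because the entries are fractions."""
    m = [row[:] for row in matrix]
    rows, cols = len(m), len(m[0])
    rk = 0
    for c in range(cols):
        pivot = next((r for r in range(rk, rows) if m[r][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        m[rk], m[pivot] = m[pivot], m[rk]
        for r in range(rk + 1, rows):
            factor = m[r][c] / m[rk][c]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rk])]
        rk += 1
    return rk


# Hypothetical data: 3 specimens (rows) measured on 4 variables (columns)
data = [[1, 2, 3, 4],
        [2, 1, 5, 3],
        [4, 4, 1, 6]]
s = covariance_matrix(data)
print(len(s), rank(s))  # number of variables, and rank of the 4 x 4 matrix
```

Because the rank is smaller than the number of variables, the 4 × 4 matrix cannot be inverted, whereas scores on the PCs with non-zero eigenvalues would have an invertible variance-covariance matrix.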
Discriminant function The linear combination of variables optimally discriminating between two groups. It is
produced by discriminant function analysis. Scores on the discriminant function can be used to identify members
of the groups (Chapter 14).
Discriminant function analysis A two-group canonical variates analysis. See Canonical variates analysis
(Chapter 14).