8.5.4 Canonical variate analysis
In a univariate ANOVA, an F-test establishes evidence of differences among groups,
and pairwise comparisons can then be applied to identify which groups are different
and the pattern of the differences. However, in a multivariate case, this process
becomes more complex and the representation of the differences may require two
or more dimensions. Canonical variate analysis is a useful approach to determine
the extent and nature of differences among groups in the multivariate case (Hair
et al., 1998). Suppose there are p disease-related variables, $y_1, y_2, \ldots, y_p$,
and consider a linear transformation of these variables to a new variable,

$$c_1 = a_{11} y_1 + a_{12} y_2 + \cdots + a_{1p} y_p.$$

The variate $c_1$ can be thought of as the first of a series of new axes that
provide an alternative coordinate system for defining the position of the data
points. It is called the first canonical variate if the coefficients
$a_{11}, a_{12}, \ldots, a_{1p}$ maximise the between-group (or between-treatment)
F-statistic on the $c_1$-axis. This process is repeated to form a second canonical
variate $c_2$, on the condition that $c_2$ must use only information in the data
that has not been used in the formation of $c_1$. The process continues until all
canonical variates are computed. The important aspect of canonical variate
analysis is to identify and interpret the significant canonical variates.
Significant canonical variates are usually identified from the between-group
F-statistics and the cumulative percentage of variance explained. Each canonical
variate is interpreted in terms of the original variables via its coefficients,
e.g. $a_{11}, a_{12}, \ldots, a_{1p}$ for $c_1$. If the first two canonical variates
explain most of the variation, differences between groups can be easily
visualised by plotting $c_1$ against $c_2$.
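The construction of the canonical variates can be sketched numerically. The following is a minimal illustration, not a definitive implementation. It assumes NumPy is available and takes the usual route of solving the eigenproblem $B a = \lambda W a$, where $W$ and $B$ are the within-group and between-group sums-of-squares-and-products matrices. The function names `canonical_variates` and `f_statistic` are hypothetical, and the data are synthetic values generated only for the demonstration. For a single axis, the between-group F-statistic is proportional to $\lambda$, so the largest eigenvalue gives the first canonical variate.

```python
import numpy as np

def canonical_variates(Y, groups):
    """Return (eigenvalues, coefficients) of a canonical variate analysis.

    Y is an (n, p) array of observations and groups a length-n label array.
    Column k of the coefficient matrix holds the weights a_k1..a_kp.
    """
    Y = np.asarray(Y, dtype=float)
    groups = np.asarray(groups)
    labels = np.unique(groups)
    p = Y.shape[1]
    grand_mean = Y.mean(axis=0)

    W = np.zeros((p, p))  # within-group sums of squares and products
    B = np.zeros((p, p))  # between-group sums of squares and products
    for g in labels:
        Yg = Y[groups == g]
        mean_g = Yg.mean(axis=0)
        centred = Yg - mean_g
        W += centred.T @ centred
        d = (mean_g - grand_mean)[:, None]
        B += len(Yg) * (d @ d.T)

    # Solve B a = lambda W a by whitening with the Cholesky factor of W.
    L_inv = np.linalg.inv(np.linalg.cholesky(W))
    eigvals, eigvecs = np.linalg.eigh(L_inv @ B @ L_inv.T)
    order = np.argsort(eigvals)[::-1]           # largest eigenvalue first
    n_variates = min(p, len(labels) - 1)        # at most min(p, groups - 1)
    coefs = (L_inv.T @ eigvecs[:, order])[:, :n_variates]
    return eigvals[order][:n_variates], coefs

def f_statistic(scores, groups):
    """One-way ANOVA F-statistic of a single variable across groups."""
    labels = np.unique(groups)
    n, g = len(scores), len(labels)
    grand = scores.mean()
    ss_between = sum(np.sum(groups == k) * (scores[groups == k].mean() - grand) ** 2
                     for k in labels)
    ss_within = sum(((scores[groups == k] - scores[groups == k].mean()) ** 2).sum()
                    for k in labels)
    return (ss_between / (g - 1)) / (ss_within / (n - g))

# Synthetic illustration only: 3 groups, 20 subjects each, p = 3 variables.
rng = np.random.default_rng(0)
group_means = np.array([[0.0, 0.0, 0.0], [2.0, 1.0, 0.0], [1.0, 3.0, 0.5]])
groups = np.repeat(np.arange(3), 20)
Y = group_means[groups] + rng.normal(size=(60, 3))

eigvals, coefs = canonical_variates(Y, groups)
scores = Y @ coefs                               # columns are c1, c2
cum_pct = 100 * np.cumsum(eigvals) / eigvals.sum()
F_c1 = f_statistic(scores[:, 0], groups)
print("coefficients of c1:", coefs[:, 0])
print("cumulative % explained:", cum_pct)
```

Plotting `scores[:, 0]` against `scores[:, 1]` gives the two-dimensional display of group differences. By construction, the F-statistic on the first variate is at least as large as that on any single original variable.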
8.5.5 Discriminant function/logistic regression
Often temporal disease data for several treatments are collated. These data could
refer to the development of the same disease in different management systems
(chemical, cultural etc.), or on different cultivars, or in different locations; these data
may also be obtained from different diseases subjected to the same treatment
(conditions). We are interested not only in whether there are significant differences
in the temporal disease patterns but also in whether it is possible to assign epidemics
correctly to the predefined treatment categories. There are two distinct approaches
that have the same aims and provide the same type of information but come from
different statistical viewpoints. These are the discriminant analysis approach, which
has the same basis as that underlying multivariate analysis of variance, and the
logistic regression approach.
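As an illustration of assigning a subject to predefined groups with the discriminant approach, the sketch below computes the squared Mahalanobis distance from a new subject to each group mean and converts the distances into membership probabilities. Several choices here are assumptions made for the sketch and not necessarily the exact calculation intended in the text. The distances are computed on the original variables, not on canonical variates. A pooled within-group covariance matrix is used. The probabilities are taken as proportional to $\exp(-d^2/2)$, which corresponds to multivariate normal groups with a common covariance matrix and equal prior probabilities. The function name `membership_probabilities`, the group labels and the data are hypothetical.

```python
import numpy as np

def membership_probabilities(Y, groups, y_new):
    """Squared Mahalanobis distances and group membership probabilities.

    Assumes a common within-group covariance matrix and equal priors.
    """
    Y = np.asarray(Y, dtype=float)
    groups = np.asarray(groups)
    labels = np.unique(groups)
    n, p = Y.shape
    means = np.array([Y[groups == g].mean(axis=0) for g in labels])

    # Pooled within-group covariance matrix.
    W = np.zeros((p, p))
    for g, m in zip(labels, means):
        centred = Y[groups == g] - m
        W += centred.T @ centred
    S = W / (n - len(labels))

    # Squared Mahalanobis distance from y_new to each group mean.
    diffs = np.asarray(y_new, dtype=float) - means
    d2 = np.einsum("ij,ij->i", diffs @ np.linalg.inv(S), diffs)

    # Normalise exp(-d2 / 2) over groups (shifted by the minimum for stability).
    weights = np.exp(-0.5 * (d2 - d2.min()))
    return labels, d2, weights / weights.sum()

# Synthetic illustration only: two groups "A" and "B", two variables each.
rng = np.random.default_rng(1)
groups = np.repeat(np.array(["A", "B"]), 15)
offsets = np.where(groups[:, None] == "A", 0.0, 2.0)
Y = offsets + rng.normal(size=(30, 2))

y_new = Y[groups == "A"].mean(axis=0)   # a subject sitting at the mean of A
labels, d2, probs = membership_probabilities(Y, groups, y_new)
for lab, dist, prob in zip(labels, d2, probs):
    print(lab, dist, prob)
```

The subject is assigned to the group with the largest probability, which under these assumptions is also the group with the smallest Mahalanobis distance.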
The presumption underlying the discriminant analysis approach is that the set of
measured or derived disease data defines the subjects. The probability of membership
of a defined group is estimated on the basis of canonical variates: it is calculated
from the total Mahalanobis distance, over all canonical variates, between the subject
and the group concerned, relative to the total Mahalanobis distance, over all
canonical variates, between the subject and all the groups. Discriminant function
analysis has been used in plant pathological research to describe the relationship
of physical and biological