Graphs of F-distribution densities for different choices of M and N are
shown in Figure 4-2(C).
We now give an intuitive explanation of how these probability
definitions relate to common problems of statistical testing. First,
statisticians typically assume that their observations are collected by
independent measurements of a random variable having a normal
distribution. In this case, the data are said to be drawn from a
population described by a normally distributed random variable. There
are numerous statistical procedures for testing this assumption, and it
is always important to check the normality of your sample before
applying statistical tests, such as the t-test or F-test described below.
If the normality assumption does not hold, the results from some
statistical tests could be misleading.
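As an informal illustration of such a check, the sketch below computes the sample skewness and excess kurtosis, both of which are close to zero for normally distributed data. It uses the variety A yields from the second trial (Table 4-3) given later in this section. The function names are illustrative, and these two moments are only a rough diagnostic, not one of the formal normality tests; with ten observations they are far from conclusive.

```python
from statistics import fmean


def central_moment(values, order):
    """Return the central moment of the given order (dividing by n)."""
    center = fmean(values)
    return fmean([(v - center) ** order for v in values])


def skewness(values):
    """Sample skewness: third central moment scaled by variance**1.5."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Sample kurtosis minus 3, so that a normal distribution gives 0."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0


# Yields of the 10 variety A plants in trial 2 (Table 4-3).
yields_a = [2.2, 2.8, 1.9, 3.2, 2.6, 2.1, 2.7, 2.4, 2.5, 2.0]

print("skewness:", round(skewness(yields_a), 3))
print("excess kurtosis:", round(excess_kurtosis(yields_a), 3))
```

Values far from zero would be a warning sign that the normality assumption deserves a closer look with a formal test.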
II. RELATION OF PROBABILITY DISTRIBUTIONS TO STATISTICAL TESTING
Returning to the question of quantitative traits and the polygenic
hypothesis, suppose that an agricultural company is attempting to alter
a type of corn to produce a new variety (B) that will be superior to the
original variety (A). First, we need to specify what we mean by superior.
It could mean higher yield, greater resistance to drought or disease,
less-intensive soil preparation or cultivation, or easier harvest. In this
case, let us suppose we wish to improve the yield of the plants. Like
height and weight in humans, we have reason to suspect corn yield is
well described by a normal distribution; thus, we shall assume that the
normality assumption holds.
Plant No.   A (Yield)   Plant No.   B (Yield)
1           2.1         1           2.4
2           2.6         2           2.5
TABLE 4-2. The trial 1 data from an imaginary experiment
involving two plants each of the varieties A and B.
The next step would be to design an experiment and collect data. This
may sound simple, but several factors need to be considered. Would you
take one of each type, plant them side-by-side under the specified
conditions, and see which produces a higher yield? That may sound
appealing (it is certainly simple), but one can never be sure that every
plant of the same type will have exactly the same yield. In fact, it is very
likely there will be some variance between plants of the same type,
even under the same conditions.
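To see why a single side-by-side pair can mislead, here is a minimal sketch that assumes each variety's yield is normally distributed. The means (2.4 and 2.8) and the common standard deviation (0.4) are hypothetical values chosen only for illustration; they are not estimates taken from the text. Under those assumptions, the code computes the probability that one plant of variety A out-yields one plant of variety B even though B is better on average.

```python
from statistics import NormalDist

# Hypothetical population parameters (assumed for illustration only).
yield_a = NormalDist(mu=2.4, sigma=0.4)
yield_b = NormalDist(mu=2.8, sigma=0.4)

# The difference of two independent normal variables is itself normal;
# NormalDist subtraction assumes independence.
difference = yield_a - yield_b

# Probability that a single A plant yields more than a single B plant.
p_a_beats_b = 1.0 - difference.cdf(0.0)

print(f"P(single A plant > single B plant) = {p_a_beats_b:.3f}")
```

With these assumed values the probability is roughly 0.24, so about one single-pair comparison in four would point to the wrong variety.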
Suppose, then, that we run trials with different numbers of plants and
record the data in Tables 4-2 and 4-3.
Would you have more confidence making a conclusion based on the first
or second trial? What we have done in both trials is take a sample from
each of the plant varieties. What we are attempting to do with the
sample is estimate the yields from the population. The fact we used more
plants in the second trial would probably engender more confidence in
its representation of the population. In other words, sample size is an
important factor in determining our confidence in the outcome (the
Plant No.   A (Yield)   Plant No.   B (Yield)
1           2.2         1           2.4
2           2.8         2           2.8
3           1.9         3           3.1
4           3.2         4           2.6
5           2.6         5           2.5
6           2.1         6           2.8
7           2.7         7           3.2
8           2.4         8           3.4
9           2.5         9           2.9
10          2.0         10          2.7
TABLE 4-3. The trial 2 data from an imaginary experiment
involving 10 plants each of the varieties A and B.
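The effect of sample size on the two trials can be made concrete with a short sketch that computes, for each variety in each trial, the sample mean and the standard error of the mean (sample standard deviation divided by the square root of the sample size). The standard error is not named in the passage above; it is used here as one common way to quantify how precisely a sample mean estimates the population mean. The data are those of Tables 4-2 and 4-3, and the variable and function names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Yields from Table 4-2 (trial 1) and Table 4-3 (trial 2).
trials = {
    "trial 1": {
        "A": [2.1, 2.6],
        "B": [2.4, 2.5],
    },
    "trial 2": {
        "A": [2.2, 2.8, 1.9, 3.2, 2.6, 2.1, 2.7, 2.4, 2.5, 2.0],
        "B": [2.4, 2.8, 3.1, 2.6, 2.5, 2.8, 3.2, 3.4, 2.9, 2.7],
    },
}


def standard_error(values):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return stdev(values) / sqrt(len(values))


# Map (trial, variety) to (sample mean, standard error of the mean).
summary = {
    (trial, variety): (mean(values), standard_error(values))
    for trial, varieties in trials.items()
    for variety, values in varieties.items()
}

for (trial, variety), (avg, sem) in summary.items():
    print(f"{trial}, variety {variety}: mean = {avg:.2f}, SE = {sem:.3f}")
```

With only two plants per variety, the trial 1 standard errors rest on a single degree of freedom and are themselves very unreliable, which is one way of expressing why the second trial inspires more confidence.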