Variation between data sets
Two basic procedures are frequently used in quantitative genetics to interpret the variation and relationship that exist between characters, or between one character evaluated in different environments. These are simple linear regression and correlation.

A straight line regression can be adequately described by two estimates, the slope or gradient of the line (b) and the intercept on the y-axis (a). These are related by the equation:

y = bx + a

It can be seen that b is the gradient of the line, because a change of one unit on the x-axis results in a change of b units on the y-axis. If x and y both increase (or both decrease) together, the gradient is positive. If, however, x increases while y decreases or vice versa, then the gradient is negative. When x = 0, the equation for y reduces to y = a, and a is therefore the point at which the regression line crosses the y-axis. This intercept value may be equal to, greater than or less than zero.

The formulation and theory behind regression analysis will not be described here and are not within the scope of this topic. However, the gradient of the best fitting straight line (also known as the regression coefficient) for a collection of points whose coordinates are x and y is estimated as:

b = SP(x, y)/SS(x)

where SP(x, y) is the sum of products of the deviations of x and y from their respective means (x̄ and ȳ), and SS(x) is the sum of the squared deviations of x from its mean. It will be useful to have an understanding of the regression analysis and to remember the basic regression equations.

Notice that a sum of squares is really a special case of a sum of products. You should also note that if every y value is exactly equal to every x value, then the equation used to estimate b becomes SS(x)/SS(x) = 1.

Having determined b, the intercept value is found by substituting the mean values of x and y into the rearranged equation:

a = ȳ − bx̄

In regression analysis it is always assumed that one character is the dependent variable and the other is independent. For example, it is common to compare parental performance with progeny performance (see Chapter 6), and in this case progeny performance would be considered the dependent variable and parental performance the independent variable. The performance of progeny is obviously dependent on the performance of their parents, and not vice versa.
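The regression estimates can be checked numerically. The sketch below is a minimal implementation of the equations above; the parental and progeny scores are invented purely for illustration, not taken from the text:

```python
# Estimate the regression slope b = SP(x, y)/SS(x) and the
# intercept a = y_bar - b * x_bar, following the equations
# in the text. The data values are hypothetical.

def regression_estimates(x, y):
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of products of deviations from the respective means
    sp_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    # Sum of squared deviations of x from its mean
    ss_x = sum((xi - x_bar) ** 2 for xi in x)
    b = sp_xy / ss_x
    a = y_bar - b * x_bar
    return b, a

# Hypothetical parental (x) and progeny (y) performance scores
x = [4.0, 5.0, 6.0, 7.0, 8.0]
y = [5.0, 5.5, 7.0, 7.5, 9.0]

b, a = regression_estimates(x, y)
print(b, a)
```

Note that if y is given the same values as x, the slope comes out as SS(x)/SS(x) = 1, as remarked above.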
The degree of association between any two, or a number of, different characters can be examined statistically by the use of correlation analysis. Correlation analysis is similar in many ways to simple regression, but in correlations there is no need to assign one set of values to be the dependent variable while the other is said to be the independent variable. Correlation coefficients (r) are calculated from the equation:

r = SP(x, y)/√[SS(x) × SS(y)]

where SP(x, y) is again the sum of products between the two variables, SS(x) is the sum of squares of one variable (x) and SS(y) is the sum of squares of the second variable (y).
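The correlation coefficient can be sketched in the same minimal way; the data below are again invented for illustration. Two sets of values that are perfectly linearly related give r = 1 (or r = −1 when one decreases as the other increases):

```python
import math

# Correlation coefficient r = SP(x, y) / sqrt(SS(x) * SS(y)),
# computed from deviations about the means. Illustrative data only.

def correlation(x, y):
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sp_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_x = sum((xi - x_bar) ** 2 for xi in x)
    ss_y = sum((yi - y_bar) ** 2 for yi in y)
    return sp_xy / math.sqrt(ss_x * ss_y)

# Perfectly (positively) linearly related values give r = 1
print(correlation([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]))
```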
Now, SP(x, y) is given by the equation:

SP(x, y) = Σᵢ₌₁ⁿ (xᵢ − x̄)(yᵢ − ȳ)

although in practice it is usually easier to calculate it using the equation:

SP(x, y) = Σᵢ₌₁ⁿ xᵢyᵢ − [(Σᵢ₌₁ⁿ xᵢ)(Σᵢ₌₁ⁿ yᵢ)]/n

The comparable equations for SS(x) are:

SS(x) = Σᵢ₌₁ⁿ (xᵢ − x̄)²

SS(x) = Σᵢ₌₁ⁿ xᵢ² − (Σᵢ₌₁ⁿ xᵢ)²/n

and SS(y) is obtained in the same way from the y values:

SS(y) = Σᵢ₌₁ⁿ (yᵢ − ȳ)²

Dividing SP(x, y), SS(x) and SS(y) by their degrees of freedom (n − 1) gives, respectively, the covariance between x and y and the variances of x and y.
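The deviation form and the computational shortcut of each equation give identical results, which can be verified numerically. The sketch below checks both forms of SP(x, y) and SS(x) on a small invented data set:

```python
# Check that the deviation form and the computational shortcut
# for SP(x, y) and SS(x) agree. Data values are hypothetical.

def sp_deviation(x, y):
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

def sp_shortcut(x, y):
    n = len(x)
    return sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n

def ss_deviation(x):
    x_bar = sum(x) / len(x)
    return sum((xi - x_bar) ** 2 for xi in x)

def ss_shortcut(x):
    n = len(x)
    return sum(xi ** 2 for xi in x) - sum(x) ** 2 / n

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 3.0, 5.0, 4.0, 6.0]

print(sp_deviation(x, y), sp_shortcut(x, y))
print(ss_deviation(x), ss_shortcut(x))
# A sum of squares is a special case of a sum of products:
# sp_deviation(x, x) equals ss_deviation(x).
```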