They are generally assumed to have a multivariate normal distribution, although other multivariate distributions could be handled, at least theoretically. The multivariate normal assumption implies that the conditional distributions are normal and that the elements of B are possibly correlated. Thus, the prediction error of any model includes a component associated with b, but also the variances and covariances associated with the estimated B. The analytical derivation or the simulation of the model errors, even in a frequentist paradigm, must include the variance and covariance in the B as well as the variance in the e. This is easily illustrated using a simple example.
Suppose we want to predict the value of a variable Y using measurements on another variable X. Suppose also that a simple linear model is adequate in this case. The model for the whole population is:
$$Y = b_0 + b_1 X + e \quad (5.5)$$

A set of n observations of Y and X is made. From this set, estimates of $b_0$ and $b_1$ are calculated, labelled $B_0$ and $B_1$, respectively. The model for the sample data is:

$$Y = B_0 + B_1 X + e \quad (5.6)$$

The estimates $B_0$ and $B_1$ are calculated from the data. The variance of a prediction (mean value) is then given by the following equation (Draper and Smith, 1998):

$$\mathrm{VAR}(Y_0) = S^2 \left\{ \frac{1}{n} + \frac{(X_0 - \bar{X})^2}{\sum_i (X_i - \bar{X})^2} \right\} \quad (5.7)$$

At the mean value of X (i.e. $\bar{X}$), the variance of prediction is minimized at a value of $S^2/n$. The last term in Eqn 5.7 gets larger the further the prediction is from the mean of the predictor variable. This explains the double funnel shape of the prediction error of any linear model (Draper and Smith, 1998). Equation 5.7 applies only to simple (single-predictor) linear regression models.
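As a quick numerical check of Eqn 5.7, the sketch below fits a simple regression by least squares and evaluates the prediction variance at several values of $X_0$; the data, sample size and noise level are hypothetical choices of ours, not values from the text. It shows the minimum of $S^2/n$ at $\bar{X}$ and the funnel widening away from it.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical sample: n observations of X and Y from Y = b0 + b1*X + e
n = 30
x = rng.uniform(0.0, 10.0, size=n)
y = 2.0 + 0.5 * x + rng.normal(0.0, 1.0, size=n)

# Least-squares estimates B0, B1 and residual variance S^2 (n - 2 df)
B1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
B0 = y.mean() - B1 * x.mean()
resid = y - (B0 + B1 * x)
S2 = np.sum(resid ** 2) / (n - 2)

# Eqn 5.7: variance of the predicted mean value at X0
def var_mean_prediction(x0):
    return S2 * (1.0 / n + (x0 - x.mean()) ** 2 / np.sum((x - x.mean()) ** 2))

print(var_mean_prediction(x.mean()))  # minimum: equals S2 / n
print(S2 / n)
print(var_mean_prediction(x.min()), var_mean_prediction(x.max()))  # funnel widens
```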
A general equation is available for the prediction (mean values) of any linear model:

$$\mathrm{VAR}(Y_0) = X_0^T (X^T X)^{-1} X_0 \, \sigma^2 \quad (5.8)$$

where $X_0$ is a $p \times 1$ vector with the values of the regressor variables; $X$ is an $n \times p$ matrix with the values of the p regressor variables for all n observations; and $\sigma^2$ is the residual variance.

In Eqn 5.8, $(X^T X)^{-1} \sigma^2$ is a $p \times p$ matrix containing the variances of the B on the diagonal and their covariances on the off-diagonals. Unless the design matrix X has a particular structure (orthogonal), the elements of B are correlated.

The variance of an individual observation in a linear model is calculated as:

$$\mathrm{VAR}(Y_0) = X_0^T (X^T X)^{-1} X_0 \, \sigma^2 + \sigma^2 \quad (5.9)$$

In Eqn 5.9, the first term on the right-hand side represents the variance due to the B, whereas the second term represents the variance due to the residual error. The covariances in the first term can be made equal to zero by the selection of an orthogonal design (Mead and Pike, 1975). This is useful when an experiment is specifically designed for model parameterization.
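A minimal matrix sketch of Eqns 5.8 and 5.9, using the same kind of simulated data as above (all names and values are illustrative). It also demonstrates the point about orthogonal designs: centring the regressor makes the columns of X orthogonal, and the covariance between the elements of B vanishes.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical design: intercept plus one regressor (p = 2)
n, p = 30, 2
X = np.column_stack([np.ones(n), rng.uniform(0, 10, n)])
y = X @ np.array([2.0, 0.5]) + rng.normal(0, 1.0, n)

XtX_inv = np.linalg.inv(X.T @ X)
B = XtX_inv @ X.T @ y                     # least-squares estimates of b
S2 = np.sum((y - X @ B) ** 2) / (n - p)   # residual variance estimate

# Covariance matrix of B: variances on the diagonal, covariances off it
cov_B = XtX_inv * S2
print(cov_B)

x0 = np.array([1.0, 5.0])                 # regressor values for a new prediction
var_mean = x0 @ XtX_inv @ x0 * S2         # Eqn 5.8: variance of the predicted mean
var_obs = var_mean + S2                   # Eqn 5.9: variance of an individual observation
print(var_mean, var_obs)

# Centring the regressor makes the columns of X orthogonal, so the
# covariance between B0 and B1 becomes zero (cf. Mead and Pike, 1975)
Xc = X.copy()
Xc[:, 1] -= Xc[:, 1].mean()
print(np.linalg.inv(Xc.T @ Xc) * S2)      # diagonal covariance matrix
```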
The Bayesian view is different from the frequentist view in that the b are explicitly considered random parameters. Bayesian statistics allow the calculation of the posterior distribution of the b from a prior distribution coupled with a set of observations. In the Bayesian framework, the stochasticity of the parameters is thus inherent. The frequentist approach that we just described is in fact equivalent to a Bayesian approach with a non-informative prior distribution (i.e. all values of the parameters are a priori equally likely). In this instance, the posterior distribution is entirely determined by the observations (data). In Bayesian statistics, the posterior is nothing more than a conditional distribution of the parameters given the data. In a frequentist framework, it is because the parameters have to be estimated from data that we must account for the uncertainty and the distribution of these parameter estimates when simulating data. In a Bayesian framework, the parameters themselves have a distribution. Consequently, any simulation of the model must include their distributional properties.
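That equivalence can be sketched directly. In the hypothetical example below, parameter vectors are drawn from the multivariate normal distribution with mean B and covariance $S^2 (X^T X)^{-1}$, which under a flat prior (and treating the residual variance as known) is also the approximate posterior; adding a residual draw reproduces the predictive variance of Eqn 5.9. All data and names are ours, continuing the earlier sketches.

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical data, as in the earlier sketches
n, p = 30, 2
X = np.column_stack([np.ones(n), rng.uniform(0, 10, n)])
y = X @ np.array([2.0, 0.5]) + rng.normal(0, 1.0, n)

XtX_inv = np.linalg.inv(X.T @ X)
B = XtX_inv @ X.T @ y
S2 = np.sum((y - X @ B) ** 2) / (n - p)

# Parameter draws: multivariate normal with mean B and covariance S2*(X'X)^-1.
# Under a non-informative (flat) prior these are also (approximate) posterior
# draws, so the frequentist and Bayesian simulations coincide here.
n_sim = 10_000
B_draws = rng.multivariate_normal(B, S2 * XtX_inv, size=n_sim)

x0 = np.array([1.0, 5.0])
e_draws = rng.normal(0, np.sqrt(S2), size=n_sim)  # residual component
y0_draws = B_draws @ x0 + e_draws                 # parameter + residual stochasticity

print(y0_draws.var())                     # close to Eqn 5.9:
print(x0 @ XtX_inv @ x0 * S2 + S2)        # x0'(X'X)^-1 x0 * S2 + S2
```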
The third source of stochasticity in Eqn 5.4 comes from the errors in the predictors (or input variables) X. This source of error can usually be modelled using a Monte Carlo approach.
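A sketch of that idea, with illustrative numbers: the measured input $X_0$ is perturbed according to an assumed measurement-error distribution and the draws are propagated through the fitted model. Parameter uncertainty is omitted here for brevity; the estimates and error magnitudes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(11)

# Fitted simple model (hypothetical estimates from the earlier sketches)
B0, B1, S2 = 2.1, 0.48, 0.9

# Measured input and its assumed measurement-error standard deviation
x0_measured, x0_sd = 5.0, 0.3

n_sim = 10_000
x0_draws = rng.normal(x0_measured, x0_sd, size=n_sim)  # Monte Carlo draws of the input
e_draws = rng.normal(0, np.sqrt(S2), size=n_sim)       # residual component
y0_draws = B0 + B1 * x0_draws + e_draws                # propagate input error through the model

# The input error adds roughly (B1 * x0_sd)^2 to the predictive variance
print(y0_draws.var(), S2 + (B1 * x0_sd) ** 2)
```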
 
 