Univariate Regression - Common Errors in Statistics


ESTIMATING COEFFICIENTS

Write down and confirm your assumptions before you begin.

In this section we consider problems and solutions associated with three related challenges:

1. Estimating the coefficients of a model.

2. Testing hypotheses concerning the coefficients.

3. Estimating the precision of our estimates.

The techniques we employ will depend upon the following:

1. The nature of the regression function (linear, nonlinear, logistic).

2. The nature of the losses associated with applying the model.

3. The distribution of the error terms in the model, that is, the e's.

4. Whether these error terms are independent or dependent.

The estimates we obtain will depend upon our choice of fitting function. Our choice should not be dictated by the software but by the nature of the losses associated with applying the model. Our software may specify a least-squares fit (most commercially available statistical packages do), but our real concern may be with minimizing the sum of the absolute values of the prediction errors or the maximum loss to which one will be exposed.
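To make the point about loss criteria concrete, the following is a minimal sketch, not taken from the text, that scores two candidate lines under the three criteria just named. The data set, the two candidate lines, and all function names are illustrative assumptions; line_b happens to be the least-squares line for these five points.

```python
# Illustrative sketch: the data and both candidate lines are assumptions.

def residuals(xs, ys, intercept, slope):
    """Prediction errors y - (intercept + slope * x) for each observation."""
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

def sum_squared(res):
    """Least-squares criterion: sum of squared prediction errors."""
    return sum(r * r for r in res)

def sum_absolute(res):
    """LAD criterion: sum of absolute prediction errors."""
    return sum(abs(r) for r in res)

def max_absolute(res):
    """Minimax criterion: the largest single absolute prediction error."""
    return max(abs(r) for r in res)

# Four points lie exactly on y = x; the fifth is an outlier.
xs = [0, 1, 2, 3, 4]
ys = [0, 1, 2, 3, 14]

line_a = (0.0, 1.0)    # y = x, passes through the first four points
line_b = (-2.0, 3.0)   # y = -2 + 3x, the least-squares line for these data

res_a = residuals(xs, ys, *line_a)
res_b = residuals(xs, ys, *line_b)

for name, res in (("line_a", res_a), ("line_b", res_b)):
    print(name, sum_squared(res), sum_absolute(res), max_absolute(res))
```

On these data the squared-error and maximum-error criteria both favor line_b, while the absolute-error criterion favors line_a, so the choice of criterion changes which fit is judged best.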

Algorithms for least absolute deviation (LAD) regression are given in Barrodale and Roberts [1973]. The qreg command of Stata provides for LAD regression. The Blossom package, available as freeware from http://www.mesc.usgs.gov/blossom/blossom.html, includes procedures for LAD and quantile regression.
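For readers without access to these packages, a small LAD fit can be sketched directly. The code below is not the Barrodale and Roberts algorithm, nor what Stata or Blossom does internally. It is a brute-force illustration that relies on the fact that, for a straight-line fit, some line minimizing the sum of absolute residuals passes through at least two of the data points. The function name lad_fit and the sample data are assumptions, and the approach is only practical for small samples.

```python
from itertools import combinations

def lad_fit(xs, ys):
    """Return (intercept, slope) minimizing the sum of absolute residuals.

    Brute force: try the line through every pair of points with distinct
    x values and keep the one with the smallest total absolute error.
    """
    best = None
    for i, j in combinations(range(len(xs)), 2):
        if xs[i] == xs[j]:
            continue  # a vertical line is not a valid regression line
        slope = (ys[j] - ys[i]) / (xs[j] - xs[i])
        intercept = ys[i] - slope * xs[i]
        loss = sum(abs(y - (intercept + slope * x)) for x, y in zip(xs, ys))
        if best is None or loss < best[0]:
            best = (loss, intercept, slope)
    if best is None:
        raise ValueError("need at least two points with distinct x values")
    return best[1], best[2]

# Hypothetical data: four points on y = x plus one outlier.
xs = [0, 1, 2, 3, 4]
ys = [0, 1, 2, 3, 14]

lad_intercept, lad_slope = lad_fit(xs, ys)
print(lad_intercept, lad_slope)
```

The pairwise search costs on the order of n cubed operations, which is why production software uses linear-programming methods instead.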

In the univariate linear regression model, we assume that

$$y = E(Y \mid x) + e$$

where E(Y | x) denotes the mathematical expectation of Y given x and could be any deterministic function of x in which the parameters appear in linear form. The error term e stands for all the other unaccounted-for factors that make up the observed value y.

How accurate our estimates are and how consistent they will be from sample to sample will depend upon the nature of the error terms. If none of the many factors that contribute to the value of e make more than a small contribution to the total, then e will have an approximately Gaussian distribution. If the {e_i} are independent and normally distributed (Gaussian), then the ordinary least-squares estimates of the coefficients produced by most statistical software will be unbiased and have minimum variance.
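The unbiasedness claim can be checked informally by simulation. The sketch below assumes a straight-line model with made-up true coefficients, draws independent Gaussian errors, and averages the ordinary least-squares slope over many samples. All names and parameter values are illustrative assumptions. The simulation illustrates unbiasedness only and does not demonstrate the minimum-variance property.

```python
import random

def ols_fit(xs, ys):
    """Closed-form ordinary least-squares (intercept, slope) for one predictor."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def mean_ols_slope(true_intercept, true_slope, sigma, n_points, n_samples, seed):
    """Average the OLS slope over repeated samples with Gaussian errors."""
    rng = random.Random(seed)          # seeded so the run is repeatable
    xs = list(range(n_points))         # fixed design: x = 0, 1, ..., n_points - 1
    total = 0.0
    for _ in range(n_samples):
        ys = [true_intercept + true_slope * x + rng.gauss(0.0, sigma) for x in xs]
        total += ols_fit(xs, ys)[1]
    return total / n_samples

# Assumed true model: y = 1 + 2x + e, with e ~ N(0, 1).
average_slope = mean_ols_slope(1.0, 2.0, 1.0, n_points=30, n_samples=2000, seed=0)
print(average_slope)
```

Individual samples yield slopes scattered around the true value of 2, while their average over many samples lies very close to it, which is what unbiasedness means in practice.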
