It turns out that no matter how the ϵs are distributed, the least
squares estimates you already derived are the optimal
estimators for the βs because they have the property of being
unbiased and of being the minimum variance estimators. If you
want to know more about these properties and see a proof
for this, we refer you to any good book on statistical inference
(for example, Statistical Inference by Casella and Berger).
So what can you do with your observed data to estimate the variance
of the errors? Now that you have the estimated line, you can see how
far away the observed data points are from the line itself, and you can
treat these differences, also known as observed errors or residuals, as
observations themselves, or estimates of the actual errors, the ϵs.
Define

    e_i = y_i - \hat{y}_i = y_i - (\hat{\beta}_0 + \hat{\beta}_1 x_i)

for i = 1, . . . , n.
Then you estimate the variance (σ^2) of ϵ as:

    \hat{\sigma}^2 = \frac{\sum_i e_i^2}{n - 2}
Why are we dividing by n - 2? A natural question. Dividing
by n - 2, rather than just n, produces an unbiased estimator.
The 2 corresponds to the number of model parameters. Here
again, Casella and Berger's book is an excellent resource for
more background information.
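As a rough sketch of how this looks in R (the data here are simulated, and the names x, y, and fit are made up for illustration rather than taken from the text), you can pull the residuals out of a fitted lm object and divide their sum of squares by n - 2:

    # Simulated example data (hypothetical; not from the text).
    set.seed(42)
    x <- runif(100, 0, 10)
    y <- 3 + 2 * x + rnorm(100, sd = 1.5)

    fit <- lm(y ~ x)

    # Residuals e_i = y_i - yhat_i, taken straight from the fitted model.
    e <- residuals(fit)
    n <- length(e)

    # Estimate of sigma^2: sum of squared residuals divided by n - 2.
    sigma2_hat <- sum(e^2) / (n - 2)

    # For comparison, summary(fit)$sigma is the residual standard error,
    # i.e., the square root of this same quantity.
    sigma2_hat
    summary(fit)$sigma^2

The two printed numbers should agree up to rounding, which is a quick sanity check that the n - 2 divisor is exactly what R uses when it reports the residual standard error.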
This is called the mean squared error and captures how much the
predicted value varies from the observed. Mean squared error is a useful
quantity for any prediction problem. In regression in particular, it's
also an estimator for your variance, but it can't always be used or
interpreted that way. It appears in the evaluation metrics in the following
section.
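To make that distinction concrete, here is a minimal sketch, reusing the hypothetical fit from the previous example, of the two ways the squared residuals get used:

    # As a general prediction metric, MSE is typically the plain average
    # of squared errors, dividing by n:
    mse <- mean((y - predict(fit))^2)

    # As an estimator of the error variance sigma^2 in this regression,
    # divide by n - 2 instead, to account for the two estimated parameters:
    sigma2_hat <- sum(residuals(fit)^2) / (length(y) - 2)

    mse
    sigma2_hat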
Evaluation metrics
We asked earlier how confident you would be in these estimates and
in your model. You have a couple of values in the output of the R function
that help you get at the issue of how confident you can be in the
estimates: p-values and R-squared. Going back to our model in R, if we