1.23 Diagnosing a Linear Regression
Problem
You have performed a linear regression. Now you want to verify the model's quality
by running diagnostic checks.
Solution
Start by plotting the model object, which will produce several diagnostic plots:
> m <- lm(y ~ x)
> plot(m)
Next, identify possible outliers either by looking at the diagnostic plot of the residuals
or by using the outlierTest function of the car package (named outlier.test in older
versions of car):
> library(car)
> outlierTest(m)
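If the car package is not at hand, the same idea can be approximated in base R. The following is only a sketch, assuming the test in question is a Bonferroni-adjusted t-test on the largest studentized residual; the data are simulated, and the planted outlier at observation 10 and the names rs, p_bonf, and worst are illustrative, not part of the recipe.

```r
# Simulated data for illustration only (not from the recipe)
set.seed(42)
x <- rnorm(50)
y <- 2 + 3 * x + rnorm(50)
y[10] <- y[10] + 8                 # plant one gross outlier (assumption for the demo)
m <- lm(y ~ x)

rs <- rstudent(m)                  # externally studentized residuals
df <- df.residual(m) - 1           # one fewer df: each point is left out in turn
p_raw <- 2 * pt(abs(rs), df, lower.tail = FALSE)   # two-sided p-value per point
p_bonf <- pmin(1, p_raw * length(rs))              # Bonferroni adjustment

worst <- which.max(abs(rs))        # index of the most extreme observation
c(observation = unname(worst),
  rstudent = unname(rs[worst]),
  bonferroni_p = unname(p_bonf[worst]))
```

A small adjusted p-value for the most extreme observation suggests that it deserves a closer look; it does not by itself justify deleting the point.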
Finally, identify any overly influential observations (by using the influence.measures
function, for example).
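A minimal sketch of that last step, using simulated data (the data and the names infl and flagged are assumptions for illustration):

```r
# Simulated data for illustration only (not from the recipe)
set.seed(42)
x <- rnorm(50)
y <- 2 + 3 * x + rnorm(50)
m <- lm(y ~ x)

infl <- influence.measures(m)

# infl$is.inf is a logical matrix: one row per observation,
# one column per influence measure (DFBETAS, DFFIT, Cook's distance, ...)
flagged <- which(apply(infl$is.inf, 1, any))
flagged

# summary() prints only the observations flagged by at least one measure
summary(infl)
```

An observation flagged here is a candidate for inspection rather than an automatic deletion; the thresholds behind the flags are rules of thumb.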
Discussion
R fosters the impression that linear regression is easy: just use the lm function. Yet fitting
the data is only the beginning. It's your job to decide whether the fitted model actually
works and works well.
Before anything else, you must have a statistically significant model. Check the F statistic
from the model summary (Recipe 1.22) and be sure that the p-value is small
enough for your purposes. Conventionally, it should be less than 0.05; otherwise, you have
little evidence that the model explains the data better than the mean of y alone.
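The p-value of the F statistic is printed by summary but is not stored as a single element of the summary object. One way to recover it, sketched here on simulated data (the data and the names fstat and p_value are illustrative), is to pass the stored statistic and its degrees of freedom to pf:

```r
# Simulated data for illustration only (not from the recipe)
set.seed(42)
x <- rnorm(50)
y <- 2 + 3 * x + rnorm(50)
m <- lm(y ~ x)

# fstatistic is a named vector: value, numdf, dendf
fstat <- summary(m)$fstatistic

# Upper-tail probability of the F distribution gives the model p-value
p_value <- unname(pf(fstat["value"], fstat["numdf"], fstat["dendf"],
                     lower.tail = FALSE))
p_value < 0.05
```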
Simply plotting the model object produces several useful diagnostic plots:
> m <- lm(y ~ x)
> plot(m)
Figure 1-7 shows diagnostic plots for a pretty good regression:
• The points in the Residuals vs Fitted plot are randomly scattered with no particular
pattern.
• The points in the Normal Q-Q plot are more or less on the line, indicating that
the residuals are approximately normally distributed.
• In both the Scale-Location plot and the Residuals vs Leverage plot, the points are
in a group with none too far from the center.
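To view all four diagnostic plots on one page, the graphics device can be split into a 2 x 2 grid before plotting. This is a sketch on simulated data; the data and the name old_par are assumptions for illustration.

```r
# Simulated data for illustration only (not from the recipe)
set.seed(42)
x <- rnorm(50)
y <- 2 + 3 * x + rnorm(50)
m <- lm(y ~ x)

old_par <- par(mfrow = c(2, 2))   # 2 x 2 grid; keep the previous settings
plot(m)                           # Residuals vs Fitted, Normal Q-Q,
                                  # Scale-Location, Residuals vs Leverage
par(old_par)                      # restore the previous layout
```

Restoring the saved settings afterward keeps later plots from being drawn into the same grid.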
 