Discussion
Multiple linear regression is the obvious generalization of simple linear regression. It
allows multiple predictor variables instead of one predictor variable and still uses OLS
to compute the coefficients of a linear equation. The three-variable regression just
shown corresponds to this linear model:
yᵢ = β₀ + β₁uᵢ + β₂vᵢ + β₃wᵢ + εᵢ
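Concretely, data consistent with this model could be simulated like so (a minimal sketch; the seed, sample size, and coefficient values are arbitrary choices for illustration, not taken from the original example):

> set.seed(17)                    # arbitrary seed, for reproducibility
> n <- 100
> u <- rnorm(n)
> v <- rnorm(n, mean=2)
> w <- rnorm(n, mean=4)
> y <- 1.4 + 1.0*u + 0.9*v + 0.7*w + rnorm(n, sd=0.5)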
R uses the lm function for both simple and multiple linear regression. You simply add
more variables to the right-hand side of the model formula. The output then shows the
coefficients of the fitted model:
> lm(y ~ u + v + w)
Call:
lm(formula = y ~ u + v + w)
Coefficients:
(Intercept) u v w
1.4222 1.0359 0.9217 0.7261
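The printed call shows the coefficients, but to work with them programmatically, save the fitted model object and extract them with coef, which returns a named numeric vector:

> m <- lm(y ~ u + v + w)
> coef(m)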
The data parameter of lm is especially valuable when the number of variables increases,
since it's much easier to keep your data in one data frame than in separate variables.
Suppose your data is captured in a data frame, such as the dfrm variable shown here:
> dfrm
y u v w
1 6.584519 0.79939065 2.7971413 4.366557
2 6.425215 -2.31338537 2.7836201 4.515084
3 7.830578 1.71736899 2.7570401 3.865557
4 2.757777 1.27652888 0.4191765 2.547935
5 5.794566 0.39643488 2.3785468 3.265971
6 7.314611 1.82247760 1.8291302 4.518522
7 2.533638 -1.34186107 2.3472593 2.570884
8 8.696910 0.75946803 3.4028180 4.442560
9 6.304464 0.92000133 2.0654513 2.835248
10 8.095094 1.02341093 2.6729252 3.868573
.
. (etc.)
.
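If your variables currently live in separate vectors, a data frame such as dfrm can be assembled with data.frame (here reusing the simulated vectors from the sketch above):

> dfrm <- data.frame(y, u, v, w)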
When we supply dfrm to the data parameter of lm , R looks for the regression variables
in the columns of the data frame:
> lm(y ~ u + v + w, data=dfrm)
Call:
lm(formula = y ~ u + v + w, data = dfrm)
Coefficients:
(Intercept) u v w
1.4222 1.0359 0.9217 0.7261
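The default printout shows only the fitted coefficients. For standard errors, t statistics, p-values, and R², apply summary to the saved model object:

> m <- lm(y ~ u + v + w, data=dfrm)
> summary(m)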