In this case, the regression equation is:
y_i = 17.72 + 3.25 x_i + ε_i
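To make the equation concrete, here is a minimal sketch that plugs a few values of x into the fitted line. The names b0, b1, x_new, and y_hat are illustrative assumptions, and the coefficients are the rounded values from the equation, so the results are approximate. The error term is left out because a prediction uses only the fitted line.

```r
# Rounded coefficients taken from the regression equation above
b0 <- 17.72   # intercept
b1 <- 3.25    # slope

# Hypothetical x values chosen for illustration
x_new <- c(0, 2, 10)

# Predicted responses on the fitted line (error term omitted)
y_hat <- b0 + b1 * x_new
y_hat
```

Each unit increase in x raises the predicted y by the slope, 3.25.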
It is quite common for data to be captured inside a data frame, in which case you want to perform a regression between two data frame columns. Here, x and y are columns of a data frame dfrm:
> dfrm
            x         y
1  0.04781401  5.406651
2  1.90857986 19.941568
3  2.79987246 23.922613
4  4.46755305 32.432904
5  3.76490363 44.259268
6  5.92364632 61.151480
7  8.04611587 26.305505
8  7.11097986 43.606087
9  9.73645966 58.262112
10 9.19324543 57.631029
.
.
(etc.)
.
The lm function lets you specify a data frame by using the data parameter. If you do, the function will take the variables from the data frame and not from your workspace:
> lm(y ~ x, data=dfrm)          # Take x and y from dfrm

Call:
lm(formula = y ~ x, data = dfrm)

Coefficients:
(Intercept)            x
      17.72         3.25
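The same call can be tried on data you generate yourself. The sketch below is an assumption-laden illustration: the data frame name sim, the seed, the sample size, and the noise level are all made up, and the values 17.72 and 3.25 are reused only to generate the data, so the fitted coefficients will be close to them but not identical.

```r
# Simulated data for illustration only; this is not the dfrm shown above
set.seed(42)
sim <- data.frame(x = runif(100, min = 0, max = 10))
sim$y <- 17.72 + 3.25 * sim$x + rnorm(100, sd = 5)

# lm looks up x and y as columns of sim because of the data argument
m <- lm(y ~ x, data = sim)

# Extract the fitted intercept and slope as a named vector
coef(m)
```

Saving the result in a variable (here m) rather than printing it lets you pass the model to other functions such as coef or summary.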
1.21 Performing Multiple Linear Regression
Problem
You have several predictor variables (e.g., u, v, and w) and a response variable (y). You believe there is a linear relationship between the predictors and the response, and you want to perform a linear regression on the data.
Solution
Use the lm function. Specify the multiple predictors on the right-hand side of the formula, separated by plus signs (+):
> lm(y ~ u + v + w)
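A minimal, self-contained sketch of this call on simulated data follows. The seed, the sample size n, and the true coefficients (1, 2, -3, 0.5) are assumptions chosen for illustration; with real data you would supply your own u, v, w, and y.

```r
# Simulated predictors and response; all values here are hypothetical
set.seed(1)
n <- 200
u <- rnorm(n)
v <- rnorm(n)
w <- rnorm(n)
y <- 1 + 2 * u - 3 * v + 0.5 * w + rnorm(n)

# Fit the multiple linear regression with three predictors
m <- lm(y ~ u + v + w)

# One coefficient per predictor, plus the intercept
coef(m)
```

The estimates should land near the values used to generate y, which is a quick way to check that the formula is specified as intended.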