.
.
.
+ WestVirginia,
+ Wisconsin,
income_input)
The input file would have 49 indicator columns added for these variables, one for
each of the first 49 states. If a person was from Alabama, the Alabama variable would
be equal to 1, and the other 48 variables would be set to 0. The same coding would be
applied for the other state variables. A person from Wyoming, the one state not
explicitly included in the model, would be identified by setting all 49 state variables
equal to 0. In this representation, Wyoming is the reference case, and the regression
coefficient of each state variable represents the expected difference in income
between that state and Wyoming, with the other variables held fixed.
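The indicator coding described above can be sketched in R on a small scale. The data below are hypothetical (three states stand in for all 50, and the income values are invented for illustration); the point is only to show that the reference state gets no column of its own and that each coefficient is a difference from the reference.

```r
# Hypothetical toy data: three states stand in for the full set of 50.
state  <- factor(c("Alabama", "Alaska", "Wyoming",
                   "Alabama", "Wyoming", "Alaska"))
income <- c(40, 55, 50, 42, 52, 57)

# Make Wyoming the reference level so that it receives no indicator column.
state <- relevel(state, ref = "Wyoming")

# model.matrix() exposes the 0/1 indicator columns that lm() builds internally.
X <- model.matrix(~ state)
print(X)

# With state as the only predictor, the intercept is the mean income in
# Wyoming, and each state coefficient is that state's mean minus Wyoming's.
fit <- lm(income ~ state)
print(coef(fit))
```

Rows for Wyoming have 0 in every state column, which matches the all-zeros encoding described in the text.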
Confidence Intervals on the Parameters
Once an acceptable linear regression model is developed, it is often helpful to use
it to draw some inferences about the model and the population from which the
observations were drawn. Earlier, we saw that t-tests could be used to perform
hypothesis tests on the individual model parameters, β_j, j = 0, 1, …, p - 1.
Alternatively, these t-tests could be expressed in terms of confidence intervals
on the parameters. R simplifies the computation of confidence intervals on the
parameters with the use of the confint() function. From the Income example,
the following R command provides 95% confidence intervals on the intercept and
the coefficients for the two variables, Age and Education.
confint(results2, level = .95)
                2.5 %    97.5 %
(Intercept) 2.9777598 10.538690
Age         0.9556771  1.036392
Education   1.5313393  1.985862
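For a linear model, the limits reported by confint() are each estimate plus or minus a t critical value times its standard error. The sketch below checks this by hand. It assumes simulated data, because the income_input file is not available here; the variable names mirror the Income example, but the generating coefficients, noise level, sample size, and the object name fit are invented for illustration and are not the book's results2.

```r
# Simulated stand-in for the Income data (all values are hypothetical).
set.seed(1)
n         <- 200
Age       <- runif(n, 18, 70)
Education <- sample(10:20, n, replace = TRUE)
Income    <- 7 + 1.0 * Age + 1.8 * Education + rnorm(n, sd = 12)

fit <- lm(Income ~ Age + Education)

# Manual 95% interval: estimate +/- t critical value * standard error.
est   <- coef(fit)
se    <- sqrt(diag(vcov(fit)))
tcrit <- qt(0.975, df = df.residual(fit))
manual <- cbind(lower = est - tcrit * se, upper = est + tcrit * se)

print(manual)
print(confint(fit, level = 0.95))  # should agree with the manual limits
```

The residual degrees of freedom here are n minus the number of estimated parameters, so a smaller sample widens the interval through both the standard error and the t critical value.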
Based on the data, the earlier estimated value of the Education coefficient was
1.76. Using confint(), the corresponding 95% confidence interval is (1.53, 1.99),
which quantifies the uncertainty in the estimate. In other words, in
repeated random sampling, the computed confidence interval straddles the true