 7   7  75  53  15  0
 8   8  76  56  13  0
 9   9  56  42  15  1
10  10  53  33  11  1
Each person in the sample has been assigned an identification number, ID.
Income is expressed in thousands of dollars. (For example, 113 denotes $113,000.)
As described earlier, Age and Education are expressed in years. For Gender, a 0
denotes female and a 1 denotes male. A summary of the imported data reveals that
the incomes vary from $14,000 to $134,000. The ages are between 18 and 70 years.
Years of education range from a minimum of 10 to a maximum of 20.
summary(income_input)
       ID             Income            Age          Education
 Min.   :   1.0   Min.   : 14.00   Min.   :18.00   Min.   :10.00
 1st Qu.: 375.8   1st Qu.: 62.00   1st Qu.:30.00   1st Qu.:12.00
 Median : 750.5   Median : 76.00   Median :44.00   Median :15.00
 Mean   : 750.5   Mean   : 75.99   Mean   :43.58   Mean   :14.68
 3rd Qu.:1125.2   3rd Qu.: 91.00   3rd Qu.:57.00   3rd Qu.:16.00
 Max.   :1500.0   Max.   :134.00   Max.   :70.00   Max.   :20.00
     Gender
 Min.   :0.00
 1st Qu.:0.00
 Median :0.00
 Mean   :0.49
 3rd Qu.:1.00
 Max.   :1.00
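The ranges quoted above can also be read off directly rather than from the full summary. The following is a minimal sketch, not the book's code: because the complete income_input data set is not reproduced here, it builds a small stand-in data frame (the hypothetical name income_sample) from the four rows listed at the top of this page, so its numbers describe only those four rows, not the full sample. It also illustrates why the mean of Gender is informative: for a 0/1 indicator the mean is the proportion of 1s, so the 0.49 in the summary means that about 49% of the sample is male.

```r
# Stand-in data: the four rows (IDs 7-10) listed above. This is NOT the full
# income_input data set; it only makes the sketch self-contained.
income_sample <- data.frame(
  ID        = c(7, 8, 9, 10),
  Income    = c(75, 76, 56, 53),   # thousands of dollars
  Age       = c(53, 56, 42, 33),   # years
  Education = c(15, 13, 15, 11),   # years
  Gender    = c(0, 0, 1, 1)        # 0 = female, 1 = male
)

# Minimum and maximum of each numeric variable, one column per variable.
ranges <- sapply(income_sample[c("Income", "Age", "Education")], range)
rownames(ranges) <- c("min", "max")
print(ranges)

# Gender is a 0/1 indicator, so its mean is the proportion of males.
prop_male <- mean(income_sample$Gender)
print(prop_male)

# For a 0/1 variable, counts per category are often easier to read than quartiles.
gender_counts <- table(factor(income_sample$Gender, levels = c(0, 1),
                              labels = c("female", "male")))
print(gender_counts)
```

Applied to the full data frame, the same calls would be expected to reproduce the minima and maxima reported by summary(income_input).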
As described in Chapter 3, a scatterplot matrix is an informative tool to view the
pairwise relationships of the variables. A basic assumption of a linear regression
model is that there is a linear relationship between the outcome variable and the
input variables. Using the lattice package in R, the scatterplot matrix in Figure
6.4 is generated with the following R code:
library(lattice)
splom(~income_input[c(2:5)], groups=NULL, data=income_input,