EXERCISE
For this chapter's exercise, you will compile your own data set based on people you know and the
cars they drive, and then create a linear discriminant analysis of your data in order to predict
categories for a scoring data set. Complete the following steps:
1) Open a new blank spreadsheet in OpenOffice Calc. At the bottom of the spreadsheet
there will be three default tabs labeled Sheet1, Sheet2, and Sheet3. Rename the first one
Training and the second one Scoring. You can rename the tabs by double-clicking on their
labels. You can delete or ignore the third default sheet.
2) On the Training sheet, starting in cell A1 and going across, create attribute labels for six
attributes: Age, Gender, Marital_Status, Employment, Housing, and Car_Type.
3) Copy each of these attribute names except Car_Type into the Scoring sheet.
4) On the Training sheet, enter values for each of these attributes for several people you
know who have a car. These could be family members, friends, neighbors, coworkers,
or fellow students. Try to enter at least 20 observations; 30 or more would be better.
Enter married couples as two separate observations, as long as each spouse has a
different vehicle. Use the following to guide your data entry:
a. For Age, you could enter the person's actual age in years, or you could group ages
into buckets. For example, you could enter 10 for people aged 10-19, 20 for people aged
20-29, etc.
b. For Gender, enter 0 for female and 1 for male.
c. For Marital_Status, use 0 for single, 1 for married, 2 for divorced, and 3 for
widowed.
d. For Employment, enter 0 for student, 1 for full-time, 2 for part-time, and 3 for
retired.
e. For Housing, use 0 for lives rent-free with someone else, 1 for rents housing, and 2
for owns housing.
f. For Car_Type, you can record data in a number of ways. This will be your label, or
the attribute you wish to predict. You could record each person's car by make (e.g.
 