REVIEW QUESTIONS
1) What is the appropriate data type for independent variables (predictor attributes) in logistic
regression? What about for the dependent variable (target or label attribute)?
2) Compare the predictions for Rows 15 and 669 in the chapter's example model.
a. What is the single difference between these two people, and how does it affect their
predicted 2nd_Heart_Attack risk?
b. Locate other 67-year-old men in the results and compare them to the men on rows
15 and 669. How do they compare?
c. Can you spot areas where the men represented on rows 15 and 669 could improve
their chances of not suffering a second heart attack?
3) What is the difference between confidence(Yes) and confidence(No) in this chapter's
example?
4) How can you set an attribute's role to be 'label' in RapidMiner without using the Set Role
operator? What is one drawback to doing it that way?
EXERCISE
For this chapter's exercise, you will use logistic regression to try to predict whether or not young
people you know will eventually graduate from college. Complete the following steps:
1) Open a new blank spreadsheet in OpenOffice Calc. At the bottom of the spreadsheet
there will be three default tabs labeled Sheet1, Sheet2, Sheet3. Rename the first one
Training and the second one Scoring. You can rename the tabs by double-clicking on their
labels. You can delete or ignore the third default sheet.
2) On the Training sheet, starting in cell A1 and going across, create attribute labels for five
attributes: Parent_Grad, Gender, Income_Level, Num_Siblings, and Graduated.
3) Copy each of these attribute names except Graduated into the Scoring sheet.
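The sheet layout described in steps 2 and 3 can also be sketched in code, which may help when checking that the two header rows line up. The following is a minimal Python sketch, not part of the exercise itself: the attribute names come from step 2, while the constant names and the helper function header_csv are hypothetical and introduced only for illustration. It produces the header row for each sheet as CSV text and does not create an OpenOffice Calc file.

```python
import csv
import io

# Attribute names taken from step 2 of the exercise.
TRAINING_ATTRIBUTES = [
    "Parent_Grad",
    "Gender",
    "Income_Level",
    "Num_Siblings",
    "Graduated",
]

# Graduated is the attribute the model will try to predict.
LABEL = "Graduated"

# Step 3: the Scoring sheet gets every attribute except Graduated.
SCORING_ATTRIBUTES = [name for name in TRAINING_ATTRIBUTES if name != LABEL]


def header_csv(attributes):
    """Return a one-line CSV header row (hypothetical helper)."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerow(attributes)
    return buffer.getvalue()


if __name__ == "__main__":
    print("Training:", header_csv(TRAINING_ATTRIBUTES), end="")
    print("Scoring: ", header_csv(SCORING_ATTRIBUTES), end="")
```

Deriving the Scoring header from the Training header, rather than typing it twice, keeps the attribute names and their order identical on both sheets.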