In Figure 9-10, we can see that each person has been given a prediction of 'No' (they won't suffer
a second heart attack), or 'Yes' (they will). It is critically important to remember at this point of
our evaluation that if this were real, and not a textbook example, these would be real people, with
names, families and lives. Yes, we are using data to evaluate their health, but we shouldn't treat
these people like numbers. Hopefully our work and analysis will help our imaginary client Sonia in
her efforts to serve these people better. When data mining, we should always keep the human
element in mind, and we'll talk more about this in Chapter 14.
So we have these predictions that some people in our scoring data set are on the path to a second
heart attack and others are not, but how confident are we in these predictions? The
confidence(Yes) and confidence(No) attributes can help us answer that question. To start, let's
just consider the person represented in Row 1. This is a single (never married) 61-year-old
man. He has been classified as overweight, but has lower than average cholesterol (the mean
shown in our metadata in Figure 9-9 is just over 178). He scored right in the middle on our trait
anxiety test at 50, and has attended a stress management class. With these personal attributes,
compared with those in our training data, our model offers us an 86.1% level of confidence that
the 'No' prediction is correct. This leaves us with 13.9% worth of doubt, which is the confidence(Yes) value. The
confidence(No) and confidence(Yes) values will always sum to 1, or in other words, 100%. For each person in the data
set, their attributes are fed into the logistic regression model, and a prediction with confidence
percentages is calculated.
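
As a rough illustration of how the two confidence values relate, here is a minimal Python sketch. It assumes the standard logistic function and a 0.5 cutoff for the 'Yes' label; the function names and the linear score are hypothetical, and the score is simply back-calculated so that the result matches the 86.1% shown for Row 1. It does not use the coefficients of the model built in this chapter.

```python
import math

def confidences(linear_score):
    """Return (confidence_yes, confidence_no) for a logistic model's linear score."""
    confidence_yes = 1.0 / (1.0 + math.exp(-linear_score))
    confidence_no = 1.0 - confidence_yes  # the two values always sum to 1
    return confidence_yes, confidence_no

def predict(linear_score, threshold=0.5):
    """Label the observation 'Yes' when confidence(Yes) reaches the threshold (assumed 0.5)."""
    confidence_yes, confidence_no = confidences(linear_score)
    label = "Yes" if confidence_yes >= threshold else "No"
    return label, confidence_yes, confidence_no

# Hypothetical score, chosen only so that confidence(No) comes out near 0.861.
score_row_1 = math.log(0.139 / 0.861)
label, conf_yes, conf_no = predict(score_row_1)
print(label, round(conf_yes, 3), round(conf_no, 3))
```

Whatever the score, confidence(No) is computed as one minus confidence(Yes), which is why the two columns in the results always total 100%.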
Let's consider one other person as an example in Figure 9-10. Look at Row 11. This is a 66-year-old
man who's been divorced. He's above the average values in every attribute. While he's not as
old as some in our data set, he is getting older, and he's obese. His cholesterol is among the
highest in our data set, he scored higher than average on the trait anxiety test and hasn't been to a
stress management class. We're predicting, with 99.2% confidence, that this man will suffer a
second heart attack. The warning signs are all there, and Sonia can now see them fairly easily.
With an understanding of how to read the output, Sonia can now proceed to…