predictions have been generated. This will be demonstrated in an upcoming chapter's
example.
Chapter 14 of this book will spend some time talking about ethics in data mining. As previously
mentioned, Gill's use of these predictions is going to require some thought and discussion. Is it
ethical to push one of his young clients in the direction of one specific sport based on our model's
prediction that that activity is a good match for the boy? Simply because previous academy
attendees went on to specialize in one sport or another, can we assume that current clients would
follow the same path? The final chapter will offer some suggestions for ways to answer such
questions, but it is wise for us to at least consider them now in the context of the chapter
examples.
It is likely that Gill, being experienced at working with young athletes and recognizing their
strengths and weaknesses, will be able to use our predictions in an ethical way. Perhaps he can
begin by grouping his clients by their predicted Prime_Sports and administering more 'sport-
specific' drills—say, jumping tests for basketball, skating for hockey, throwing and catching for
baseball, etc. This may allow him to capture more specific data on each athlete, or even to simply
observe whether or not the predictions based on the data are in fact consistent with observable
performance on the field, court, or ice. This is an excellent example of why the CRISP-DM
approach is cyclical: the predictions we've generated for Gill are a starting point for a new round of
assessment and evaluation, not the ending or culminating point. Discriminant analysis has given
Gill some idea about where his young proteges may have strengths, and this can point him in
certain directions when working with each of them, but he will inevitably gather more data and
learn whether or not the use of this data mining methodology and approach is helpful in guiding
his clients to a sport in which they might choose to specialize as they mature.
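The follow-up step described above, grouping clients by predicted Prime_Sport and then checking the predictions against observed performance, could be sketched roughly as follows. This is only an illustration in Python rather than the tool used in the chapter; the athlete IDs, the predicted values, and the observed values are all made-up assumptions, as are the helper names group_by_prediction and agreement_rate.

```python
from collections import defaultdict

# Hypothetical scored records: (athlete_id, predicted Prime_Sport).
# These values are invented for illustration, not taken from Gill's data.
predictions = [
    (1, "Basketball"),
    (2, "Hockey"),
    (3, "Baseball"),
    (4, "Basketball"),
    (5, "Hockey"),
]

# Hypothetical follow-up observations: the sport in which each athlete
# looked strongest after the sport-specific drills.
observed = {
    1: "Basketball",
    2: "Hockey",
    3: "Basketball",
    4: "Basketball",
    5: "Baseball",
}


def group_by_prediction(preds):
    """Return a dict mapping each predicted sport to a list of athlete IDs."""
    groups = defaultdict(list)
    for athlete_id, sport in preds:
        groups[sport].append(athlete_id)
    return dict(groups)


def agreement_rate(preds, obs):
    """Fraction of observed athletes whose prediction matches the observation."""
    checked = [(a, s) for a, s in preds if a in obs]
    if not checked:
        return 0.0
    matches = sum(1 for a, s in checked if obs[a] == s)
    return matches / len(checked)


groups = group_by_prediction(predictions)
rate = agreement_rate(predictions, observed)

print(groups)
print(f"Agreement with observed performance: {rate:.0%}")
```

A low agreement rate would not by itself show that the model is wrong, but it would be a signal to revisit the data and the model in the next CRISP-DM cycle.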
CHAPTER SUMMARY
Discriminant analysis helps us to cross the threshold between Classification and Prediction in data
mining. Prior to Chapter 7, our data mining models and methodologies focused primarily on
categorization of data. With Discriminant Analysis, we can take a process that is very similar in
nature to k-means clustering, and with the right target attribute in a training data set, generate
 