Stress_Management : A binary attribute indicating whether or not the person has
previously attended a stress management course: 0 for no; 1 for yes.
Trait_Anxiety : A score on a scale of 0 to 100 measuring each person's natural level of
stress and ability to cope with stress. A short time after each person in each of the
two data sets had recovered from their first heart attack, they were administered a standard
test of natural anxiety. Their scores are recorded in this attribute in five-point
increments. A score of 0 would indicate that the person never feels anxiety, pressure
or stress in any situation, while a score of 100 would indicate that the person lives in a
constant state of being overwhelmed and unable to deal with his or her circumstances.
2nd_Heart_Attack : This attribute exists only in the training data set. It will be our label,
the prediction or target attribute. In the training data set, the attribute is set to 'yes' for
individuals who have suffered second heart attacks, and 'no' for those who have not.
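The attribute definitions above can be turned into a simple sanity check before modeling. The following is a minimal sketch in Python rather than RapidMiner; the function name validate_record and the row-as-dictionary representation are assumptions made for illustration, and the label comparison is case-insensitive because the exact capitalization used in the files is not stated here.

```python
def validate_record(record, is_training):
    """Return a list of problems found in one row.

    `record` is a dict mapping column name -> raw string value.
    (Hypothetical helper; not part of RapidMiner or the data sets.)
    """
    problems = []

    # Stress_Management: binary flag, 0 = no course attended, 1 = attended.
    if record.get("Stress_Management") not in ("0", "1"):
        problems.append("Stress_Management must be 0 or 1")

    # Trait_Anxiety: 0 to 100, recorded in five-point increments.
    try:
        anxiety = int(record.get("Trait_Anxiety"))
    except (TypeError, ValueError):
        problems.append("Trait_Anxiety must be an integer")
    else:
        if not 0 <= anxiety <= 100:
            problems.append("Trait_Anxiety must be between 0 and 100")
        elif anxiety % 5 != 0:
            problems.append("Trait_Anxiety must be a multiple of 5")

    # 2nd_Heart_Attack: the label, present only in the training data set.
    label = record.get("2nd_Heart_Attack")
    if is_training:
        if label is None or label.strip().lower() not in ("yes", "no"):
            problems.append("2nd_Heart_Attack must be yes or no")
    elif label is not None:
        problems.append("2nd_Heart_Attack should not appear in scoring data")

    return problems
```

An empty list means the row is consistent with the definitions; any other result names the attributes that need a closer look.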
DATA PREPARATION
Two data sets have been prepared and are available for you to download from the companion web
site. These are labeled Chapter09DataSet_Training.csv and Chapter09DataSet_Scoring.csv. If
you would like to follow along with this chapter's example, download these two data sets now, and
complete the following steps:
1) Begin by importing the training data set. For the most part, the process will be the
same as what you have done in past chapters, but for logistic regression, there are a few
subtle differences. Be sure to set the first row as the attribute names. On the fourth
step, when setting data types and attribute roles, you will need to make at least one
change. Be sure to set the 2nd_Heart_Attack data type to 'nominal', rather than binominal.
Even though it is a yes/no field, and RapidMiner will default it to binominal because of
that, the Logistic Regression operator we'll be using in our modeling phase expects the
label to be nominal. RapidMiner does not offer binominal-to-nominal or integer-to-nominal
operators, so we need to be sure to set this target attribute to the needed data type
of 'nominal' as we import it. This is shown in Figure 9-1:
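For readers who want to see the same idea outside the RapidMiner import wizard, here is a rough Python analogue of step 1, offered only as a sketch. It treats the first row as attribute names and keeps the label as a text category instead of converting it to a true/false value. The function load_training_rows and the two sample rows are made up for illustration, and only the three attributes described above are included; the real file may contain other columns.

```python
import csv
import io

# Made-up rows for illustration only; these are not values from the real file.
SAMPLE_CSV = """Stress_Management,Trait_Anxiety,2nd_Heart_Attack
0,70,yes
1,45,no
"""

LABEL = "2nd_Heart_Attack"


def load_training_rows(handle, label=LABEL):
    """Read CSV rows, using the first row as attribute names.

    The label column is kept as text (a nominal category); every other
    column is parsed as an integer. (Hypothetical helper for this sketch.)
    """
    reader = csv.DictReader(handle)  # header row supplies the attribute names
    rows = []
    for raw in reader:
        row = {}
        for name, value in raw.items():
            if name == label:
                row[name] = value.strip()  # keep 'yes'/'no' as a category
            else:
                row[name] = int(value)
        rows.append(row)
    return rows


rows = load_training_rows(io.StringIO(SAMPLE_CSV))
label_values = sorted({row[LABEL] for row in rows})  # distinct label categories
```

To try this on the downloaded training file, the io.StringIO object could be replaced with an open file handle; the integer parsing would need adjusting if any other column holds non-integer values.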
 