ORGANIZATIONAL UNDERSTANDING
Sonia wants to expand her data mining activities to determine what kinds of programs she
should develop to help victims of heart attacks avoid suffering a recurrence. She knows that
several risk factors such as weight, high cholesterol and stress contribute to heart attacks,
particularly in those who have already suffered one. She also knows that the cost of providing
programs developed to help mitigate these risks is a fraction of the cost of providing medical care
for a patient who has suffered multiple heart attacks. Getting her employer on board with funding
the programs is the easy part. Figuring out which patients will benefit from which programs is
trickier. She is looking to us to provide some guidance, based on data mining, to figure out which
patients are good candidates for which programs. Sonia's bottom line is that she wants to know
whether a second heart attack is likely to happen for a given patient and, if so, how likely it is.
Logistic regression is an excellent tool for predicting the likelihood of something happening or
not.
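To make the idea of "likelihood" concrete, here is a minimal sketch in Python of how a logistic regression model turns risk factors into a probability between 0 and 1. The intercept and weights below are invented purely for illustration (a real model would learn them from training data), and the function and variable names are our own assumptions, not part of any particular tool. We use two of the risk factors mentioned above, weight and cholesterol, as hypothetical inputs.

```python
import math

def logistic(z):
    """Map a linear score z to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical intercept and weights, invented for illustration only.
# A fitted model would estimate these from the training data.
INTERCEPT = -4.0
WEIGHTS = {"weight_category": 0.8, "cholesterol": 0.015}

def second_attack_probability(person):
    """Combine the risk factors into a linear score, then squash it
    through the logistic function to get a probability."""
    z = INTERCEPT + sum(WEIGHTS[name] * person[name] for name in WEIGHTS)
    return logistic(z)

# Two made-up people: one with lower risk factors, one with higher.
lower_risk = {"weight_category": 0, "cholesterol": 160}
higher_risk = {"weight_category": 2, "cholesterol": 240}

print(round(second_attack_probability(lower_risk), 3))
print(round(second_attack_probability(higher_risk), 3))
```

With these made-up weights, the second person receives the higher probability. The point to notice is that the output is a graded likelihood rather than a bare yes or no, which is exactly what Sonia is asking for.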
DATA UNDERSTANDING
Sonia has access to the company's medical claims database. With this access, she is able to
generate two data sets for us. The first is a list of people who have suffered heart attacks, with an
attribute indicating whether or not they have had more than one; the second is a list of those
who have had a first heart attack, but not a second. The former data set, comprising 138
observations, will serve as our training data, while the latter, comprising 690 people's records, will be
for scoring. Sonia's hope is to help this latter group of people avoid becoming second heart attack
victims. In compiling the two data sets we have defined the following attributes:
Age: The person's age in years, rounded to the nearest whole year.
Marital_Status: The person's current marital status, indicated by a coded number: 0 for single,
never married; 1 for married; 2 for divorced; 3 for widowed.
Gender: The person's gender: 0 for female; 1 for male.
Weight_Category: The person's weight categorized into one of three levels: 0 for normal
weight range; 1 for overweight; and 2 for obese.
Cholesterol: The person's cholesterol level, as recorded at the time of their treatment for
their most recent heart attack (their only heart attack, in the case of those individuals in the
scoring data set).
 
 