4) On the Training sheet, enter values for each of these attributes for several adults you
know who are old enough to have graduated from college by now. These could
be family members, friends and neighbors, coworkers or fellow students, etc. Try to enter at
least 20 observations; 30 or more would be better. Enter husband and wife couples as two
separate observations. Use the following to guide your data entry:
a. For Parent_Grad, enter a 0 if neither of the person's parents graduated from college,
a 1 if one parent did, and a 2 if both parents did. If the person's parents went on to
earn graduate degrees, you could experiment with making this attribute even more
interesting by using it to hold the total number of college degrees earned by the person's
parents. For example, if the person represented in the observation had a mother
who earned a bachelor's, master's and doctorate, and a father who earned a
bachelor's and a master's, you could enter a 5 in this attribute for that person.
b. For Gender, enter 0 for female and 1 for male.
c. For Income_Level, enter a 0 if the person lives in a household with an income level
that you would consider to be below average, a 1 for average, and a 2 for
above average. You can estimate or generalize. Be sensitive to others when
gathering your data: don't snoop too much or risk offending your data subjects.
d. For Num_Siblings, enter the number of siblings the person has.
e. For Graduated, put 'Yes' if the person has graduated from college and 'No' if they
have not.
5) Once you've compiled your Training data set, switch to the Scoring sheet in OpenOffice
Calc. Repeat the data entry process for at least 20 (more is better) young people between
the ages of 0 and 18 that you know. You will use the training set to try to predict whether
or not these young people will graduate from college, and if so, how confident you are in
your prediction. Remember that this is your scoring data, so you won't provide the Graduated
attribute; you'll predict it shortly.
6) Use the File > Save As menu option in OpenOffice Calc to save your Training and Scoring
sheets as CSV files.
7) Import your two CSV files into your RapidMiner repository. Be sure to give them
descriptive names.
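Before importing, it can help to confirm that each CSV file follows the coding rules from step 4. The following is a minimal Python sketch and not part of the original exercise. The function names check_row and check_csv are hypothetical, and the sketch assumes that the first row of each CSV file holds the attribute names exactly as written in step 4 (Parent_Grad, Gender, Income_Level, Num_Siblings, and, for training data only, Graduated). Parent_Grad is allowed to exceed 2 so that the optional "total number of degrees" variant still passes.

```python
import csv
import io


def _is_count(value):
    """True if value is a string holding a non-negative whole number."""
    return isinstance(value, str) and value.isdigit()


def check_row(row, training=True):
    """Return a list of problems found in one observation (empty if none)."""
    problems = []
    if not _is_count(row.get("Parent_Grad")):
        problems.append("Parent_Grad must be a non-negative whole number")
    if row.get("Gender") not in ("0", "1"):
        problems.append("Gender must be 0 or 1")
    if row.get("Income_Level") not in ("0", "1", "2"):
        problems.append("Income_Level must be 0, 1, or 2")
    if not _is_count(row.get("Num_Siblings")):
        problems.append("Num_Siblings must be a non-negative whole number")
    if training:
        if row.get("Graduated") not in ("Yes", "No"):
            problems.append("Graduated must be 'Yes' or 'No'")
    elif "Graduated" in row:
        # Scoring data leaves Graduated out; it is what gets predicted.
        problems.append("Scoring rows should not include Graduated")
    return problems


def check_csv(text, training=True):
    """Check CSV text; return (row_count, {row_number: problems})."""
    reader = csv.DictReader(io.StringIO(text))
    bad = {}
    count = 0
    for count, row in enumerate(reader, start=1):
        problems = check_row(row, training)
        if problems:
            bad[count] = problems
    return count, bad


# Made-up sample rows for illustration only; the second row has an invalid Income_Level.
sample = (
    "Parent_Grad,Gender,Income_Level,Num_Siblings,Graduated\n"
    "2,0,1,3,Yes\n"
    "1,1,5,2,No\n"
)
row_count, bad_rows = check_csv(sample, training=True)
print(row_count, bad_rows)
```

To check a real file, read its contents into a string and pass it to check_csv, using training=False for the Scoring file. The returned row count can also be compared against the suggested minimum of 20 observations.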