3) Import both of your data sets into your RapidMiner repository. Be sure to give them
descriptive names. Drag and drop them into a new process, and rename them as Training
and Scoring so that you can tell them apart.
4) Use a Set Role operator to designate the Salary attribute as the label for the training data.
5) Add a Linear Regression operator to your training data stream, then use an Apply Model
operator to apply the resulting model to your scoring data set.
6) Run your model. In the Results perspective, examine your attribute coefficients and the
predictions for the athletes' salaries in your scoring data set.
7) Report your results:
a. Which attributes have the greatest weight?
b. Were any attributes dropped from the data set as non-predictors? If so, which ones
and why do you think they weren't effective predictors?
c. Look up the actual salaries of a few athletes in your scoring data set and compare
them to the predicted salaries. Are the predictions close? Why or why not?
d. What other attributes do you think would help your model better predict
professional athletes' salaries?
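If you would like to see what steps 4 through 6 do outside of RapidMiner, the following is a minimal Python sketch of the same idea: fit a linear regression on training data with a designated label, inspect the coefficients, and apply the model to scoring data. Everything in it is an assumption made for illustration. The attribute names (years_pro, points_per_game), the numbers, and the helper functions are made up and are not taken from your data sets, and the fit uses plain ordinary least squares rather than whatever RapidMiner's operator does internally.

```python
# Illustrative sketch only: attribute names and values are invented toy data,
# not the athlete data sets from the exercise.

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_linear_regression(rows, label):
    """Ordinary least squares via the normal equations.

    Returns [intercept, coefficient_1, coefficient_2, ...].
    """
    X = [[1.0] + list(r) for r in rows]  # prepend a constant column for the intercept
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * y for x, y in zip(X, label)) for i in range(k)]
    return solve(xtx, xty)

def predict(coefs, rows):
    """Apply the fitted model to new (scoring) rows."""
    return [coefs[0] + sum(c * v for c, v in zip(coefs[1:], r)) for r in rows]

# Training data: (years_pro, points_per_game), with salary as the label.
# The toy salaries follow salary = 100 + 50 * years_pro + 20 * points_per_game.
training_rows = [(1, 10), (2, 15), (3, 12), (5, 20), (8, 18), (10, 25)]
training_salary = [100 + 50 * y + 20 * p for y, p in training_rows]

# Scoring data: same attributes, no label.
scoring_rows = [(4, 16), (7, 22)]

coefs = fit_linear_regression(training_rows, training_salary)
predictions = predict(coefs, scoring_rows)

print("intercept and coefficients:", [round(c, 3) for c in coefs])
print("predicted salaries:", [round(p, 3) for p in predictions])
```

Because the toy salaries are an exact linear function of the two attributes, the fitted coefficients recover that function and the predictions match it; real salary data will not behave this neatly. The sketch also performs no attribute selection, so it never drops an attribute as a non-predictor, which is something to keep in mind when answering question 7b.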