Exercises
1. In the Income linear regression example, consider the distribution of the
outcome variable Income. Income values tend to be highly skewed to the
right (the distribution of values has a long tail to the right). Does such a
non-normally distributed outcome variable violate the general assumptions
of a linear regression model? Provide supporting arguments.
2. In the use of a categorical variable with n possible values, explain the
following:
a. Why only n - 1 binary variables are necessary
b. Why using n variables would be problematic
3. In the example of using Wyoming as the reference case, discuss the effect
on the estimated model parameters, including the intercept, if another state
were selected as the reference case.
4. Describe how logistic regression can be used as a classifier.
5. Discuss how the ROC curve can be used to determine an appropriate
threshold value for a classifier.
6. If the probability of an event occurring is 0.4, then
a. What is the odds ratio?
b. What is the log odds ratio?
7. If is an estimated coefficient in a linear regression model, what is
the effect on the odds ratio for every one unit increase in the value of
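
As an aid for exercises 6 and 7, the following minimal Python sketch shows
how these quantities are commonly defined. It assumes that the "odds ratio"
of an event with probability p means p / (1 - p), and that the coefficient
in exercise 7 is read as a logistic regression coefficient. The function
names and numeric values are hypothetical, chosen for illustration only;
they are not the exercise values.

```python
import math


def odds(p: float) -> float:
    """Odds of an event with probability p, defined as p / (1 - p)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1.0 - p)


def log_odds(p: float) -> float:
    """Natural logarithm of the odds (the logit of p)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must satisfy 0 < p < 1")
    return math.log(odds(p))


def odds_multiplier(coef: float) -> float:
    """Factor by which the odds change for a one-unit increase in the
    associated input variable, assuming a logistic regression model."""
    return math.exp(coef)


# Hypothetical inputs (assumptions for illustration, not from the exercises).
p = 0.75
print(odds(p))                        # 0.75 / 0.25
print(log_odds(p))                    # ln(odds)
print(odds_multiplier(math.log(2)))   # odds multiplied by about 2
```

Under these assumptions, a positive coefficient multiplies the odds by a
factor greater than 1, a negative coefficient by a factor between 0 and 1,
and a coefficient of zero leaves the odds unchanged.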