It is important to state the null hypothesis and alternative hypothesis, because
misstating them is likely to undermine the subsequent steps of the hypothesis
testing process. A hypothesis test leads to either rejecting the null hypothesis in
favor of the alternative or not rejecting the null hypothesis.
Table 3.5 includes some examples of null and alternative hypotheses that should
be tested during the analytic lifecycle.
Table 3.5 Example Null Hypotheses and Alternative Hypotheses

Application | Null Hypothesis | Alternative Hypothesis
----------- | --------------- | ----------------------
Accuracy Forecast | Model X does not predict better than the existing model. | Model X predicts better than the existing model.
Recommendation Engine | Algorithm Y does not produce better recommendations than the current algorithm being used. | Algorithm Y produces better recommendations than the current algorithm being used.
Regression Modeling | This variable does not affect the outcome because its coefficient is zero. | This variable affects the outcome because its coefficient is not zero.

Once a model is built over the training data, it needs to be evaluated over the
testing data to see if the proposed model predicts better than the existing model
currently being used. The null hypothesis is that the proposed model does not
predict better than the existing model. The alternative hypothesis is that the
proposed model indeed predicts better than the existing model. In an accuracy
forecast of sales, for example, the null model could be that next month's sales
are the same as the prior month's. The hypothesis test then evaluates whether the
proposed model provides a better prediction than that baseline. Consider a
recommendation engine as another example. The
null hypothesis could be that the new algorithm does not produce better
recommendations than the current algorithm being deployed. The alternative
hypothesis is that the new algorithm produces better recommendations than the
old algorithm.
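
The comparison of a proposed model against an existing one can be sketched in code. The example below is a minimal illustration, not a prescribed method: it assumes each model's absolute forecast errors have been recorded for the same test periods, and it uses an exact paired sign-flip (permutation) test, which is one of several reasonable choices. The function name and the error values are hypothetical and made up for illustration.

```python
from itertools import product


def paired_sign_flip_p_value(baseline_errors, proposed_errors):
    """One-sided exact sign-flip test on paired errors.

    H0: the proposed model does not predict better than the baseline.
    HA: the proposed model predicts better (its errors are smaller).
    """
    # A positive difference means the proposed model had the smaller error.
    diffs = [b - p for b, p in zip(baseline_errors, proposed_errors)]
    observed = sum(diffs)

    # Under H0 each paired difference is equally likely to carry either sign,
    # so enumerate all 2**n sign assignments and count those at least as
    # favorable to the proposed model as what was observed.
    count = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        if sum(s * d for s, d in zip(signs, diffs)) >= observed:
            count += 1
    return count / total


# Hypothetical absolute errors for eight test months (illustrative only).
baseline_errors = [12, 15, 9, 14, 11, 13, 10, 16]  # existing model
proposed_errors = [10, 11, 9, 12, 12, 10, 8, 13]   # proposed model

p_value = paired_sign_flip_p_value(baseline_errors, proposed_errors)
print(p_value)  # prints 0.015625
```

With a significance level of 0.05 chosen in advance, a p-value this small would lead to rejecting the null hypothesis in favor of the alternative that the proposed model predicts better. A larger p-value would mean the null hypothesis is not rejected.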
When evaluating a model, it is sometimes necessary to determine whether a given
input variable improves the model. In regression analysis (Chapter 6), for example,
this amounts to asking whether the regression coefficient for that variable is zero. The null
hypothesis is that the coefficient is zero, which means the variable does not have
an impact on the outcome. The alternative hypothesis is that the coefficient is
nonzero, which means the variable does have an impact on the outcome.
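
The coefficient hypotheses can also be illustrated in code. The sketch below assumes a simple linear regression with a single input variable and swaps in an exact permutation test for the slope. A t-test on the coefficient is the more common choice in regression software. Strictly speaking, the permutation version tests that the outcome is unrelated to the input, which implies a zero coefficient. The function names and data are hypothetical.

```python
from itertools import permutations


def ols_slope(x, y):
    """Least-squares slope of y on x for a single input variable."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


def slope_permutation_p_value(x, y):
    """Two-sided exact permutation test.

    H0: the coefficient is zero (the input has no impact on the outcome).
    HA: the coefficient is nonzero.
    """
    observed = abs(ols_slope(x, y))
    count = 0
    total = 0
    # If the input has no impact, every reordering of the outcomes is equally
    # likely, so compare the observed slope with the slope of each reordering.
    for shuffled in permutations(y):
        total += 1
        # Small tolerance guards against floating-point rounding on ties.
        if abs(ols_slope(x, shuffled)) >= observed - 1e-12:
            count += 1
    return count / total


# Hypothetical data: seven observations of one input and the outcome.
x = [1, 2, 3, 4, 5, 6, 7]
y = [2, 4, 5, 4, 6, 8, 9]

slope = ols_slope(x, y)
p_value = slope_permutation_p_value(x, y)
print(slope, p_value)
```

Enumerating every permutation is only practical for a handful of observations. For larger samples, a random subset of permutations or the usual t-test would be the practical route.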