parameters, such as an experiment investigating the variance (σ²) of two populations, the proportion (π), and the correlation (ρ) between two variables. For example, the correlation between project size and design effort on the job would test the null hypothesis that the population correlation (ρ) is 0. Symbolically, H₀: ρ = 0.
Sometimes the design team must compare more than two alternatives for a system design or an improvement plan with respect to a given performance measure. Most practical studies tackle this challenge by conducting multiple paired comparisons using several paired-t confidence intervals, as discussed. Bonferroni's approach is another statistical approach for comparing more than two alternative software packages on some performance metric or functional requirement. This approach is also based on computing confidence intervals to determine whether the true mean performance of a functional requirement of one system (µᵢ) is significantly different from the true mean performance of another system (µᵢ′) in the same requirement. ANOVA is another advanced statistical method that is often used for comparing multiple alternative software systems. ANOVA's multiple comparison tests are widely used in experimental designs.
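As an illustration of both approaches, the sketch below compares three hypothetical software packages on one performance measure, first with paired t-tests at a Bonferroni-adjusted significance level and then with a one-way ANOVA; the sample values, the package labels A, B, and C, and the use of scipy are assumptions made only for this example.

from itertools import combinations
from scipy import stats

# Hypothetical response times (ms) for three candidate packages,
# paired by test scenario.
systems = {
    "A": [12.1, 11.8, 13.0, 12.4, 11.9, 12.7],
    "B": [12.9, 12.5, 13.6, 13.1, 12.8, 13.3],
    "C": [12.0, 11.7, 12.9, 12.5, 11.8, 12.6],
}

alpha = 0.05
pairs = list(combinations(systems, 2))

# Bonferroni: divide the family-wise alpha across all paired comparisons,
# then run a paired t-test for each pair of alternatives.
alpha_per_test = alpha / len(pairs)
for i, j in pairs:
    t, p = stats.ttest_rel(systems[i], systems[j])
    verdict = "differ" if p <= alpha_per_test else "no significant difference"
    print(f"{i} vs {j}: t = {t:.2f}, p = {p:.4f} -> {verdict}")

# One-way ANOVA asks a single question: are all true mean performances equal?
f, p = stats.f_oneway(*systems.values())
print(f"ANOVA: F = {f:.2f}, p = {p:.4f}")

Dividing α by the number of comparisons keeps the overall chance of a false rejection at or below α across the whole family of paired tests, which is the point of Bonferroni's adjustment.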
To draw the inference that the hypothesized value of the parameter is not the true
value, a significance test is performed to determine whether an observed value of
a statistic is sufficiently different from a hypothesized value of a parameter (null
hypothesis). The significance test consists of calculating the probability of obtaining
a sample statistic that differs from the null hypothesis value (given that the null
hypothesis is correct). This probability is referred to as a p value. If this probability
is sufficiently low, then the difference between the parameter and the statistic is
considered to be “statistically significant.” The probability of a Type I error (α) is called the significance level and is set by the experimenter. The significance level (α) is commonly set to 0.05 or 0.01. The significance level is used in hypothesis testing to:
- Determine the difference between the results of the statistical experiment and the null hypothesis.
- Assume that the null hypothesis is true.
- Compute the probability (p value) of the difference between the statistic of the experimental results and the null hypothesis.
- Compare the p value with the significance level (α). If the probability is less than or equal to the significance level, then the null hypothesis is rejected and the outcome is said to be statistically significant. These steps are sketched in code after this list.
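The following is a minimal sketch of the four steps above, assuming a two-sided one-sample t-test; the sample data, the null value mu0, and the helper name significance_test are hypothetical, introduced only for illustration.

from scipy import stats

def significance_test(sample, mu0, alpha=0.05):
    # Steps 1-2: measure how far the sample statistic falls from the
    # null value, under the assumption that H0 is true.
    # Step 3: the t-test returns the p value of that difference under H0.
    t, p = stats.ttest_1samp(sample, popmean=mu0)
    # Step 4: compare the p value with the significance level alpha.
    return t, p, p <= alpha

t, p, reject = significance_test([5.1, 4.8, 5.6, 5.3, 4.9, 5.4], mu0=5.0)
print(f"t = {t:.2f}, p = {p:.4f}, reject H0: {reject}")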
The lower the significance level, the more the data must diverge from the null hypothesis to be significant. The 0.01 significance level is therefore more conservative than the 0.05 level because it requires stronger evidence to reject the null hypothesis; for example, an observed p value of 0.03 is significant at the 0.05 level but not at the 0.01 level.
Two kinds of errors can be made in significance testing: a Type I error (α), where a true null hypothesis is rejected incorrectly, and a Type II error (β), where a false null hypothesis is accepted incorrectly. A Type II error is only an error in the sense that an opportunity to reject the null hypothesis correctly was lost.
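Both error rates can be illustrated with a small simulation; the normal samples, the true effect size of 0.5, and the trial count below are assumptions chosen only for this sketch.

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha, n, trials = 0.05, 20, 5000

# Type I error: H0 (mean = 0) is true, yet the test rejects it.
type1 = sum(
    stats.ttest_1samp(rng.normal(0.0, 1.0, n), 0.0).pvalue <= alpha
    for _ in range(trials)
) / trials

# Type II error: H0 is false (true mean = 0.5), yet the test fails to reject.
type2 = sum(
    stats.ttest_1samp(rng.normal(0.5, 1.0, n), 0.0).pvalue > alpha
    for _ in range(trials)
) / trials

print(f"Type I rate  ~ {type1:.3f} (close to alpha = {alpha})")
print(f"Type II rate ~ {type2:.3f} (beta); power ~ {1 - type2:.3f}")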