TABLE 6.6
The Two Types of Test Errors

                          True State of Null Hypothesis (H0)
Statistical Decision      H0 is true            H0 is false
Reject H0                 Type I error (α)      Correct
Accept H0                 Correct               Type II error (β)
sense that an opportunity to reject the null hypothesis correctly was lost. It is not an
error in the sense that an incorrect conclusion was drawn, because no conclusion is
drawn when the null hypothesis is accepted. Table 6.6 summarizes the two types of
test errors.
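As a minimal illustration, assuming arbitrary choices of sample size, effect size, and significance level, the following Python sketch estimates both error rates by simulating repeated one-sample t-tests:

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha, n, trials = 0.05, 30, 10_000

def reject_h0(true_mean):
    # Draw one sample and test H0: mean = 0; return True if H0 is rejected.
    sample = rng.normal(loc=true_mean, scale=1.0, size=n)
    _, p_value = stats.ttest_1samp(sample, popmean=0.0)
    return p_value < alpha

# H0 true (mean really is 0): any rejection is a Type I error.
type_1_rate = np.mean([reject_h0(0.0) for _ in range(trials)])
# H0 false (mean is 0.5): any acceptance is a Type II error.
type_2_rate = np.mean([not reject_h0(0.5) for _ in range(trials)])

print(f"Type I rate  ~ {type_1_rate:.3f} (close to alpha = {alpha})")
print(f"Type II rate ~ {type_2_rate:.3f} (this is beta; power = 1 - beta)")

With the null hypothesis true, the rejection rate settles near α; with it false, the acceptance rate estimates β.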
A Type I error generally is considered more serious than a Type II error because
it results in drawing the conclusion that the null hypothesis is false when, in fact, it
is true. The experimenter often makes a tradeoff between Type I and Type II errors.
A software DFSS team protects itself against Type I errors by choosing a stringent
significance level. This, however, increases the chance of a Type II error. Requiring
very strong evidence to reject the null hypothesis makes it very unlikely that a true
null hypothesis will be rejected. However, it increases the chance that a false null
hypothesis will be accepted, thus lowering the hypothesis test power. Test power is the
probability of correctly rejecting a false null hypothesis. Power is, therefore, defined
as 1 − β, where β is the Type II error probability. If the power of an experiment
is low, then there is a good chance that the experiment will be inconclusive. There
are several methods for estimating the test power of an experiment. To increase the
test power, for example, the experiment can be redesigned by changing one of the
factors that determine the power, such as the sample size, the standard deviation (σ),
or the size of the difference between the means of the tested software packages.
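Because power depends on exactly these factors, it can be approximated directly. The following Python sketch, assuming a two-sided, two-sample test of means under a normal approximation (the numeric settings are illustrative only), computes 1 − β from the mean difference, the standard deviation, and the per-group sample size:

from scipy.stats import norm

def approx_power(mean_diff, sigma, n_per_group, alpha=0.05):
    # Approximate power (1 - beta) of a two-sided, two-sample z-test of means.
    z_crit = norm.ppf(1 - alpha / 2)                         # rejection cutoff
    ncp = (mean_diff / sigma) * (n_per_group / 2) ** 0.5     # noncentrality
    beta = norm.cdf(z_crit - ncp) - norm.cdf(-z_crit - ncp)  # P(accept H0 | H0 false)
    return 1 - beta

# Larger samples, smaller sigma, or a bigger mean difference all raise power.
print(approx_power(mean_diff=0.5, sigma=1.0, n_per_group=20))   # ~ 0.35
print(approx_power(mean_diff=0.5, sigma=1.0, n_per_group=100))  # ~ 0.94

Under these assumptions, raising the per-group sample size from 20 to 100 lifts the approximate power from about 0.35 to about 0.94, illustrating the tradeoff discussed above.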
6.4.2 Experimental Design
In practical Six Sigma projects, experimental design usually is the main vehicle for
building the transfer function model. Transfer function models are fundamentally
built with extensive effort spent on data collection, verification, and validation to
provide a flexible platform for optimization and tradeoffs. Experimentation can be
done in both hardware and software environments.
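As a minimal sketch of what building such a model involves, the following Python example fits a first-order transfer function y ≈ b0 + b1*x1 + b2*x2 to hypothetical experimental data by least squares (the factors, data, and settings are invented for illustration):

import numpy as np

# Hypothetical DOE data: two factors (say, load and thread count) and a response.
x1 = np.array([1.0, 1.0, 2.0, 2.0, 3.0, 3.0])
x2 = np.array([10.0, 20.0, 10.0, 20.0, 10.0, 20.0])
y = np.array([5.1, 7.2, 6.0, 8.3, 7.1, 9.2])

# Fit the first-order transfer function y ~ b0 + b1*x1 + b2*x2 by least squares.
X = np.column_stack([np.ones_like(x1), x1, x2])
(b0, b1, b2), *_ = np.linalg.lstsq(X, y, rcond=None)
print(f"y ~ {b0:.2f} + {b1:.2f}*x1 + {b2:.2f}*x2")

# The fitted model becomes the platform for optimization and tradeoffs:
# predict the response at untested settings instead of running new experiments.
print("predicted y at x1=2.5, x2=15:", b0 + b1 * 2.5 + b2 * 15)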
Software experimental testing is any activity aimed at evaluating an attribute or
capability of a program or system and determining that it meets its required results.
The difficulty in software testing stems from the complexity of software. Software
experimental testing is more than just debugging. The purpose of testing can be
quality assurance, verification and validation, or reliability estimation; testing also
can serve as a generic quality metric. Correctness testing and reliability testing are two
major areas of testing. Software testing is a tradeoff among budget, time, and quality.
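As a minimal sketch of correctness testing, the following Python example checks a small, hypothetical function against its required results with unit tests:

import unittest

def percentile_rank(values, x):
    # Fraction of values less than or equal to x (the behavior under test).
    if not values:
        raise ValueError("values must be non-empty")
    return sum(v <= x for v in values) / len(values)

class TestPercentileRank(unittest.TestCase):
    # Correctness testing: compare actual behavior against required results.
    def test_known_results(self):
        self.assertEqual(percentile_rank([1, 2, 3, 4], 2), 0.5)
        self.assertEqual(percentile_rank([1, 2, 3, 4], 4), 1.0)

    def test_rejects_empty_input(self):
        with self.assertRaises(ValueError):
            percentile_rank([], 1)

if __name__ == "__main__":
    unittest.main()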
Experimenting in a software environment is a typical practice for estimating
performance under various running conditions, conducting "what-if" analysis, testing
hypotheses, comparing alternatives, running factorial designs, and performing
optimization. The results of
 