failures result in system outages. Note that for the remainder of this chapter,
the term “failure” will refer only to the failure of essential functionality, unless
otherwise stated.
There are three types of run-time defects/failures:
1. Defects/failures that are never executed (so they do not trigger faults)
2. Defects/failures that are executed and trigger faults that do NOT result in
failures
3. Defects/failures that are executed and trigger faults that result in failures
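The three categories can be illustrated with a small Python sketch. The functions `describe` and `percent_full`, their seeded defects, and the chosen inputs are all hypothetical and exist only for illustration. In this sketch, the same defect in `percent_full` falls into category 2 or category 3 depending on the input it is run with.

```python
def describe(n):
    """Label a number. Callers in this sketch only ever pass n >= 0."""
    if n < 0:
        # Category 1: this line is defective (str + int raises TypeError),
        # but it is never executed here, so it never triggers a fault.
        return "negative " + n
    return "non-negative"


def percent_full(used, capacity):
    """Return how full a container is, as a whole percentage from 0 to 100."""
    # Seeded defect: the divisor should be `capacity`, not `capacity - 1`.
    # Executing this line always produces an erroneous internal value (a fault).
    ratio = used / (capacity - 1)
    # The clamp to 100 can mask the fault before it reaches the caller.
    return min(100, int(ratio * 100))


# Category 1: the defective branch is not reached.
print(describe(3))            # non-negative

# Category 2: the defect executes and the ratio is wrong (about 1.11),
# but the clamp hides it, so the visible result is still correct.
print(percent_full(10, 10))   # 100

# Category 3: the defect executes and the wrong ratio reaches the caller;
# the specified answer would be 50, so this is a failure.
print(percent_full(5, 10))    # 55
```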
Typically, we focus solely on defects that have the potential to cause failures: we
detect and remove such defects during development, and we implement fault-tolerance
techniques either to prevent faults from producing failures or to mitigate the effects
of the failures that do occur. Software fault tolerance is the ability of software to
detect and recover from a fault that is happening or has already happened, in either
the software or the hardware of the system on which the software is running, so that
service continues to be provided in accordance with the specification. Software fault
tolerance is a necessary component in constructing the next generation of highly
available and reliable computing systems, from embedded systems to data warehouse
systems. It is not a solution unto itself, however; it is just one piece in the design
for reliability.
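As one illustration of the "detect and recover" idea, the Python sketch below uses a recovery-block pattern: a primary routine runs first, an acceptance test checks its result, and an alternate routine is tried if the check fails or an exception is raised. The text above does not name this technique, so it is an assumption chosen for illustration. The names `recovery_block`, `fast_sqrt`, `safe_sqrt`, and `close_enough`, and the defect seeded in `fast_sqrt`, are hypothetical.

```python
import math


def recovery_block(primary, alternates, acceptance_test, *args):
    """Return the first variant result that passes the acceptance test."""
    for variant in (primary, *alternates):
        try:
            result = variant(*args)
        except Exception:
            # Fault detected through an exception: try the next variant.
            continue
        if acceptance_test(result, *args):
            return result
        # Fault detected by the acceptance test: try the next variant.
    raise RuntimeError("all variants failed the acceptance test")


def fast_sqrt(x):
    # Hypothetical primary routine with a seeded defect for x > 100.
    return x / 2 if x > 100 else math.sqrt(x)


def safe_sqrt(x):
    # Hypothetical alternate routine, assumed to be simpler and trusted.
    return math.sqrt(x)


def close_enough(result, x):
    # Acceptance test: squaring the result should give back the input.
    return abs(result * result - x) <= 1e-6 * max(1.0, x)


print(recovery_block(fast_sqrt, [safe_sqrt], close_enough, 16.0))   # 4.0
print(recovery_block(fast_sqrt, [safe_sqrt], close_enough, 144.0))  # 12.0
```

In the second call the primary returns 72.0, the acceptance test rejects it, and the alternate supplies 12.0, so the fault does not become a failure. The sketch only protects against faults that the acceptance test can detect, which is consistent with fault tolerance being one piece of the design for reliability rather than a complete solution.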
Software reliability is an important attribute of software quality, alongside other
attributes such as functionality, usability, performance, serviceability, capability,
and maintainability. Software reliability becomes harder to achieve as complexity
increases, and any highly complex system will struggle to reach a given level of
reliability. Nevertheless, system developers tend to push complexity into the software
layer, driven by the rapid growth of system size and by the ease of doing so through
software upgrades. Although the complexity of software is inversely related to
software reliability, it is directly related to other important factors in software
quality, especially functionality and capability. Emphasizing these features tends
to add more complexity to software (Rook, 1990).
Across time, hardware exhibits the failure characteristics shown in Figure 14.1(a),
known as the bathtub curve (the name is derived from the cross-sectional shape of the
eponymous device; it does not hold water!). The three phases of a bathtub curve are the
infant mortality phase, the useful-life phase, and the end-of-life phase. A detailed
discussion of the curve can be found in Kapur & Lamberson (1977). Software reliability,
however, does not show the same characteristics. A possible curve, depicting software
reliability on the same axes, is shown in Figure 14.1(b). There are two major
differences between the hardware and software bathtub curves: 1) In the last phase,
software does not have an increasing failure rate as hardware does, because the
software is approaching obsolescence and there is usually no motivation for upgrades
or changes to it. As a result, the failure rate will not change; 2) In the useful-life phase,