SOFTWARE DESIGN FOR X - Software Design for Six-Sigma: A Roadmap for Excellence


failures result in system outages. Note that for the remainder of this chapter,

the term “failure” will refer only to the failure of essential functionality, unless

otherwise stated.

There are three types of run-time defects:

1. Defects that are never executed (so they do not trigger faults)

2. Defects that are executed and trigger faults that do NOT result in failures

3. Defects that are executed and trigger faults that result in failures
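The three cases above can be illustrated with a small sketch. Everything in it is hypothetical and chosen only for illustration: the `convert` function, the `CAL_DIVISOR` constant, and the assumption that returning an uncalibrated value still counts as delivering the essential functionality.

```python
# Hypothetical example: one defect, three possible run-time outcomes.

CAL_DIVISOR = 0  # the defect: this constant should be nonzero


def convert(raw, calibrate=False):
    """Convert a raw reading; the calibrated path contains the defect."""
    if calibrate:
        return raw / CAL_DIVISOR  # defective statement, raises ZeroDivisionError
    return raw * 0.5


def case_never_executed(raw):
    # Type 1: the defective statement is never reached, so no fault is triggered.
    return convert(raw)


def tolerant_convert(raw):
    # Type 2: the defect is executed and triggers a fault, but the fault is
    # detected and contained, so a usable result is still returned
    # (assumed acceptable here) and no failure occurs.
    try:
        return convert(raw, calibrate=True)
    except ZeroDivisionError:
        return convert(raw, calibrate=False)


def case_failure(raw):
    # Type 3: the defect is executed, the fault propagates to the caller,
    # and the function's service is lost (a failure).
    return convert(raw, calibrate=True)
```

Only the third case is visible to the user as a failure; the first two leave the defect latent in the code.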

Typically, we focus solely on defects that have the potential to cause failures: we detect and remove such defects during development, and we implement fault-tolerance techniques that either prevent faults from producing failures or mitigate the effects of the failures that do result. Software fault tolerance is the ability of software to detect and recover from a fault that is occurring, or has already occurred, in either the software or the hardware of the system on which it runs, so that service continues to be provided in accordance with the specification. It is a necessary component in constructing the next generation of highly available and reliable computing systems, from embedded systems to data warehouse systems. Software fault tolerance is not a solution unto itself, however; it is just one piece of design for reliability.
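As one illustration of detecting a fault and recovering from it, the sketch below follows the general shape of a recovery block: run a primary routine, check its result with an acceptance test, and fall back to an alternate routine if the check fails. The chapter text above does not prescribe this technique; it is used here only as an example, and all names (`recovery_block`, `primary_sort`, `alternate_sort`, `looks_sorted`) are hypothetical.

```python
# A minimal recovery-block sketch (illustrative assumption, not the book's method).


class AllVariantsFailed(Exception):
    """Raised when no variant produces a result that passes the acceptance test."""


def recovery_block(variants, acceptance_test, data):
    """Try each variant in order and return the first accepted result."""
    for variant in variants:
        try:
            # Each variant works on a fresh copy, so a faulty variant
            # cannot corrupt the input seen by the next one.
            result = variant(list(data))
        except Exception:
            continue  # fault detected through an exception; try the next variant
        if acceptance_test(data, result):
            return result  # fault-free (or recovered) result
    raise AllVariantsFailed("no variant passed the acceptance test")


def primary_sort(items):
    # Defect: converting to a set silently drops duplicate values.
    return sorted(set(items))


def alternate_sort(items):
    # Simpler alternate routine used when the primary result is rejected.
    return sorted(items)


def looks_sorted(original, result):
    # Acceptance test: same length as the input and in non-decreasing order.
    same_length = len(result) == len(original)
    ordered = all(a <= b for a, b in zip(result, result[1:]))
    return same_length and ordered
```

Calling `recovery_block([primary_sort, alternate_sort], looks_sorted, [3, 1, 3, 2])` rejects the primary result because a duplicate was lost, then returns the alternate's result. The acceptance test is deliberately cheap and partial. A stricter check would catch more faults at a higher run-time cost.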

Software reliability is an important attribute of software quality, alongside other attributes such as functionality, usability, performance, serviceability, capability, and maintainability. Reliability becomes harder to achieve as complexity increases, and any highly complex system will struggle to reach a given level of reliability. System developers nevertheless tend to push complexity into the software layer, driven by the rapid growth of system size and by the ease of adding capability through software upgrades. Although software complexity is inversely related to software reliability, it is directly related to other important quality factors, especially functionality and capability. Emphasizing these features tends to add more complexity to the software (Rook, 1990).

Over time, hardware exhibits the failure characteristics shown in Figure 14.1(a), known as the bathtub curve (footnote 7). The three phases of a bathtub curve are the infant mortality phase, the useful life phase, and the end-of-life phase. A detailed discussion of the curve can be found in Kapur & Lamberson (1977). Software reliability, however, does not show the same characteristics. If we depict software reliability on the same axes, a possible curve is shown in Figure 14.1(b). There are two major differences between the hardware and software bathtub curves: 1) In the last phase, software does not have an increasing failure rate as hardware does, because the software is approaching obsolescence and there is usually no motivation for any upgrades or changes to it. As a result, the failure rate will not change; 2) In the useful-life phase,

7 The name is derived from the cross-sectional shape of the eponymous device. It does not hold water!
