the analysis of parts and components (e.g., objects and classes) in an effort to predict
and calculate the rate at which an item will fail. A reliability prediction is one of the
most common forms of reliability analyses for calculating failure rate and MTBF. If
a critical failure is identified, then a reliability block diagram analysis can be used to
see whether redundancy should be considered to mitigate the effect of a single-point
failure. A reliable design should anticipate all that can go wrong. We view DFR as a
means to maintain and sustain the Six Sigma capability across time.
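The link between failure rate, MTBF, and redundancy in a reliability block diagram can be illustrated with a short calculation. The following is only a minimal sketch: it assumes constant failure rates (the exponential model) and statistically independent blocks, which is a simplification for software, and all function names and numeric values are hypothetical.

```python
import math


def mtbf_from_failure_rate(failure_rate):
    """MTBF under a constant failure rate: MTBF = 1 / lambda."""
    if failure_rate <= 0:
        raise ValueError("failure rate must be positive")
    return 1.0 / failure_rate


def reliability(failure_rate, hours):
    """Probability of running `hours` without failure: R(t) = exp(-lambda * t)."""
    return math.exp(-failure_rate * hours)


def series(*blocks):
    """Series blocks: every block must work, so reliabilities multiply."""
    return math.prod(blocks)


def parallel(*blocks):
    """Redundant blocks: the group fails only if every block fails."""
    return 1.0 - math.prod(1.0 - r for r in blocks)


# Hypothetical failure rates (failures per hour), for illustration only.
rate_a = 1e-4
rate_b = 5e-4
mission_hours = 1000.0

r_a = reliability(rate_a, mission_hours)
r_b = reliability(rate_b, mission_hours)

# Block B is a single-point failure in the series path A -> B.
single_path = series(r_a, r_b)
# Duplicating B places two copies in parallel, still in series with A.
redundant_b = series(r_a, parallel(r_b, r_b))

print(f"MTBF of A: {mtbf_from_failure_rate(rate_a):.0f} h")
print(f"Series A-B reliability: {single_path:.4f}")
print(f"With B duplicated:      {redundant_b:.4f}")
```

Comparing the last two figures shows how much a redundant block recovers for the weakest element, which is the kind of question a reliability block diagram analysis is meant to answer.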
The software designs should evolve using a multitiered approach such as the following (see note 17):
System architecture (see note 18): Identify all essential system-level functionality that
requires software, and identify the role of software in detecting and handling hardware
failure modes by performing system-level failure mode analysis. These
can be obtained from quality function deployment (QFD) and axiomatic design
deployments.
High-level design: Identify modules based on their functional importance and
vulnerability to failures. Essential functionality is executed most frequently.
Critical functionality is executed infrequently but implements key system operations
(e.g., boot or restart, shutdown, and backup). Vulnerability points are
points that might flag defect clusters (e.g., synchronization points, hardware and
module interfaces, and initialization and restart code). Identify the visibility and
access of major data objects outside of each module.
Low-level design: Define the availability behavior of the modules (e.g., restarts,
retries, reboots, redundancy, etc.). Identify vulnerable sections of functionality
in detail.
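The availability behaviors named for low-level design (retries escalating to a restart) can be made concrete with a small recovery wrapper. This is a minimal sketch under stated assumptions, not a prescribed implementation: `TransientError`, `call_with_recovery`, and the `operation` and `restart` callables are hypothetical names introduced only for illustration.

```python
import time


class TransientError(Exception):
    """Hypothetical error type for failures that are worth retrying."""


def call_with_recovery(operation, restart, max_retries=3, delay_s=0.0):
    """Try `operation`; retry on TransientError, then restart and try once more.

    `operation` and `restart` are caller-supplied callables (hypothetical).
    A failure after the restart is propagated to the caller.
    """
    for _attempt in range(max_retries):
        try:
            return operation()
        except TransientError:
            time.sleep(delay_s)  # fixed pause after each failed attempt
    # Retries exhausted: escalate to the next recovery action.
    restart()
    return operation()


# Usage: a stand-in operation that fails twice and then succeeds.
calls = {"n": 0}


def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TransientError("temporary fault")
    return "ok"


restarts = []
result = call_with_recovery(flaky, restart=lambda: restarts.append(1))
print(result, calls["n"], len(restarts))
```

Keeping the recovery path this small matches the advice to focus on simple implementations and recovery actions: the retry bound and the single escalation step are easy to review against common failure categories.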
Functionality is targeted for fault-tolerance techniques. Focus on simple implementations
and recovery actions. For software DFSS belts, the highest return on
investment (ROI) for defect and failure detection and removal is low-level design. It
defines sufficient module logic and flow-control details to allow analysis based on
common failure categories and vulnerable portions of the design. Failure handling
behavior can be examined in sufficient detail. Low-level design bridges the gap between
traditional design specs and source code. Most design defects that previously were caught
during code reviews now will be caught in the low-level design review.
We are more likely to fix design defects correctly because they are caught in the
conceptualize phase. Most design defects found after this phase are not fixed properly
because the scheduling costs are too high. Fixing them requires returning to the
design phase to correct and review the design and then correcting, rereviewing, and
unit testing the code! Low-level design also can be reviewed for testability. The goal
17 See Silverman and De La Fuente, http://www.opsalacarte.com/pdfs/Tech Papers/Software Design for
Reliability - Paper.pdf.
18 See Chapters 4 and 13.