in terms of what is included and excluded. In SFMEA, for example, potential
failure modes may include no FR delivered, partial or degraded FR delivery
over time, intermittent FR delivery, and delivery of an unintended FR
(one not intended in the mapping).
2. Identify potential failure modes: Failure modes indicate the loss of at least one
software FR. The DFSS team should identify all potential failure modes by
asking “in what way does the software fail to deliver its FRs?” as identified in
the mapping. A potential failure mode can be a cause or an effect in a
higher-level subsystem, causing failure in its FRs. A failure mode may occur,
but it need not necessarily occur. Potential failure modes may be studied from the
baseline of past and current data, tests, and current baseline FMEAs.
For the software components, such information does not exist, and failure
modes are unknown (if a failure mode were known, it would be
corrected). Therefore, the definition of failure modes is one of the hardest
parts of the FMEA of a software-based system (Haapanen et al., 2000). The
analysts have to apply their own knowledge about the software and postulate
the relevant failure modes. Reifer (1979) suggested failure modes in major
categories such as computational, logic, data I/O, data handling, interface, data
definition, and database. Ristord and Esmenjaud (2001) proposed five
general-purpose failure modes at a processing unit level: 1) the operating system stops,
2) the program stops with a clear message, 3) the program stops without a clear
message, 4) the program runs, producing obviously wrong results, and 5) the
program runs, producing apparently correct but, in fact, wrong results. Lutz and
Woodhouse (1999) divide the failure modes into those concerning the data and
those concerning the processing of data. For each input and each output of the
software component, they consider four major classes of failure modes:
1) missing data (e.g., lost message or data loss resulting from hardware failure),
2) incorrect data (e.g., inaccurate or spurious data), 3) timing of data (e.g.,
obsolete data or data that arrives too soon for processing), and 4) extra data
(e.g., data redundancy or overflow). For each step in processing, they consider
the following four failure modes: 1) halt/abnormal termination (e.g., hung or
deadlocked at this point), 2) omitted event (e.g., event does not take place,
but execution continues), 3) incorrect logic (e.g., preconditions are
inaccurate; event does not implement intent), and 4) timing/order (e.g., event
occurs in wrong order; event occurs too early or too late). Becker and Flick
(1996) give the following classes of
failure modes: 1) hardware or software stop, 2) hardware or software crash,
3) hardware or software hang, 4) slow response, 5) startup failure, 6) faulty
message, 7) checkpoint file failure, 8) internal capacity exceeded, and 9) loss
of service. They also listed detection methods based on Haapanen et al.
(2002):
- A task heartbeat monitor is coordination software that detects a missed
  function task heartbeat.
- A message sequence manager checks the sequence numbers of messages to
  flag messages that are not in order.
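
The two detection methods above can be illustrated with a minimal Python sketch. It is not taken from Becker and Flick or Haapanen et al.; the class names, the timeout parameter, and the resynchronization rule for out-of-order messages are all assumptions made for illustration. Timestamps are passed in explicitly so the behavior is easy to follow; a real monitor would more likely read a monotonic clock.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class HeartbeatMonitor:
    """Hypothetical monitor: flags tasks that have been silent too long."""
    timeout_s: float  # assumed maximum allowed gap between heartbeats
    last_seen: Dict[str, float] = field(default_factory=dict)

    def beat(self, task: str, now: float) -> None:
        # Record the time at which the task last reported in.
        self.last_seen[task] = now

    def missed(self, now: float) -> List[str]:
        # A task has missed its heartbeat if its silence exceeds the timeout.
        return sorted(
            task for task, seen in self.last_seen.items()
            if now - seen > self.timeout_s
        )


@dataclass
class MessageSequenceManager:
    """Hypothetical checker: flags messages whose sequence number is unexpected."""
    expected: int = 0
    out_of_order: List[int] = field(default_factory=list)

    def receive(self, seq: int) -> bool:
        if seq == self.expected:
            # In order: advance to the next expected number.
            self.expected += 1
            return True
        # Gap, duplicate, or reordering: flag the message.
        self.out_of_order.append(seq)
        if seq > self.expected:
            # Assumed policy: resynchronize after a message that is ahead.
            self.expected = seq + 1
        return False


monitor = HeartbeatMonitor(timeout_s=2.0)
monitor.beat("logger", now=0.0)
monitor.beat("controller", now=0.0)
monitor.beat("logger", now=2.5)
late_tasks = monitor.missed(now=3.0)   # only "controller" has been silent > 2 s

manager = MessageSequenceManager()
for seq in [0, 1, 3, 2, 4]:
    manager.receive(seq)
flagged = manager.out_of_order         # 3 arrived early, 2 arrived late
```

In this sketch a missed heartbeat corresponds to the hang and stop classes listed above, and a flagged sequence number corresponds to the missing-data and timing/order failure modes; mapping each detector to a failure-mode class in this way is a design choice of the example, not a claim from the cited authors.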