We will illustrate our points using examples taken from a range of incidents and
accidents, large and small. In doing so, we hope to show how errors do not just
arise because of any inherent error-proneness or maliciousness on the part of users.
Instead, errors are usually the result of an interaction of several contributing
factors (personal, technological, and contextual). Once we accept this state of affairs
we can begin to move away from the need to find someone to blame, and start to
learn from erroneous performance as a way of improving future system performance.
10.1.1 What is Error?
Errors are generally regarded as precursors to accidents. The error triggers a set of
events—often referred to as a chain or sequence, although it is not always a linear
set of events—ultimately leading to an outcome that has serious consequences
involving significant loss of life, money, or machinery. Causal analyses of accidents
usually highlight the fact that there were many contributory factors. There
are obviously exceptions, where a single catastrophic failure leads directly to an
accident, but generally accidents involve a series of several individually minor
events. This process is sometimes described as a domino effect, or represented by
Reason's (1990) Swiss cheese model, in which there are holes in the various
layers of the system, and an accident only occurs when the holes line up across all
the layers.
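To make the alignment idea concrete, the following short Python sketch (an illustration added here, not part of Reason's account) represents each defensive layer as a set of weaknesses, or "holes", and reports a possible accident trajectory only when the same hole appears in every layer. The layer descriptions and hole names are invented for the example.

    # Illustrative sketch of the Swiss cheese idea: an accident trajectory exists
    # only when the same hole (weakness) is present in every defensive layer.
    # The layers and hole names below are hypothetical examples.

    def aligned_holes(layers):
        """Return the holes that line up across all defensive layers."""
        if not layers:
            return set()
        aligned = set(layers[0])
        for holes in layers[1:]:
            aligned &= set(holes)  # keep only holes present in every layer so far
        return aligned

    defences = [
        {"fatigue", "ambiguous display"},            # front-line operators
        {"ambiguous display", "missed inspection"},  # supervision and procedures
        {"ambiguous display"},                       # engineered safeguards
    ]

    print(aligned_holes(defences))  # {'ambiguous display'}: the holes line up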
A similar idea is encapsulated in Randell's (2000) fault-error-failure model that
comes from the field of dependability. A failure is defined as something that occurs
when the service that is delivered is judged to have deviated from its specification.
An error is taken to be the part of the system state that may lead to a subsequent
failure, and the adjudged cause of the error is defined as a fault.
It is very important to note that identifying whether something is a fault, error,
or failure involves making judgments. The fault-error-failure triples can link up so
that you effectively end up with a chain of triples. This is possible because a failure
at one level in the system may constitute a fault at another level. This does not
mean that errors inevitably lead to failures, however. The link between an error
and a failure can be broken either by chance or by taking appropriate design steps
to contain the errors and their effects.
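One way to picture this chaining, as a rough sketch rather than Randell's own formalism, is to record each triple as a small data structure and to treat the failure judged at one level as the fault at the level above. The levels, names, and values in this Python sketch are hypothetical.

    # Minimal sketch (not from Randell 2000) of fault-error-failure triples
    # chaining across system levels: a failure at one level is adjudged to be
    # a fault at the next level up. Levels and values here are hypothetical.
    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class Triple:
        level: str
        fault: str                      # adjudged cause of the error
        error: str                      # system state that may lead to a failure
        failure: Optional[str] = None   # deviation from the specified service,
                                        # or None if the error was contained

    def escalate(lower: Triple, upper_level: str, error: str,
                 failure: Optional[str] = None) -> Triple:
        """Treat the lower level's failure as a fault at the level above."""
        if lower.failure is None:
            raise ValueError("the error was contained, so nothing propagates")
        return Triple(level=upper_level, fault=lower.failure,
                      error=error, failure=failure)

    disk = Triple(level="storage", fault="worn bearing",
                  error="corrupted sector",
                  failure="read request returns wrong data")
    database = escalate(disk, "database", error="stale index entry",
                        failure="query returns an incorrect result")
    print(database.fault)  # the storage-level failure is the database-level fault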
Those errors that have immediate (or near-immediate) effects on system performance
are sometimes called active errors (Reason 1990). This is to distinguish
them from latent errors, which can lie dormant within a system for some
considerable time without having any adverse effect on system performance. The
commission that investigated the nuclear accident at Three Mile Island, for
example, found that an error that had occurred during maintenance (and hence was
latent in the system) led to the emergency feed water system being unavailable
(Kemeny (chairman) 1979). Similarly, the vulnerability of the O-ring seals on the
Challenger Space Shuttle was known about beforehand and hence latent in the system.