Chapter 2
Device Driver Reliability
Michael M. Swift
University of Wisconsin—Madison, USA
Abstract
Despite decades of research in extensible operating system technology, extensions such as device drivers
remain a significant cause of system failures. In Windows XP, for example, drivers account for 85% of
recently reported failures. This chapter presents Nooks, a layered architecture for tolerating the failure
of drivers within existing operating system kernels. The design consists of techniques for isolating drivers
from the kernel and for recovering from their failure. Nooks isolates drivers from the kernel in a lightweight
kernel protection domain, a new protection mechanism. Because drivers execute within a domain,
the kernel is protected from their failures and cannot be corrupted. Shadow drivers recover from device
driver failures. Based on a replica of the driver's state machine, a shadow driver conceals the driver's
failure from applications and restores the driver's internal state to a point where it can process requests
as if it had never failed. Thus, the entire failure and recovery is transparent to applications.
Introduction
Improving reliability is one of the greatest challenges for commodity operating systems, such as Windows
and Linux. System failures are commonplace and costly across all domains: in the home, in the server
room, and in embedded systems, where the existence of the OS itself is invisible. At the low end, failures
lead to user frustration and lost sales. At the high end, an hour of downtime from a system failure can
lead to losses in the millions.
Computer system reliability remains a crucial but unsolved problem. This problem has been exacerbated
by the adoption of commodity operating systems, designed for best-effort operation, in environments
that require high availability. While the cost of high-performance computing continues to drop because
of commoditization, the cost of failures (e.g., downtime on a stock exchange or e-commerce server, or
the manpower required to service a help-