restarting the driver and replaying past requests
and hence, can only recover from failures that are
both transient and fail-stop. Deterministic failures
may recur when the driver recovers, again causing
a failure. Recoverable failures must be fail-stop,
because shadow drivers must detect a failure in
order to conceal it from the OS and applications.
Hence, shadow drivers require an isolation sub-
system to detect and stop failures before they are
visible to applications or the operating system.
Once the driver has restarted, the active-mode
shadow returns the driver to its pre-failure state.
For example, the shadow re-establishes any
configuration state and then replays pending re-
quests. Shadow drivers rely on the state machine
model of drivers. Whereas the default and restart
recovery managers seek to restore the driver to
its unloaded state or initialized state, shadow
drivers seek to restore drivers to their state at the
time of failure.
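The restore step can be pictured with a minimal sketch.
Everything named here (ToyDriver, ShadowLog, and their
methods) is hypothetical and only models the idea in
user-space Python; it is not the kernel implementation.
The shadow keeps the latest configuration and the list
of requests that have not completed, and after a restart
it re-applies the configuration before replaying the
pending requests.

```python
class ToyDriver:
    """Hypothetical driver: configuration plus a record of requests it received."""

    def __init__(self):
        self.config = {}
        self.sent = []

    def configure(self, key, value):
        self.config[key] = value

    def send(self, packet):
        self.sent.append(packet)


class ShadowLog:
    """Hypothetical record of what is needed to rebuild driver state."""

    def __init__(self):
        self.config = {}   # last value seen for each configuration setting
        self.pending = []  # requests submitted but not yet completed

    def note_configure(self, key, value):
        self.config[key] = value

    def note_request(self, packet):
        self.pending.append(packet)

    def note_completion(self, packet):
        self.pending.remove(packet)

    def recover(self, make_driver):
        driver = make_driver()                  # restart: fresh, initialized driver
        for key, value in self.config.items():  # re-establish configuration state
            driver.configure(key, value)
        for packet in self.pending:             # then replay pending requests
            driver.send(packet)
        return driver
```

Completed requests are dropped from the log, so only
work that was still outstanding at the time of failure
is replayed.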
A shadow driver is a class driver, aware of the
interface to the drivers it shadows but not of their
implementations. The class orientation has two
key implications. First, a single shadow driver
implementation can recover from a failure of any
driver in its class, meaning that a handful of dif-
ferent shadow drivers can serve a large number of
device drivers. As previously mentioned, Linux,
for example, has only 20 driver classes. Second,
implementing a shadow driver does not require
a detailed understanding of the internals of the
drivers it shadows. Rather, it requires only an
understanding of those drivers' interactions with
the kernel. Thus, shadow drivers can be implemented
by kernel developers with no knowledge of device
specifics and have no dependencies on individual
drivers. For example, if a new network interface
card and driver are inserted into a PC, the exist-
ing network shadow driver can shadow the new
driver without change. Similarly, drivers can be
patched or updated without requiring changes to
their shadows.
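The class orientation can be illustrated with a small
sketch. The interface and the two vendor drivers below
are invented for illustration; the point is only that
the shadow is written against the class interface and
never looks inside a driver.

```python
from abc import ABC, abstractmethod


class NetworkDriver(ABC):
    """Hypothetical class interface: the only thing the shadow knows about."""

    @abstractmethod
    def set_mac(self, mac): ...

    @abstractmethod
    def get_mac(self): ...


class VendorADriver(NetworkDriver):
    """One implementation: keeps the address in a plain attribute."""

    def __init__(self):
        self._mac = None

    def set_mac(self, mac):
        self._mac = mac

    def get_mac(self):
        return self._mac


class VendorBDriver(NetworkDriver):
    """A different implementation: keeps the address in a register table."""

    def __init__(self):
        self._regs = {}

    def set_mac(self, mac):
        self._regs["mac"] = mac

    def get_mac(self):
        return self._regs.get("mac")


class NetworkShadow:
    """Hypothetical shadow for the whole class; uses interface calls only."""

    def __init__(self, driver_type):
        self.driver_type = driver_type
        self.mac = None

    def observe_set_mac(self, mac):
        self.mac = mac  # remember the setting seen on the interface

    def restart(self):
        driver = self.driver_type()  # works for any NetworkDriver subclass
        if self.mac is not None:
            driver.set_mac(self.mac)
        return driver
```

The same NetworkShadow code handles both vendor drivers,
and adding a third driver would need no change to it.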
Shadow Driver Operation
Shadow drivers execute in one of two modes: passive
or active. Passive mode is used during normal
(non-faulting) operation, when the shadow driver
monitors all communication between the kernel
and the device driver it shadows. This monitoring
is achieved via replicated procedure calls, called
taps: a kernel call to a device driver function causes
an automatic, identical call to the corresponding
shadow driver function. Similarly, a driver call to
a kernel function causes an automatic, identical
call to a corresponding shadow driver function.
These passive-mode calls are transparent to the
device driver and the kernel and occur only to track
the state of the driver as necessary for recovery.
Based on the calls, the shadow tracks the state
transitions of the shadowed device driver.
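A replicated call of this kind can be sketched as a
wrapper function. The names tap, PassiveShadow, and
driver_open are hypothetical, and calling the shadow
after the real function is an assumption made for this
sketch.

```python
def tap(target, shadow_handler):
    """Wrap target so each call also produces an identical call to the shadow."""

    def stub(*args, **kwargs):
        result = target(*args, **kwargs)  # the real call proceeds unchanged
        shadow_handler(*args, **kwargs)   # replicated call, used only for tracking
        return result                     # the caller sees only the real result

    return stub


class PassiveShadow:
    """Hypothetical shadow that tracks open/closed state transitions."""

    def __init__(self):
        self.state = "closed"

    def open(self, device):
        self.state = "open"

    def close(self, device):
        self.state = "closed"


def driver_open(device):
    """Stand-in for a real driver entry point."""
    return 0


def driver_close(device):
    """Stand-in for a real driver entry point."""
    return 0
```

A kernel-side caller would use `tap(driver_open,
shadow.open)` in place of `driver_open`. It gets the
same return value as before, while the shadow's state
follows the driver's.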
Active mode is used during recovery from a
failure. Here, the shadow performs two functions.
First, it impersonates the failed driver, intercept-
ing and responding to calls for service. Therefore,
the kernel and higher-level applications continue
operating as though the driver had not failed. Sec-
ond, the shadow driver restarts the failed driver
and brings it back to its pre-failure state. While
the driver restarts, the shadow impersonates the
kernel to the driver, responding to its requests
for service. Together, these two functions hide
recovery from the driver, which is unaware that
a shadow driver is restarting it after a failure, and
from the kernel and applications, which continue
to receive service from the shadow.
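The kernel-facing half of active mode can be sketched
as follows. This is a toy model under stated
assumptions: the classes are hypothetical, a raised
exception stands in for the isolation subsystem
detecting a fail-stop failure, and the shadow is
assumed to answer requests by queueing them and
reporting success. The driver-facing half, where the
shadow impersonates the kernel, is not modeled.

```python
class FlakyDriver:
    """Hypothetical driver that can be made to fail in a fail-stop way."""

    def __init__(self):
        self.handled = []
        self.broken = False

    def handle(self, request):
        if self.broken:
            raise RuntimeError("driver failed")
        self.handled.append(request)
        return "ok"


class ActiveShadow:
    """Hypothetical shadow that answers for the driver while it is down."""

    def __init__(self, make_driver):
        self.make_driver = make_driver
        self.driver = make_driver()
        self.active = False
        self.queued = []

    def call(self, request):
        """Entry point used by the kernel; it cannot tell which side answered."""
        if not self.active:
            try:
                return self.driver.handle(request)
            except RuntimeError:
                self.active = True       # failure detected: switch to active mode
        self.queued.append(request)      # impersonate the driver: accept and hold
        return "ok"

    def finish_recovery(self):
        self.driver = self.make_driver()  # restart the failed driver
        for request in self.queued:       # hand over the requests held meanwhile
            self.driver.handle(request)
        self.queued.clear()
        self.active = False               # back to passive mode
```

From the caller's side every request returns "ok",
whether the driver or the shadow handled it.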
Taps
As previously described, a shadow driver moni-
tors communication between a functioning driver
and the kernel and impersonates one to the other
during failure and recovery. This is made possible
by a new mechanism, called a tap. Conceptually,
a tap is a T-junction placed between the kernel
and its drivers. It is implemented as a callout
from wrapper stubs. During a shadow's passive-
mode operation, the tap: (1) invokes the original