unable to provide enough juice to spin up dozens of disks at once. Old solder joints shrink
and crack, leading to mysterious failures. Components from the same manufacturing batch
have similar mortality curves, resulting in a sudden rush of failures.
With our discussion of the many potential malfunctions and failures, we hope we
haven't scared you away from the field of system administration!
6.2.2 The Traditional Approach
Traditional software assumes a perfect, malfunction-free world. This leaves the hardware
systems engineer with the impossible task of delivering hardware that never fails. We fake
it by using redundant array of inexpensive [independent] disks (RAID) systems that let the
software go on pretending that disks never fail. Sheltered from the reality of a world full of
malfunctions, we enable software developers to continue writing software that assumes a
perfect, malfunction-free world (which, of course, does not exist).
For example, UNIX applications are written with the assumption that reading and writing
files will happen without error. As a result, applications do not check for errors when
writing files. If they did, it would be a waste of time because the blocks may not be written
to disk until later, possibly after the application has exited. Microsoft Word is written
with the assumption that the computer it runs on will continue to run.
Hyperbole Warning
The previous paragraph included two slight exaggerations. The application layer
of UNIX assumes a perfect file system but the underlying layers do not assume
perfect disks. Microsoft Word checkpoints documents so that the user does not lose
data in the event of a crash. However, during that crash the user is unable to edit
the document.
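To make the point about deferred writes concrete, here is a minimal sketch in Python of what an application has to do if it wants to learn about a write failure before it exits. The function name durable_write and the demo helper are hypothetical, introduced only for illustration. The sketch assumes a POSIX-like system where os.fsync asks the kernel to push its cached blocks to the storage device.

```python
import os
import tempfile


def durable_write(path: str, data: bytes) -> None:
    """Write data, then ask the OS to flush it to the storage device.

    Hypothetical helper for illustration. Raises OSError if the write
    or the flush reports a failure.
    """
    with open(path, "wb") as f:
        f.write(data)         # may only reach a user-space buffer
        f.flush()             # hand the buffered bytes to the OS
        os.fsync(f.fileno())  # ask the OS to write its cache to the device


def demo() -> bytes:
    """Write a small file durably and read it back."""
    path = os.path.join(tempfile.mkdtemp(), "example.txt")
    durable_write(path, b"hello")
    with open(path, "rb") as f:
        return f.read()
```

Without the flush and fsync calls, a successful return from write says only that the data was accepted into a buffer. An error that occurs when the blocks are eventually written may never be reported to the application.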
Attempts to achieve this impossible malfunction-free world cause companies to spend
a lot of money. CPUs, components, and storage systems known for high reliability are
demonstrably more expensive than commodity parts. Appendix B details the history of this
strategy and explains the economic benefits of distributed computing techniques discussed
in this chapter.
6.2.3 The Distributed Computing Approach
Distributed computing, in contrast to the traditional approach, embraces components' failures
and malfunctions. It takes a reality-based approach that accepts malfunctions as a fact
of life. Google Docs continues to let a user edit a document even if a machine fails at
Google: another machine takes over and the user does not even notice the handoff.
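The handoff idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a description of how Google Docs is implemented: the replicas are modeled as plain functions, a failed machine is modeled as one that raises ConnectionError, and all names (call_with_failover, broken_replica, healthy_replica, AllReplicasFailed) are hypothetical.

```python
from typing import Callable, Sequence


class AllReplicasFailed(Exception):
    """Raised when no replica could serve the request."""


def call_with_failover(
    replicas: Sequence[Callable[[str], str]], request: str
) -> str:
    """Try each replica in order and return the first successful reply."""
    failures = 0
    for replica in replicas:
        try:
            return replica(request)
        except ConnectionError:
            failures += 1  # this machine is down; move on to the next one
    raise AllReplicasFailed(f"{failures} replicas failed")


def broken_replica(request: str) -> str:
    """Stand-in for a machine that has failed."""
    raise ConnectionError("machine down")


def healthy_replica(request: str) -> str:
    """Stand-in for a machine that is working."""
    return f"saved: {request}"
```

Calling call_with_failover([broken_replica, healthy_replica], "edit 1") returns the reply from the healthy replica. The caller never sees the first machine's failure, which is the behavior the paragraph above describes from the user's point of view.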