Buying Failed Memory
Early in its history, Google tried to see how far it could push the limits of using
intelligent software to manage unreliable hardware. To do so, the company pur-
chased failed RAM chips and found ways to make them useful.
Google was purchasing terabytes of RAM for machines that ran software that
was highly resilient to failure. If a chip failed, the OS would mark that area of
RAM as unusable and kill any process using it. The killed processes would be re-
started automatically. The fact that the chip was bad was recorded so that it was
ignored by the OS even after reboot.
As a result, a machine didn't need to be repaired just because one chip had
failed. It could keep running until its remaining capacity fell below usable
limits.
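The behavior described above can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not Google's actual software: the `Machine` class, the idea of fixed-size "regions," the JSON file used to remember bad regions, and the `min_usable_regions` threshold are all hypothetical names chosen for illustration.

```python
import json
import os
import tempfile


class Machine:
    """Toy model of a machine that tolerates failed RAM regions (hypothetical)."""

    def __init__(self, total_regions, min_usable_regions, bad_list_path):
        self.total_regions = total_regions
        self.min_usable_regions = min_usable_regions
        self.bad_list_path = bad_list_path
        # The bad-region list is read from disk so it survives a "reboot".
        self.bad_regions = self._load_bad_regions()
        self.processes = {}  # process name -> region index

    def _load_bad_regions(self):
        if os.path.exists(self.bad_list_path):
            with open(self.bad_list_path) as f:
                return set(json.load(f))
        return set()

    def _save_bad_regions(self):
        with open(self.bad_list_path, "w") as f:
            json.dump(sorted(self.bad_regions), f)

    def usable_regions(self):
        return [r for r in range(self.total_regions) if r not in self.bad_regions]

    def needs_repair(self):
        # Repair only when capacity drops below the usable limit.
        return len(self.usable_regions()) < self.min_usable_regions

    def start(self, name):
        """Place a process in the first free usable region."""
        in_use = set(self.processes.values())
        for region in self.usable_regions():
            if region not in in_use:
                self.processes[name] = region
                return region
        raise MemoryError("no usable region is free")

    def report_failure(self, region):
        """Mark a region bad, record that fact, and restart affected processes."""
        self.bad_regions.add(region)
        self._save_bad_regions()
        victims = [n for n, r in self.processes.items() if r == region]
        for name in victims:
            del self.processes[name]  # kill the process using the bad region
            self.start(name)          # restart it automatically elsewhere
        return victims


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "bad_regions.json")
    machine = Machine(total_regions=4, min_usable_regions=2, bad_list_path=path)
    machine.start("web")
    machine.start("index")
    machine.report_failure(0)
    print(machine.processes)
    print(machine.needs_repair())
    # A new Machine object reading the same file stands in for a reboot.
    print(Machine(4, 2, path).bad_regions)
```

In this sketch, one failed region costs the machine some capacity but no repair visit. `needs_repair()` becomes true only once too few regions remain.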
To understand what happened next, you must understand that the difference
between high-quality RAM chips and normal-quality chips is how much testing
they pass. RAM chips are manufactured and then tested. The ones that pass the
most QA testing are sold as “high quality” at a high price. The ones that pass the
standard QA tests are sold as normal for the regular price. All others are thrown
away.
Google's purchasing people are formidable negotiators. Google had already
been saving money by purchasing the normal-quality chips, relying on the custom
software Google wrote to work around failures. One day the purchasing department
thought to ask if it was possible to purchase the chips that were being thrown
away. The manufacturers had never received such a request before and were will-
ing to sell the defective chips for pennies on the dollar.
Some of the chips were bad from the start and others failed soon after. However,
services were able to keep running, and Google was able
to build servers with enormous amounts of RAM for less money than any of its
competitors. When your business is charging pennies for advertisements, saving
dollars is a big advantage!