Dependability via redundancy —The long-running nature of Internet services means that the
hardware and software in a WSC must collectively provide at least 99.99% availability;
that is, it must be down less than 1 hour per year. Redundancy is the key to dependability
for both WSCs and servers. While server architects often utilize more hardware offered at
higher costs to reach high availability, WSC architects rely instead on multiple cost-effective
servers connected by a low-cost network and redundancy managed by software. Furthermore,
if the goal is to go much beyond “four nines” of availability, you need multiple
WSCs to mask events that can take out whole WSCs. Multiple WSCs also reduce latency
for services that are widely deployed.
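The availability arithmetic above can be checked with a short sketch. The function names are illustrative, and the redundancy formula assumes that replicas fail independently and that a single surviving replica keeps the service up, which real deployments only approximate.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def downtime_minutes_per_year(availability: float) -> float:
    """Expected yearly downtime for an availability given as a fraction (0..1)."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def combined_availability(single: float, replicas: int) -> float:
    """Availability of `replicas` redundant units.

    Assumption: failures are independent and one surviving unit is enough.
    """
    return 1.0 - (1.0 - single) ** replicas


if __name__ == "__main__":
    # "Four nines" leaves roughly 52.56 minutes of downtime per year.
    print(round(downtime_minutes_per_year(0.9999), 2))
    # Two independent units at 99% each reach roughly 99.99% together.
    print(combined_availability(0.99, 2))
```

Under these assumptions, "four nines" does come out under 1 hour per year, and adding a second independent unit multiplies the unavailability by itself, which is why redundancy across servers and across WSCs is so effective.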
Network I/O —Server architects must provide a good network interface to the external
world, and WSC architects must also. Networking is needed to keep data consistent
between multiple WSCs as well as to interface to the public.
Both interactive and batch processing workloads —While you expect highly interactive work-
loads for services like search and social networking with millions of users, WSCs, like serv-
ers, also run massively parallel batch programs to calculate metadata useful to such ser-
vices. For example, MapReduce jobs are run to convert the pages returned from crawling
the Web into search indices (see Section 6.2).
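As a rough illustration of the batch job described above, the following single-process sketch builds an inverted index in map, shuffle, and reduce steps. It is not the API of any real MapReduce framework; the function names and the sample pages are hypothetical, and a real job would spread each step across many servers.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_page(url: str, text: str) -> Iterator[Tuple[str, str]]:
    """Map step: emit one (word, url) pair per distinct word on a page."""
    for word in set(text.lower().split()):
        yield word, url


def shuffle(pairs: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group emitted values by key, as a framework would between phases."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for word, url in pairs:
        groups[word].append(url)
    return groups


def reduce_word(word: str, urls: List[str]) -> Tuple[str, List[str]]:
    """Reduce step: produce the sorted, de-duplicated posting list for a word."""
    return word, sorted(set(urls))


def build_index(pages: Dict[str, str]) -> Dict[str, List[str]]:
    """Run map, shuffle, and reduce over all crawled pages."""
    pairs = (pair for url, text in pages.items() for pair in map_page(url, text))
    return dict(reduce_word(word, urls) for word, urls in shuffle(pairs).items())


if __name__ == "__main__":
    crawled = {
        "a.example": "warehouse scale computer",
        "b.example": "scale out",
    }
    print(build_index(crawled)["scale"])
```

Each call to map_page depends only on its own page, and each call to reduce_word depends only on its own word, which is what lets such a job scale across a WSC.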
Not surprisingly, there are also characteristics not shared with server architecture:
Ample parallelism —A concern for a server architect is whether the applications in the tar-
geted marketplace have enough parallelism to justify the amount of parallel hardware and
whether the cost is too high for sufficient communication hardware to exploit this parallel-
ism. A WSC architect has no such concern. First, batch applications benefit from the large
number of independent datasets that require independent processing, such as billions of
Web pages from a Web crawl. This processing is data-level parallelism applied to data in storage
instead of data in memory, which we saw in Chapter 4. Second, interactive Internet
service applications, also known as software as a service (SaaS), can benefit from millions of
independent users of interactive Internet services. Reads and writes are rarely dependent
in SaaS, so SaaS rarely needs to synchronize. For example, search uses a read-only index
and email is normally reading- and writing-independent information. We call this type of
easy parallelism request-level parallelism, as many independent efforts can proceed in parallel
naturally with little need for communication or synchronization; for example, journal-based
updating can reduce throughput demands. Given the success of SaaS and WSCs,
more traditional applications such as relational databases have been weakened to rely on
request-level parallelism. Even read-/write-dependent features are sometimes dropped to
offer storage that can scale to the size of modern WSCs.
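Request-level parallelism can be sketched on one machine with a thread pool: every query reads the same read-only index, so no request waits on another. The index contents, handle_query, and serve are hypothetical names chosen for illustration, and a real service would spread requests across many servers rather than threads.

```python
from concurrent.futures import ThreadPoolExecutor
from types import MappingProxyType
from typing import List, Mapping, Sequence

# Hypothetical read-only index: word -> ids of matching documents.
INDEX: Mapping[str, Sequence[str]] = MappingProxyType({
    "wsc": ("doc1", "doc3"),
    "server": ("doc2",),
})


def handle_query(query: str) -> List[str]:
    """Serve one request. It only reads INDEX, so it needs no locks."""
    return list(INDEX.get(query.lower(), ()))


def serve(queries: Sequence[str], workers: int = 4) -> List[List[str]]:
    """Run independent requests concurrently; results keep request order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(handle_query, queries))


if __name__ == "__main__":
    print(serve(["WSC", "server", "missing"]))
```

Because the handlers share no writable state, adding workers (or servers) raises throughput without adding synchronization.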
Operational costs count —Server architects usually design their systems for peak perform-
ance within a cost budget and worry about power only to make sure they don't exceed the
cooling capacity of their enclosure. They usually ignore operational costs of a server, as-
suming that they pale in comparison to purchase costs. WSCs have longer lifetimes—the
building and electrical and cooling infrastructure are often amortized over 10 or more
years—so the operational costs add up: Energy, power distribution, and cooling represent
more than 30% of the costs of a WSC in 10 years.
Scale and the opportunities/problems associated with scale —Often extreme computers are ex-
tremely expensive because they require custom hardware, and yet the cost of customiz-
ation cannot be effectively amortized since few extreme computers are made. However,
when you purchase 50,000 servers and the infrastructure that goes with them to construct a
single WSC, you do get volume discounts. WSCs are so massive internally that you get
economy of scale even if there are not many WSCs. As we shall see in Sections 6.5 and