The Origins and Future of Distributed Computing and Clouds - The Practice of Cloud System Administration

To make the system more resilient to failures, each fraction could be stored on two different

leaves. If there were 10 fractions, there would be 20 leaves. The root would divide

the traffic for a particular fraction between the two leaves as long as both were up. If one

failed, the root would send all requests related to that fraction to the remaining leaf. The

chance of a simultaneous failure by two leaves holding the same data was low. Even if

it did happen, users might not notice that their web searches returned slightly fewer results

until the replacement algorithms loaded the missing data onto a spare machine.
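
The routing behavior described above can be illustrated with a minimal sketch. The class name `Root`, its methods, and the round-robin choice between replicas are assumptions made for illustration; the text does not say how the root actually divided traffic or detected failures.

```python
import itertools


class Root:
    """Hypothetical root that sends each fraction's queries to a live replica leaf."""

    def __init__(self, placement):
        # placement: fraction id -> list of leaf names holding that fraction
        self.placement = placement
        self.down = set()
        # One counter per fraction, used to alternate between live replicas.
        self._counters = {f: itertools.count() for f in placement}

    def mark_down(self, leaf):
        self.down.add(leaf)

    def mark_up(self, leaf):
        self.down.discard(leaf)

    def pick_leaf(self, fraction):
        """Return a live leaf for the fraction, or None if every replica is down."""
        live = [leaf for leaf in self.placement[fraction] if leaf not in self.down]
        if not live:
            return None
        return live[next(self._counters[fraction]) % len(live)]

    def plan_query(self):
        """Return (assignments, missing) for one query across all fractions."""
        assignments, missing = {}, []
        for fraction in self.placement:
            leaf = self.pick_leaf(fraction)
            if leaf is None:
                missing.append(fraction)  # results from this fraction are omitted
            else:
                assignments[fraction] = leaf
        return assignments, missing
```

With both replicas up, successive calls to `pick_leaf` alternate between them. With one marked down, every request for that fraction goes to the survivor. If both are down, `plan_query` reports the fraction as missing, which corresponds to the "slightly fewer results" case.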

Scaling was also achieved through replication. If the system did not process requests

fast enough, it could be scaled by adding leaves. A particular fraction might be stored in

three or more places.

The algorithms got more sophisticated over time. For example, rather than splitting the

corpus into 10 fractions, one for each machine, the corpus could be split into 100 fractions

and each machine would store 10. If a particular fraction was receiving a particularly large

number of hits (it was “hot”), that fraction could be placed on more machines, bumping out

less popular fractions. Better algorithms resulted in better placement, diversity, and dynamically

updatable corpus data.
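
The finer-grained split and the handling of a hot fraction can be sketched as follows. This is only an illustration under stated assumptions: the function names, the per-machine `capacity` limit, the `hits` counter, and the rule that a fraction may be bumped only if another copy exists elsewhere are all hypothetical and not taken from the text.

```python
def split_corpus(num_fractions, machines):
    """Spread fraction ids round-robin; returns machine -> set of fraction ids."""
    placement = {m: set() for m in machines}
    for fraction in range(num_fractions):
        placement[machines[fraction % len(machines)]].add(fraction)
    return placement


def copies(placement, fraction):
    """Count how many machines currently hold the fraction."""
    return sum(1 for held in placement.values() if fraction in held)


def add_hot_replica(placement, hits, hot, capacity):
    """Put one more copy of `hot` on a machine that lacks it.

    Uses a free slot if one exists; otherwise bumps the least-hit fraction
    that still has another copy elsewhere (an assumed safety rule).
    Returns (machine, evicted_fraction_or_None), or None if nothing can move.
    """
    best = None  # (victim_hits, machine, victim)
    for machine in sorted(placement):
        held = placement[machine]
        if hot in held:
            continue
        if len(held) < capacity:
            held.add(hot)
            return machine, None
        evictable = [f for f in held if copies(placement, f) > 1]
        if not evictable:
            continue
        victim = min(evictable, key=lambda f: (hits.get(f, 0), f))
        candidate = (hits.get(victim, 0), machine, victim)
        if best is None or candidate < best:
            best = candidate
    if best is None:
        return None
    _, machine, victim = best
    placement[machine].remove(victim)
    placement[machine].add(hot)
    return machine, victim
```

Calling `split_corpus(100, machines)` with 10 machine names gives each machine 10 fractions, matching the example in the text. `add_hot_replica` then trades a slot held by an unpopular, already-replicated fraction for an extra copy of the hot one.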

Applicability

These algorithms were particularly well suited for web search and similar applications

where the data was mostly static (did not change) except for wholesale replacements when

a new corpus was produced. In contrast, they were inappropriate for traditional applications.

After all, you wouldn't want your payroll system built on a database that dealt with

machine failures by returning partial results. Also, these systems lacked many of the features

of traditional databases related to consistency and availability.

New distributed computing algorithms enabled new applications one by one. For example,

the desire to provide email as a massive web-based service led to better storage systems.

Over time more edge cases were conquered so that distributed computing techniques

could be applied to more applications.
