Rack-sized replicas are sometimes called pods. A pod is self-contained and often forms
its own security domain. For example, a billing system may be made up of pods, each one
self-contained and designed to handle bill processing for a specific group of customers.
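One way to picture the billing example is a stable mapping from each customer to a single pod. The sketch below is only an illustration under assumptions: the pod names, the `pod_for_customer` function, and the choice of hashing the customer ID are all hypothetical and are not described in the text.

```python
import hashlib

# Hypothetical pod names; a real deployment would load these from configuration.
PODS = ["pod-a", "pod-b", "pod-c"]


def pod_for_customer(customer_id: str, pods=PODS) -> str:
    """Map a customer to exactly one pod, so that pod alone handles their bills."""
    # A cryptographic hash gives a stable, evenly spread value for any ID.
    digest = hashlib.sha256(customer_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(pods)
    return pods[index]


if __name__ == "__main__":
    for cid in ("customer-1001", "customer-1002", "customer-1003"):
        print(cid, "->", pod_for_customer(cid))
```

Because the mapping depends only on the customer ID and the pod list, every request for a given customer lands on the same pod. Note that simple modulo hashing reassigns many customers whenever the pod list changes; a fixed lookup table is another reasonable choice.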
Clos Networking
It is reasonable to expect that eventually there will be network products on the
open market that provide non-blocking, full-speed connectivity between any two
machines in an entire datacenter. We've known how to do this since 1953 (Clos
1953). When such products arrive, they will change how we design services.
6.6.5 Datacenters
Datacenters can also be failure domains. An entire datacenter can go down due to natural
disasters, cooling failures, power failures, or an unfortunate backhoe dig that takes out all
network connections in one swipe.
Similar to rack diversity and rack locality, datacenter diversity and datacenter locality
also exist. Bandwidth within a datacenter is generally fast, though not as fast as within a
rack. Bandwidth between datacenters is generally slower and, unlike with data transmitted
within a datacenter, is often billed for by the gigabyte.
Each replica of a service should be self-contained within a datacenter, but the entire
service should have datacenter diversity. Google requires N + 2 diversity as a minimum
requirement for user-facing services. That way, when one datacenter is intentionally brought
down for maintenance, another can go down due to unforeseen circumstances without
impacting the service.
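The N + 2 rule can be checked with simple arithmetic: remove two datacenters and see whether the rest still carry peak load. The following is a minimal sketch, assuming each datacenter's capacity and the peak load are known numbers in the same unit. The function name `survives_n_plus_2` and the pessimistic choice of losing the largest sites are assumptions made for illustration, not part of any stated policy.

```python
def survives_n_plus_2(datacenter_capacities, peak_load, spares=2):
    """Return True if peak_load still fits after losing `spares` datacenters.

    Assumes the worst case: the datacenters lost are the largest ones.
    """
    if len(datacenter_capacities) <= spares:
        return False  # nothing would be left to serve traffic
    keep = len(datacenter_capacities) - spares
    # Sort ascending and keep the smallest `keep` sites (the largest are lost).
    remaining = sorted(datacenter_capacities)[:keep]
    return sum(remaining) >= peak_load


if __name__ == "__main__":
    # Four equal sites of 100 units each, peak load of 200 units:
    # one down for maintenance plus one unplanned failure leaves 200 units.
    print(survives_n_plus_2([100, 100, 100, 100], peak_load=200))
    # With only three such sites, two losses leave 100 units, which is too little.
    print(survives_n_plus_2([100, 100, 100], peak_load=200))
```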
6.7 Overload Failures
Distributed systems need to be resilient when faced with high levels of load that can happen
as the result of a temporary surge in traffic, an intentional attack, or automated systems
querying the system at a high rate, possibly for malicious reasons.
6.7.1 Traffic Surges
Systems should be resilient against temporary periods of high load. For example, a small
service may become overloaded after being mentioned in a popular website or news broadcast.
Even a large service can become overloaded due to load being shifted to the remaining
replicas when one fails.
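The load-shifting effect is easy to quantify. The sketch below assumes traffic is spread evenly across healthy replicas and that each replica has a fixed capacity; the function names and the numbers in the usage example are hypothetical.

```python
def load_per_replica(total_load, replicas, failed=0):
    """Load each healthy replica carries when traffic is spread evenly."""
    healthy = replicas - failed
    if healthy <= 0:
        raise ValueError("no healthy replicas remain")
    return total_load / healthy


def overloaded_after_failure(total_load, replicas, capacity_per_replica, failed=1):
    """True if the surviving replicas would each exceed their capacity."""
    return load_per_replica(total_load, replicas, failed) > capacity_per_replica


if __name__ == "__main__":
    # Hypothetical numbers: 2400 queries per second over 3 replicas,
    # each able to handle 1000 queries per second.
    print(load_per_replica(2400, 3))                # load per replica when all are healthy
    print(load_per_replica(2400, 3, failed=1))      # load per replica after one failure
    print(overloaded_after_failure(2400, 3, 1000))  # do the survivors exceed capacity?
```

With these numbers each replica runs comfortably at 800 queries per second, but a single failure pushes the two survivors to 1200 each, beyond their capacity. This is why capacity planning needs to account for the load carried after a failure, not only the load when all replicas are healthy.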