being awakened in the middle of the night because a machine is down, we are alerted only
if the needle of a gauge gets near the danger zone.
6.3.2 Load Sharing versus Hot Spares
In the previous examples, the replicas are load sharing: all are active, are sharing the
workload equally (approximately), and have equal amounts of spare capacity (approximately).
Another strategy is to have primary and secondary replicas. In this approach, the primary
replica receives the entire workload but the secondary replica is ready to take over at any
time. This is sometimes called the hot spare or “hot standby” strategy since the spare is
connected to the system, running (hot), and can be switched into operation instantly. It is
also known as an active-passive or master-slave pair. Often there are multiple secondaries.
Because there is only one master, these configurations are 1 + M configurations.
Sometimes the term “active-active” or “master-master” pair will be used to refer to two
replicas that are load sharing. “Active-active” is more commonly used with network links.
“Master-master” is more commonly used in the database world and in situations where the
two are tightly coupled.
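
The difference between the two strategies can be sketched as two ways of choosing which replica serves a request. This is only an illustration, not how any particular load balancer works: the Replica class and the functions pick_load_sharing and pick_hot_spare are hypothetical names, and round-robin is assumed as the way load is shared.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    # Hypothetical stand-in for a server; "healthy" would come from monitoring.
    name: str
    healthy: bool = True


def pick_load_sharing(replicas, request_number):
    """Load sharing: rotate requests across all healthy replicas."""
    healthy = [r for r in replicas if r.healthy]
    if not healthy:
        raise RuntimeError("no healthy replicas")
    return healthy[request_number % len(healthy)]


def pick_hot_spare(replicas):
    """Hot spare (1 + M): the first healthy replica in priority order
    receives the entire workload; the others only wait to take over."""
    for replica in replicas:
        if replica.healthy:
            return replica
    raise RuntimeError("no healthy replicas")


replicas = [Replica("a"), Replica("b"), Replica("c")]

# Load sharing: every replica does part of the work.
sharing = [pick_load_sharing(replicas, i).name for i in range(6)]

# Hot spare: the primary does all of the work...
spare = [pick_hot_spare(replicas).name for _ in range(3)]

# ...until it fails, at which point a secondary takes over.
replicas[0].healthy = False
after_failover = pick_hot_spare(replicas).name
```

Here the list order stands in for the primary/secondary ranking: "a" is the primary, and "b" and "c" are the M secondaries.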
6.4 Failure Domains
A failure domain is the bounded area beyond which failure has no impact. For example,
when a car fails on a highway, its failure does not make the entire highway unusable. The
impact of the failure is bounded to its failure domain.
The failure domain of a fuse in a home circuit breaker box is the room or two that is
covered by that circuit. If a power line is cut, the failure domain affects a number of houses
or perhaps a city block. The failure domain of a power grid might be the town, region, or
county that it feeds (which is why some datacenters are located strategically so they have
access to two power grids).
A failure domain may be prescriptive—that is, a design goal or requirement. You might
plan that two groups of servers are each their own failure domain and then engineer the
system to meet that goal, assuring that the failure domains that they themselves rely on
are independent. Each group may be in different racks, different power circuits, and so on.
Whether they should be in different datacenters depends on the scope of the failure domain
goal.
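
One way to check a prescriptive goal like this is to record what each component relies on and look for anything the two groups have in common. The sketch below assumes a simple dependency map; the server, rack, circuit, and datacenter names are invented for illustration, as are the function names.

```python
def all_dependencies(component, depends_on):
    """Return everything `component` relies on, directly or indirectly."""
    seen = set()
    stack = [component]
    while stack:
        current = stack.pop()
        for dep in depends_on.get(current, ()):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen


def shared_dependencies(group_a, group_b, depends_on):
    """Return the components that both groups of servers rely on."""
    deps_a = set().union(*(all_dependencies(s, depends_on) for s in group_a))
    deps_b = set().union(*(all_dependencies(s, depends_on) for s in group_b))
    return deps_a & deps_b


# Hypothetical layout: two groups in different racks on different
# power circuits, but inside the same datacenter.
depends_on = {
    "web1": ["rack1"],
    "web2": ["rack1"],
    "web3": ["rack2"],
    "web4": ["rack2"],
    "rack1": ["circuit-a"],
    "rack2": ["circuit-b"],
    "circuit-a": ["dc-east"],
    "circuit-b": ["dc-east"],
}

shared = shared_dependencies(["web1", "web2"], ["web3", "web4"], depends_on)
print(shared)  # the only common dependency is the datacenter itself
```

In this layout the groups are independent at the rack and power-circuit level but still share the datacenter, which is acceptable only if the failure domain goal does not extend to the loss of a whole datacenter.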
Alternatively, a failure domain may be descriptive. Often we find ourselves exploring
a system trying to determine, or reverse-engineer, what the resulting failure domain has
become. Due to a failed machine, a server may have been moved temporarily to a spare
machine in another rack. We can determine the new failure domain by exploring the
implications of this move.
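
The same kind of dependency map can be read in the other direction to describe a failure domain: given a component that fails, which components go down with it? The sketch below uses invented names and a hypothetical affected_by function to show how a temporary move to another rack changes the answer.

```python
def affected_by(failed, depends_on):
    """Return every component that relies on `failed`, directly or indirectly."""
    affected = set()
    changed = True
    while changed:
        changed = False
        for component, deps in depends_on.items():
            if component in affected:
                continue
            # A component is affected if it relies on the failed part
            # or on anything already known to be affected.
            if failed in deps or affected.intersection(deps):
                affected.add(component)
                changed = True
    return affected


# Hypothetical layout before the move: the database server is in rack1.
depends_on = {
    "db1": ["rack1"],
    "web1": ["rack2"],
    "web2": ["rack2"],
    "rack1": ["circuit-a"],
    "rack2": ["circuit-b"],
}
before = affected_by("circuit-b", depends_on)

# db1 is moved temporarily to a spare machine in rack2.
depends_on["db1"] = ["rack2"]
after = affected_by("circuit-b", depends_on)

print(sorted(after - before))  # what the move added to this failure domain
```

Before the move, losing circuit-b takes down only the web servers. After the move, the same failure also takes down the database, so the failure domain of circuit-b has grown even though nobody planned it that way.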