Operations in a Distributed World - The Practice of Cloud System Administration

Chapter 7. Operations in a Distributed World

The rate at which organizations learn may soon become the only sustainable source of competitive advantage.

—Peter Senge

Part I of this book discussed how to build distributed systems. Now we discuss how to run such systems.

The work done to keep a system running is called operations. More specifically, operations is the work done to keep a system running in a way that meets or exceeds operating parameters specified by a service level agreement (SLA). Operations includes all aspects of a service's life cycle: from initial launch to the final decommissioning and everything in between.

Operational work tends to focus on availability, speed and performance, security, capacity planning, and software/hardware upgrades. The failure to do any of these well results in a system that is unreliable. If a service is slow, users will assume it is broken. If a system is insecure, outsiders can take it down. Without proper capacity planning, it will become overloaded and fail. Upgrades, done badly, result in downtime. If upgrades aren't done at all, bugs will go unfixed. Because all of these activities ultimately affect the reliability of the system, Google calls its operations team Site Reliability Engineering (SRE). Many companies have followed suit.

Operations is a team sport. Operations is not done by a single person but rather by a team of people working together. For that reason, much of what we describe will be processes and policies that help you work as a team, not as a group of individuals. In some companies, processes seem to be bureaucratic mazes that slow things down. As we describe here (and, more important, in our professional experience), good processes are exactly what makes it possible to run very large computing systems. In other words, process is what makes it possible for teams to do the right thing, again and again.
