Terms to Know
Innovate: Doing (good) things we haven't done before.
Machine: A virtual or physical machine.
Oncall: Being available as first responder to an outage or alert.
Server: Software that provides a function or API. (Not a piece of hardware.)
Service: A user-visible system or product composed of one or more servers.
Soft launch: Launching a new service without publicly announcing it. This way
traffic grows slowly as word of mouth spreads, which gives operations some
cushion to fix problems or scale the system before too many people have seen it.
SRE: Site Reliability Engineer, the Google term for systems administrators who
maintain live services.
Stakeholders: People and organizations that are seen as having an interest in a
project's success.
This chapter starts with some operations management background, then discusses the
operations service life cycle, and ends with a discussion of typical operations work
strategies. All of these topics will be expanded upon in the chapters that follow.
7.1 Distributed Systems Operations
To understand distributed systems operations, one must first understand how it is different
from typical enterprise IT. One must also understand the source of tension between
operations and developers, and basic techniques for scaling operations.
7.1.1 SRE versus Traditional Enterprise IT
System administration is a continuum. On one end is a typical IT department, responsible
for traditional desktop and client-server computing infrastructure, often called enterprise
IT. On the other end is an SRE or similar team responsible for a distributed computing
environment, often associated with web sites and other services. While this may be a broad
generalization, it serves to illustrate some important differences.
SRE is different from an enterprise IT department because SREs tend to be focused on
providing a single service or a well-defined set of services. A traditional enterprise IT
department tends to have broad responsibility for desktop services, back-office services, and
everything in between (“everything with a power plug”). SRE's customers tend to be the
product management of the service while IT customers are the end users themselves. This