The people who are in the oncall rotation should include operations people and
developers, so as to align operational priorities. There are many ways to design an oncall
schedule: weekly, daily, or multiple shifts per day. The goal should be to have no more than a
few alerts per shift so that follow-up work can be completed.
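As an illustration of the weekly option, the following is a minimal sketch of a round-robin schedule that mixes operations people and developers. The function name weekly_rotation, the staff names, and the start date are hypothetical and chosen only for the example; a real schedule would also have to handle vacations, swaps, and time zones.

```python
from datetime import date, timedelta


def weekly_rotation(staff, start, weeks):
    """Return one (shift_start, shift_end, person) tuple per week.

    People are assigned round-robin in the order given, so interleaving
    operations people and developers in `staff` mixes both groups.
    """
    schedule = []
    for i in range(weeks):
        shift_start = start + timedelta(weeks=i)
        shift_end = shift_start + timedelta(days=6)  # seven-day shift, inclusive
        schedule.append((shift_start, shift_end, staff[i % len(staff)]))
    return schedule


# Hypothetical staff list: two operations people and two developers.
staff = ["ops-alice", "dev-bob", "ops-carol", "dev-dave"]
schedule = weekly_rotation(staff, date(2024, 1, 1), 6)

for shift_start, shift_end, person in schedule:
    print(shift_start, "to", shift_end, person)
```

A daily rotation or multiple shifts per day would follow the same pattern with a shorter shift length.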
An oncall person can be notified in many ways. Generally, alerts are delivered by sending a
message to a hand-held device such as a phone, plus one other mechanism for redundancy.
Before a shift, an oncall preparedness checklist should be completed. People should be
reachable while oncall in case there is an alert; otherwise, they should work and sleep as
normal.
Once alerted, the oncall person's top priority is to resolve the situation, even if it means
implementing a quick fix and reserving the long-term fix for later.
An oncall playbook documents actions to be taken in response to various alerts. If the
documentation is insufficient, the issue should be escalated to the service owner or another
escalation rotation.
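The playbook-then-escalate rule can be sketched as a simple lookup. This is only an illustration: the alert names, the documented actions, the respond function, and the "service-owner" contact are all hypothetical, and a real playbook would be a maintained document rather than a dictionary.

```python
# Hypothetical playbook: alert name -> documented response.
PLAYBOOK = {
    "disk_full": "Rotate logs and remove old temporary files.",
    "service_down": "Restart the service and check that it responds.",
}


def respond(alert_name, playbook=PLAYBOOK, escalation_contact="service-owner"):
    """Return ("act", instructions) when the playbook covers the alert,
    otherwise ("escalate", contact) so the issue goes to the service owner
    or another escalation rotation."""
    action = playbook.get(alert_name)
    if action is None:
        return ("escalate", escalation_contact)
    return ("act", action)


print(respond("disk_full"))
print(respond("replication_lag"))
```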
Alerts should be logged. For major alerts, a postmortem report should be written to
record what happened, what was done to fix the problem, and what can be done in the future
to prevent the problem.
Alert logs and postmortems should be reviewed periodically to determine trends and
select projects that will solve systemic problems and reduce the number of alerts.
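One simple way to look for trends is to count how often each alert appears in the log over the review period. The sketch below assumes a log stored as a list of dictionaries with an "alert" field; that layout, the sample entries, and the top_alerts function are hypothetical and stand in for whatever format the alert log actually uses.

```python
from collections import Counter

# Hypothetical alert log entries for one review period.
alert_log = [
    {"alert": "disk_full", "service": "db"},
    {"alert": "service_down", "service": "web"},
    {"alert": "disk_full", "service": "db"},
    {"alert": "high_latency", "service": "web"},
    {"alert": "disk_full", "service": "logs"},
]


def top_alerts(log, n=3):
    """Return the n most frequent alert names with their counts."""
    return Counter(entry["alert"] for entry in log).most_common(n)


for name, count in top_alerts(alert_log):
    print(name, count)
```

Alerts that dominate the count are candidates for projects that address the underlying systemic problem.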
Exercises
1. What are the primary design elements of an oncall system?
2. Describe your current oncall policy. Are you part of an oncall rotation?
3. How do priorities change for an oncall staffer when alerted?
4. What are the prerequisites for oncall duty at your organization?
5. Name 10 things that you monitor. For each of them, which type of notification is
appropriate and why?
6. Which four elements go into a postmortem, and which details are required with
each element?
7. Write a postmortem report for an incident in which you were involved.
8. How should an oncall system improve operations over time?