15.3 Team Training: Fire Drills
Fire drills exercise a particular disaster preparedness process. In these situations actual
failures are triggered to actively test both the technology and the people involved.
The key to building resilient systems is accepting that failure happens and making a
commitment to being prepared to respond quickly and effectively to those failures. An
untested disaster recovery plan isn't really a plan at all. Fire drills are processes where we
preemptively trigger the failure, observe it, fix it, and then repeat until the process is
perfected and the people involved are confident in their skills.
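
The trigger, observe, fix, repeat cycle described above can be sketched as a small loop. This is only an illustrative sketch, not a prescribed tool: the names run_fire_drill, DrillResult, trigger_failure, observe, fix, and max_rounds are all hypothetical, and the "system" in the demo is a plain list of made-up procedure bugs standing in for whatever a real drill would uncover.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class DrillResult:
    rounds: int    # number of times the failure was triggered
    clean: bool    # True if the last round surfaced no problems
    problems: List[str] = field(default_factory=list)  # everything found and fixed


def run_fire_drill(
    trigger_failure: Callable[[], None],
    observe: Callable[[], List[str]],
    fix: Callable[[str], None],
    max_rounds: int = 5,
) -> DrillResult:
    """Repeat the drill until one round completes with no problems observed.

    All three callables are hypothetical hooks supplied by the caller.
    """
    found: List[str] = []
    for round_number in range(1, max_rounds + 1):
        trigger_failure()          # deliberately cause the failure
        problems = observe()       # watch how technology and people respond
        if not problems:
            return DrillResult(round_number, True, found)
        for problem in problems:   # fix each issue before the next round
            fix(problem)
        found.extend(problems)
    # Ran out of rounds without a clean pass: the process is not yet perfected.
    return DrillResult(max_rounds, False, found)


def simulated_drill() -> DrillResult:
    """Toy demo: the 'system' is just a list of made-up procedure bugs."""
    outstanding = ["runbook step out of date", "failover script lacks permissions"]

    def trigger_failure() -> None:
        pass  # a real drill would, for example, take a replica offline here

    def observe() -> List[str]:
        return list(outstanding)  # copy, so fixing does not mutate what we iterate

    def fix(problem: str) -> None:
        outstanding.remove(problem)

    return run_fire_drill(trigger_failure, observe, fix)
```

In the toy demo, the first round surfaces both made-up problems and the second round passes cleanly. A real drill would replace the three hooks with actual failure injection, monitoring, and remediation steps, and the "fix" step would usually involve people updating procedures rather than a function call.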
Drills work because they give us practice and find bugs in procedures. It's better to
prepare for failures by causing them in production while you are watching than to rely on a
strategy of hoping the system will behave correctly when you aren't watching. Doing these
drills in production does carry the risk that something catastrophic will happen. However,
what better time for a catastrophe than when the entire engineering team is ready to
respond?
Drills build confidence in the disaster recovery technology because bugs are found and
fixed. The less often a failover mechanism is used, the less confidence we have in it.
Imagine if a failover mechanism has not been triggered in more than a year. We don't know if
seemingly unrelated changes in the environment have made the process obsolete. It is
unreasonable to expect it to work seamlessly. Ignorance is not bliss. Being unsure if a failover
mechanism will work is a cause of stress.
Drills build confidence within the operations team. If a team is not accustomed to
dealing with disaster, they are more likely to react too quickly, react poorly, or feel undue levels
of pressure and anxiety. This reduces their ability to handle the situation well. Drills give
the team a chance to practice reacting calmly and confidently.
Drills build executive-level confidence in the operations team. While some executives
would rather remain ignorant, smarter executives know that failures happen and that
being prepared and rehearsed is a company's best defense.
Drills can be done to gain confidence in an individual process, in larger tests involving
major systems, or even in larger tests that involve the entire company for multiple days.
Start small and work your way up to the biggest drills over time. Trying a large-scale
exercise is futile and counterproductive if you haven't first built up capability and confidence
through a series of smaller but ever-growing drills.