Upgrading Live Services - The Practice of Cloud System Administration

Information Technology Reference

In-Depth Information

technique, beta users would go to an entirely different set of servers. It is very

costly to set up a complete replica of the server environment just to support beta

users. Moreover, granularity was an all-or-nothing proposition. With flags, each in-

dividual feature can be taken in and out of beta testing.

• Finely Timed Release Dates: Timing a flag flip is easier than timing a software

deployment. If a new feature is to be announced at noon on Tuesday, it is difficult

to roll out new software exactly at that time, especially if there are many servers.

Instead, flags can be controlled dynamically to flip at a specific time.

• Dynamic Roll Backs: It is easier to disable an ill-behaved new feature by dis-

abling a flag than by rolling back to an older release of the software. If a new re-

lease has many new features, it would be a shame to have to roll back the entire re-

lease because of one bad feature. With flag flips, just the one feature can be dis-

abled.

• Bug Isolation: Having each change associated with a flag helps isolate a bug. Ima-

gine a memory leak that may be in one of 30 recently added features. If they are all

attached to toggles, a binary search can identify which feature is creating the prob-

lem. If the binary search fails to isolate the problem in a test environment, doing

this bug detection in production via flags is considerably more sane than via many

individual releases.

• A-B Testing: Often the best way to tell if users prefer one design or another is to

implement both, show certain users the new feature, and observe their behavior.

For example, suppose sign-ups for a product are very low. Would sign-ups be im-

proved if a check box defaulted to checked, or if instead of a check box an entirely

different mechanism was used? A group of users could be selected, half with their

check box default checked (group A) and the other half with the new mechanism

(group B). Whichever design has the better results would be used for all users after

the test. Flag flips can be used both to control the test and to enable the winning

design when the test is finished.

• One Percent Testing: This testing exposes a feature to a statistical sample of

users. Sometimes, similar to the canary process, it is done to test a new feature be-

fore deploying it globally. Sometimes, similar to A-B testing, it is done to identify

the reaction to an experimental feature before deciding whether it should be de-

ployed at all. Sometimes it is done to collect information via statistical sampling.

For example, a site might like to gather performance statistics about page load

times. JavaScript code can be inserted into HTML pages that transmits back to the

service data about how long the page took to load. Collecting this information

would, essentially, double the number of queries that reach the service and would

Search WWH ::

Custom Search

Home