N + 1 Configurations
Using two machines became the best practice. One would run the service and the other was
idle but configured and ready to take over if the first machine failed. Unless the site had a
load balancer, the “failover” usually required manual intervention, but a good systems
administrator could do the switch fast enough that the web site would be down for less than
an hour. This is called an N + 1 configuration since there is one more device than required
to provide the service. This technique is very expensive considering that at any given time
50 percent of your investment is sitting idle.
Software upgrades could be done by upgrading the spare server and switching to it when
it was time to unveil the new features. The downtime would only be minutes or seconds to
perform the failover. Users might not even notice!
N + 2 Configurations
What if the primary machine failed while the spare was being upgraded? The
half-configured machine would not be in a usable state. As software releases increased in
frequency, the likelihood that the spare would not be in a usable state also increased.
Thus, the best practice became having three machines, or an N + 2 configuration. Now
systems administrators could safely perform upgrades but 66 percent of the hardware
investment was idle at any given time. Imagine paying for three houses but only living in
one. Imagine being the person who had to tell the CEO how much money was used on idle
equipment!
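The idle percentages quoted for the N + 1 and N + 2 configurations follow from simple arithmetic, sketched below. The function name idle_fraction and its parameters are illustrative assumptions, not something from the original text. With one required machine and two spares the result is 2/3, which the text rounds down to 66 percent.

```python
def idle_fraction(required: int, spares: int) -> float:
    """Fraction of machines sitting idle in an N + M configuration.

    `required` is N (machines needed to provide the service) and
    `spares` is M (extra machines kept ready). Both names are
    hypothetical, chosen for this sketch.
    """
    total = required + spares
    if total <= 0:
        raise ValueError("configuration must contain at least one machine")
    return spares / total


# N + 1 with a single required machine: one of two machines is idle.
n_plus_1 = idle_fraction(required=1, spares=1)

# N + 2 with a single required machine: two of three machines are idle.
n_plus_2 = idle_fraction(required=1, spares=2)
```

Note that the idle fraction shrinks as N grows: with ten required machines and two spares, only 2 of 12 machines are idle.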
Some companies tried to optimize by load sharing between the machines. Extra software
development or a load balancer was required to make this work but it was possible. In
an N + 1 configuration, systems administrators could perform software upgrades by taking
one machine out of service and upgrading it while the other remained running. However,
if both machines were at 80 percent utilization, the site now had a single machine that was
160 percent utilized, which would make it unacceptably slow for the end users. The web
site might as well be down. The easy solution to that problem is to never let either machine
get more than 50 percent utilized, but that simply returns us to the situation where half the
capacity we paid for is idle. The idle capacity is just split between two machines!
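The 160 percent figure, and the 50 percent ceiling that avoids it, can be checked with a short calculation. This is a minimal sketch that assumes identical machines and load that spreads evenly across the survivors. The function names are hypothetical.

```python
def utilization_after_removal(per_machine: float, machines: int, removed: int = 1) -> float:
    """Utilization of each surviving machine after `removed` machines leave service.

    Assumes identical machines and that the total load is spread evenly
    over the machines that remain (an assumption of this sketch).
    """
    survivors = machines - removed
    if survivors <= 0:
        raise ValueError("at least one machine must remain in service")
    total_load = per_machine * machines
    return total_load / survivors


def max_safe_utilization(machines: int, removed: int = 1) -> float:
    """Highest per-machine utilization that keeps survivors at or below 100 percent."""
    if machines - removed <= 0:
        raise ValueError("at least one machine must remain in service")
    return (machines - removed) / machines


# Two machines at 80 percent each, one taken out for an upgrade:
overloaded = utilization_after_removal(0.80, machines=2)

# Ceiling for a two-machine pair that must survive losing one machine:
ceiling = max_safe_utilization(machines=2)
```

Here overloaded comes out to 1.6 (160 percent) and ceiling to 0.5 (50 percent), matching the figures in the text.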
Some companies tried to do such upgrades only late at night when fewer users meant
that utilization had dipped below 50 percent. That left very little time to do the upgrade,
making large or complex upgrades extremely risky. Changes that required an operating
system upgrade or extensive testing were not possible. If the site became popular
internationally and was busy during every time zone, this option disappeared. Also, no one can
schedule hardware failures to happen only at night! Neither of these approaches was a viable
option for better utilization of the available resources.