Change-Over Mechanism: Failover and Failback
To protect a business, critical application and data servers can be fully replicated to secure
cloud environments using cold, warm, or hot backup options. Servers and data can be kept
current using continuous and scheduled replication procedures. Change-over mechanisms
can also be put in place to automatically handle small or medium failures and prevent
large-scale or prolonged service disruptions.
There are two types of change-over mechanisms: failover and failback.
Failover Service. In the case of a service failure, downtime, or a complete site loss, a failover
service can be set up to automatically redirect data and users to replica servers. These servers
can be configured to operate in failover mode for the duration of the downtime event. Users
can securely access the replica servers from any location and continue to work as they
normally would. However, users may experience a one-time delay in accessing the service
at the moment it first goes down, while the change-over takes place.
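The redirect behaviour described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Server` and `FailoverRouter` names, the `healthy` flag, and the rule that the router stays in failover mode until someone explicitly ends it are all hypothetical and do not come from any particular product.

```python
from dataclasses import dataclass


@dataclass
class Server:
    """Hypothetical stand-in for a production or replica server."""
    name: str
    healthy: bool = True


class FailoverRouter:
    """Sends requests to the primary until it fails, then to the replica."""

    def __init__(self, primary: Server, replica: Server) -> None:
        self.primary = primary
        self.replica = replica
        self.in_failover = False  # assumption: starts in normal operation

    def route(self) -> Server:
        """Return the server that should handle the next request."""
        if not self.in_failover and not self.primary.healthy:
            # The first failed check triggers the change-over.
            self.in_failover = True
        target = self.replica if self.in_failover else self.primary
        if not target.healthy:
            raise RuntimeError(f"{target.name} is unavailable")
        return target


router = FailoverRouter(Server("primary"), Server("replica"))
print(router.route().name)      # primary
router.primary.healthy = False  # simulate an outage
print(router.route().name)      # replica
```

Note that in this sketch the router keeps using the replica even after the primary recovers. Returning traffic to production is treated as a separate, deliberate step.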
Failback Service. Failback refers to moving a service back from the recovery environment
to production after a planned or an unplanned outage. Failback is fast and transparent to the
end users, who can continue to work without any interruption. The main advantage of the
failback process is that it synchronizes the changed data in real time while users continue
to access their applications and data. It then seamlessly redirects the user traffic back to the
production environment once it is restored.
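As a rough illustration of failback, the following Python sketch copies the data that changed during the outage back to production while writes are still being accepted, and only then redirects traffic. The `Store` and `Service` classes, the `changed` set, and the single-threaded `sync_step` loop are assumptions made for this example. A real system would synchronize concurrently and handle conflicts.

```python
class Store:
    """Hypothetical key-value store standing in for an environment's data."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class Service:
    """Tracks which environment is active and what changed during failover."""

    def __init__(self, production, recovery):
        self.production = production
        self.recovery = recovery
        self.active = recovery  # assumption: we start already failed over
        self.changed = set()    # keys written while running on recovery

    def write(self, key, value):
        """Apply a user write to the active environment."""
        self.active.data[key] = value
        if self.active is self.recovery:
            self.changed.add(key)

    def sync_step(self):
        """Copy one changed key to production; return True if work remains."""
        if self.changed:
            key = self.changed.pop()
            self.production.data[key] = self.recovery.data[key]
        return bool(self.changed)

    def failback(self):
        """Drain the remaining changes, then redirect traffic to production."""
        while self.sync_step():
            pass
        self.active = self.production


svc = Service(Store("production"), Store("recovery"))
svc.write("a", "1")
svc.write("b", "2")
svc.sync_step()      # synchronization has begun
svc.write("c", "3")  # users keep working during the sync
svc.failback()
print(svc.active.name)  # production
```

Because every write made on the recovery side is recorded before the final redirect, production ends up with all three keys in this example.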
Change-over mechanisms such as failover and failback can improve the mean time
between failures (MTBF) and mean time to failure (MTTF) of the overall system. MTBF
is usually calculated as the arithmetic mean of the time between inherent failures of a system
during operation, assuming that failed components are repaired immediately as part of a
renewal process. MTTF, on the other hand, measures the average time to failure under the
assumption that the failed system has an infinite repair time, that is, it is never repaired.
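The two measures can be illustrated with a small calculation. The sketch below assumes that failure times are recorded in operating hours on a clock that starts at zero, and that the MTTF sample consists of independent units that each run once until they fail. The function names and the sample figures are invented for the example.

```python
from statistics import mean


def mtbf(failure_times_hours):
    """Mean gap between consecutive failures of one repairable system.

    Assumes repair is immediate, so the operating clock never pauses,
    and that the clock starts at 0 when the system is first switched on.
    """
    times = [0.0] + sorted(failure_times_hours)
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return mean(gaps)


def mttf(lifetimes_hours):
    """Mean lifetime of non-repairable units, each run once until failure."""
    return mean(lifetimes_hours)


# One system that failed at 100 h, 250 h and 450 h: gaps of 100, 150, 200.
print(mtbf([100.0, 250.0, 450.0]))    # 150.0

# Three units that lasted 900 h, 1100 h and 1000 h before failing for good.
print(mttf([900.0, 1100.0, 1000.0]))  # 1000.0
```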
Business Continuity and Cloud Computing
Implementing failure-resilient systems is not easy. Quickly moving operations from one
infrastructure to another during peak load or in the event of a failure is a huge design
challenge. The problem is twofold:
• Allowing new resources (compute, storage, and network) to operate as a part of
  the service
• Maintaining up-to-date copies of the data that the users and customers depend on
This is the key to business continuity in the cloud.