Failover and fencing
The transition from the active namenode to the standby is managed by a new entity in the system called the failover controller. There are various failover controllers, but the default implementation uses ZooKeeper to ensure that only one namenode is active. Each namenode runs a lightweight failover controller process whose job it is to monitor its namenode for failures (using a simple heartbeating mechanism) and trigger a failover should a namenode fail.
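As a rough sketch of how this ZooKeeper-based failover controller is typically enabled (the ZooKeeper hostnames below are placeholders, not values from this text), the relevant configuration looks something like the following:

    <!-- hdfs-site.xml: enable the ZooKeeper-based failover controller (ZKFC) -->
    <property>
      <name>dfs.ha.automatic-failover.enabled</name>
      <value>true</value>
    </property>

    <!-- core-site.xml: ZooKeeper quorum used to elect the single active namenode -->
    <property>
      <name>ha.zookeeper.quorum</name>
      <value>zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181</value>
    </property>

With these properties set, each namenode host also runs the ZKFC process alongside its namenode.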
Failover may also be initiated manually by an administrator, for example, in the case of routine maintenance. This is known as a graceful failover, since the failover controller arranges an orderly transition for both namenodes to switch roles.
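For illustration, a graceful failover is typically requested with the hdfs haadmin tool; the namenode IDs nn1 and nn2 here are placeholders for whatever IDs the cluster configuration defines:

    # Initiate a failover from nn1 (currently active) to nn2:
    # nn1 becomes the standby and nn2 becomes the active namenode
    hdfs haadmin -failover nn1 nn2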
In the case of an ungraceful failover, however, it is impossible to be sure that the failed namenode has stopped running. For example, a slow network or a network partition can trigger a failover transition, even though the previously active namenode is still running and thinks it is still the active namenode. The HA implementation goes to great lengths to ensure that the previously active namenode is prevented from doing any damage and causing corruption, a method known as fencing.
The QJM only allows one namenode to write to the edit log at one time; however, it is still possible for the previously active namenode to serve stale read requests to clients, so setting up an SSH fencing command that will kill the namenode's process is a good idea.
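A minimal sketch of such an SSH fencing setup uses the standard sshfence method, which logs in to the old active namenode's host and kills the namenode process; the private key path is a placeholder:

    <!-- hdfs-site.xml: fence the previously active namenode over SSH before failover -->
    <property>
      <name>dfs.ha.fencing.methods</name>
      <value>sshfence</value>
    </property>
    <property>
      <name>dfs.ha.fencing.ssh.private-key-files</name>
      <value>/home/hdfs/.ssh/id_rsa</value>
    </property>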
Stronger fencing methods are required when using an NFS filer for the shared edit log, since it is not possible to only allow one namenode to write at a time (this is why QJM is recommended). The range of fencing mechanisms includes revoking the namenode's access to the shared storage directory (typically by using a vendor-specific NFS command), and disabling its network port via a remote management command. As a last resort, the previously active namenode can be fenced with a technique rather graphically known as STONITH, or “shoot the other node in the head,” which uses a specialized power distribution unit to forcibly power down the host machine.
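Custom mechanisms like these can be plugged in with the shell fencing method, which runs an arbitrary command and treats a zero exit code as successful fencing; the script path here is hypothetical, standing in for whatever site-specific command revokes NFS access or power-cycles the host:

    <!-- hdfs-site.xml: run a site-specific fencing script (hypothetical path);
         an exit code of 0 is taken to mean the old namenode has been fenced -->
    <property>
      <name>dfs.ha.fencing.methods</name>
      <value>shell(/usr/local/bin/fence-namenode.sh)</value>
    </property>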
Client failover is handled transparently by the client library. The simplest implementation
uses client-side configuration to control failover. The HDFS URI uses a logical hostname
that is mapped to a pair of namenode addresses (in the configuration file), and the client
library tries each namenode address until the operation succeeds.
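A minimal sketch of this client-side configuration, using a made-up nameservice name mycluster and placeholder hosts, looks roughly like this:

    <!-- core-site.xml: clients address the cluster by its logical name -->
    <property>
      <name>fs.defaultFS</name>
      <value>hdfs://mycluster</value>
    </property>

    <!-- hdfs-site.xml: map the logical name to the pair of namenode addresses -->
    <property>
      <name>dfs.nameservices</name>
      <value>mycluster</value>
    </property>
    <property>
      <name>dfs.ha.namenodes.mycluster</name>
      <value>nn1,nn2</value>
    </property>
    <property>
      <name>dfs.namenode.rpc-address.mycluster.nn1</name>
      <value>namenode1.example.com:8020</value>
    </property>
    <property>
      <name>dfs.namenode.rpc-address.mycluster.nn2</name>
      <value>namenode2.example.com:8020</value>
    </property>

    <!-- proxy provider the client library uses to try each namenode in turn -->
    <property>
      <name>dfs.client.failover.proxy.provider.mycluster</name>
      <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>

With this in place, clients simply use paths under hdfs://mycluster and the library handles retrying against whichever namenode is currently active.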