advisable. In addition, setting up a Backup node may help you recover more
quickly in the event of a NameNode failure. The Backup node maintains
its own copy of the FsImage and EditLog. It receives every file system
transaction from the NameNode and applies it to keep its copy of the
FsImage up to date. If the NameNode fails catastrophically, you can use
the Backup node's copy of the FsImage to start up a new NameNode more
quickly.
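To make the idea concrete, the following is a minimal toy model of a Backup node, not Hadoop code. It assumes the FsImage can be represented as a dictionary of paths and the EditLog as a list of transactions; the class names, the `apply_edit` helper, and the operation names (`create`, `delete`, `rename`) are hypothetical and chosen only for illustration.

```python
# Toy model only: this is not how Hadoop stores or ships its metadata.
# The operation names and tuple layout below are hypothetical.

def apply_edit(image, edit):
    """Apply one namespace transaction to an in-memory image (path -> metadata)."""
    op, path = edit[0], edit[1]
    if op == "create":
        image[path] = {"replication": edit[2]}
    elif op == "delete":
        image.pop(path, None)
    elif op == "rename":
        image[edit[2]] = image.pop(path)
    else:
        raise ValueError(f"unknown operation: {op}")


class BackupNode:
    def __init__(self):
        self.image = {}      # its own copy of the FsImage
        self.edit_log = []   # its own copy of the EditLog

    def receive(self, edit):
        # Record the transaction, then bring the local image up to date.
        self.edit_log.append(edit)
        apply_edit(self.image, edit)


class NameNode:
    def __init__(self, backup=None):
        self.image = {}
        self.backup = backup

    def commit(self, edit):
        apply_edit(self.image, edit)
        if self.backup is not None:
            self.backup.receive(edit)  # stream every transaction to the backup


backup = BackupNode()
primary = NameNode(backup)
primary.commit(("create", "/logs/a.txt", 3))
primary.commit(("rename", "/logs/a.txt", "/logs/b.txt"))
primary.commit(("create", "/tmp/x", 3))
primary.commit(("delete", "/tmp/x"))

# After a catastrophic failure, an administrator starts a new NameNode
# from the backup's image. Nothing here happens automatically.
replacement = NameNode()
replacement.image = dict(backup.image)
```

Because the backup applies each transaction as it arrives, its image already matches the failed NameNode's state, so the replacement does not have to replay a long edit log before it can serve requests.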
NOTE
Despite their name, Backup nodes aren't a direct backup to a
NameNode. Rather, they manage the checkpointing process and retain
a backup copy of the FsImage and EditLog. A NameNode cannot fail
over to a Backup node automatically.
NOTE
Hadoop 2.0 includes several improvements to NameNode availability,
including support for Active and Standby NameNodes. These options
make it much easier to run a highly available HDFS cluster.
Data Replication
One of the critical features of HDFS is its support for data replication.
Replication creates redundancy in the data, which allows HDFS to remain
resilient to the failure of one or more nodes. Without this capability, HDFS
could not run reliably on commodity hardware and, as a result, would
require significantly more investment in highly available servers.
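The resilience argument can be illustrated with a small simulation. This sketch assumes a simple round-robin placement of replicas, which is a simplification: real HDFS uses a rack-aware placement policy. The function names and the node and block labels are hypothetical. The replication factor of 3 matches the HDFS default.

```python
import itertools

def place_replicas(blocks, nodes, replication=3):
    """Assign each block to `replication` distinct nodes, round-robin.

    Simplified placement for illustration; not the HDFS placement policy.
    """
    if replication > len(nodes):
        raise ValueError("replication factor exceeds number of nodes")
    placement = {}
    for i, block in enumerate(blocks):
        placement[block] = {nodes[(i + k) % len(nodes)] for k in range(replication)}
    return placement

def readable_blocks(placement, failed):
    """Return the blocks that still have at least one replica on a live node."""
    return {block for block, holders in placement.items() if holders - failed}

nodes = ["dn1", "dn2", "dn3", "dn4", "dn5"]
blocks = [f"blk_{i}" for i in range(10)]

# With three replicas, every block survives any two simultaneous node failures.
replicated = place_replicas(blocks, nodes, replication=3)
survives_any_two = all(
    readable_blocks(replicated, set(pair)) == set(blocks)
    for pair in itertools.combinations(nodes, 2)
)

# With a single copy, losing one node loses every block stored only there.
unreplicated = place_replicas(blocks, nodes, replication=1)
lost = set(blocks) - readable_blocks(unreplicated, {"dn1"})
```

In this model a block becomes unreadable only when every node holding a replica has failed, which is why a higher replication factor tolerates more simultaneous failures at the cost of more storage.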
Data replication also enables better performance for large data sets. By
spreading copies of the data across multiple nodes, the data can be read in
parallel. This enables faster access and processing of large files.
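The parallel read pattern can be sketched as follows. This is a toy in-memory model under stated assumptions, not the HDFS client API: the tiny block size, the `store`, `read_block`, and `parallel_read` functions, and the rule for picking a replica are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

BLOCK_SIZE = 4  # bytes; deliberately tiny so the example stays readable

def split_blocks(data, size=BLOCK_SIZE):
    """Cut a byte string into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]

def store(data, node_names, replication=2):
    """Return {node: {block_index: bytes}} with each block on `replication` nodes."""
    nodes = {name: {} for name in node_names}
    for i, block in enumerate(split_blocks(data)):
        for k in range(replication):
            nodes[node_names[(i + k) % len(node_names)]][i] = block
    return nodes

def read_block(nodes, index):
    """Fetch one block from one of the nodes that holds a replica of it."""
    holders = sorted(name for name, held in nodes.items() if index in held)
    chosen = holders[index % len(holders)]  # spread reads across the holders
    return nodes[chosen][index]

def parallel_read(nodes, block_count, workers=4):
    """Read all blocks concurrently and reassemble them in order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(lambda i: read_block(nodes, i), range(block_count))
        return b"".join(parts)

data = b"the quick brown fox jumps"
cluster = store(data, ["dn1", "dn2", "dn3"])
block_count = len(split_blocks(data))
result = parallel_read(cluster, block_count)
```

Because different blocks are served by different nodes, no single node has to deliver the whole file, which is the source of the performance benefit described above.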