Implementing HDFS Federation
HDFS Federation is a technique for splitting the filesystem namespace into multiple
parts. Each part is managed by its own namenode, so the cluster runs multiple namenodes.
In the following diagram, you will see two namenodes, Namenode-1 (NN1) and
Namenode-2 (NN2).
Each namenode manages a namespace volume that consists of the namespace metadata and
block pool information. The namespace metadata contains the location information of the
files present in that namespace. A block pool is the collection of data blocks that belong
to a single namespace in a Hadoop cluster.
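
As a minimal sketch of the idea, the following Python model treats each namenode as the owner of one slice of the namespace plus its own block pool. The class, the prefix table, the block pool IDs, and the paths are all hypothetical and chosen only for illustration. This is not Hadoop's API, and it assumes the namespace is split by top-level directory.

```python
from dataclasses import dataclass, field


@dataclass
class Namenode:
    """Toy model of one namenode and its namespace volume."""
    name: str
    block_pool_id: str  # hypothetical ID for this namenode's block pool
    # Namespace metadata: file path -> IDs of the blocks that make up the file.
    namespace: dict = field(default_factory=dict)

    def add_file(self, path, block_ids):
        self.namespace[path] = list(block_ids)


def resolve(mounts, path):
    """Return the namenode whose prefix is the longest match for path."""
    matches = [
        prefix for prefix in mounts
        if path == prefix or path.startswith(prefix.rstrip("/") + "/")
    ]
    if not matches:
        raise KeyError(f"no namenode manages {path}")
    return mounts[max(matches, key=len)]


nn1 = Namenode("NN1", "BP-1")
nn2 = Namenode("NN2", "BP-2")
mounts = {"/user": nn1, "/data": nn2}  # assumed split of the namespace

# The file is recorded only in the namespace of the namenode that owns /user.
resolve(mounts, "/user/alice/report.txt").add_file(
    "/user/alice/report.txt", [101, 102]
)
print(resolve(mounts, "/data/logs/app.log").name)  # prints NN2
```

In this model NN2 never learns about files under /user, which mirrors the point that each namenode holds metadata only for its own part of the namespace.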
Both namenodes share the same set of datanodes in the cluster. The datanodes store
blocks for each of the namenodes. However, the two namenodes do not communicate with
each other. The preceding diagram shows only two namenodes; in production
environments, you may have more than two.
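
The shared-datanode arrangement can be pictured with a small Python sketch. It assumes a datanode simply keeps one set of block IDs per block pool; the class, the datanode names, and the pool IDs are hypothetical and do not correspond to Hadoop code.

```python
from collections import defaultdict


class Datanode:
    """Toy datanode that keeps blocks separated by block pool."""

    def __init__(self, name):
        self.name = name
        # Block pool ID -> IDs of the blocks stored for that pool.
        self.pools = defaultdict(set)

    def store(self, pool_id, block_id):
        self.pools[pool_id].add(block_id)

    def blocks_for(self, pool_id):
        """Blocks this datanode holds for one namenode's block pool."""
        return sorted(self.pools.get(pool_id, set()))


# Both namenodes use the same datanodes, each through its own block pool.
datanodes = [Datanode("dn1"), Datanode("dn2")]
datanodes[0].store("BP-1", 101)  # block written via NN1
datanodes[0].store("BP-2", 201)  # block written via NN2
datanodes[1].store("BP-2", 202)

for dn in datanodes:
    print(dn.name, dn.blocks_for("BP-1"), dn.blocks_for("BP-2"))
```

Because each pool is tracked separately, a namenode only ever needs to hear about blocks in its own pool, which is why the namenodes in this sketch never have to talk to each other.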
With such an architecture in place, it is possible to scale the cluster to a large number of
nodes, because the memory of a single namenode is no longer the limiting factor. As a
result, read/write throughput improves significantly, as the load is not on a single
namenode.