Datanodes are the workhorses of the filesystem. They store and retrieve blocks when they are told to (by clients or the namenode), and they report back to the namenode periodically with lists of blocks that they are storing.
Without the namenode, the filesystem cannot be used. In fact, if the machine running the
namenode were obliterated, all the files on the filesystem would be lost since there would
be no way of knowing how to reconstruct the files from the blocks on the datanodes. For
this reason, it is important to make the namenode resilient to failure, and Hadoop provides
two mechanisms for this.
The first way is to back up the files that make up the persistent state of the filesystem
metadata. Hadoop can be configured so that the namenode writes its persistent state to
multiple filesystems. These writes are synchronous and atomic. The usual configuration
choice is to write to local disk as well as a remote NFS mount.
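For example, writing the metadata to both a local disk and an NFS mount is configured through the `dfs.namenode.name.dir` property in hdfs-site.xml, which accepts a comma-separated list of directories; each directory receives a full copy of the namespace metadata (the paths shown here are placeholders):

```xml
<property>
  <name>dfs.namenode.name.dir</name>
  <!-- each listed directory holds a complete, synchronously written copy -->
  <value>/disk1/hdfs/name,/remote/hdfs/name</value>
</property>
```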
It is also possible to run a secondary namenode, which despite its name does not act as a namenode. Its main role is to periodically merge the namespace image with the edit log to prevent the edit log from becoming too large. The secondary namenode usually runs on a separate physical machine because it requires plenty of CPU and as much memory as the namenode to perform the merge. It keeps a copy of the merged namespace image, which can be used in the event of the namenode failing. However, the state of the secondary namenode lags that of the primary, so in the event of total failure of the primary, data loss is almost certain. The usual course of action in this case is to copy the namenode's metadata files that are on NFS to the secondary and run it as the new primary. (Note that it is possible to run a hot standby namenode instead of a secondary, as discussed in HDFS High Availability.)
See The filesystem image and edit log for more details.
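The checkpoint merge described above can be illustrated with a toy model, a sketch only: the namespace image is a snapshot of the filesystem tree, the edit log is a sequence of mutations, and a checkpoint replays the log into the image so the log can be truncated. The function and record names here are illustrative, not Hadoop's actual classes or file formats.

```python
# Toy model of the secondary namenode's checkpoint: replay the edit log
# into the namespace image, producing a fresh image and an empty log.
# The image maps file path -> list of block IDs (a drastic simplification).

def apply_edit(image, edit):
    """Apply one edit-log record to the in-memory namespace image."""
    op = edit[0]
    if op == "create":
        _, path = edit
        image[path] = []              # new file with no blocks yet
    elif op == "add_block":
        _, path, block_id = edit
        image[path].append(block_id)  # allocate a block to the file
    elif op == "delete":
        _, path = edit
        image.pop(path, None)         # remove the file from the namespace
    return image

def checkpoint(image, edit_log):
    """Merge the edit log into the image; return (new_image, empty_log)."""
    merged = dict(image)              # work on a copy, as a real checkpoint would
    for edit in edit_log:
        apply_edit(merged, edit)
    return merged, []

new_image, new_log = checkpoint(
    {"/a": [1]},
    [("create", "/b"), ("add_block", "/b", 2), ("delete", "/a")],
)
# new_image == {"/b": [2]}, new_log == []
```

The point of the merge is exactly this truncation: without it, every namespace change since startup would accumulate in the edit log, and restarting the namenode would require replaying an ever-growing log.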
Block Caching
Normally a datanode reads blocks from disk, but for frequently accessed files the blocks may be explicitly cached in the datanode's memory, in an off-heap block cache. By default, a block is cached in only one datanode's memory, although the number is configurable on a per-file basis. Job schedulers (for MapReduce, Spark, and other frameworks) can take advantage of cached blocks by running tasks on the datanode where a block is cached, for increased read performance. A small lookup table used in a join is a good candidate for caching, for example.
Users or applications instruct the namenode which files to cache (and for how long) by adding a cache directive to a cache pool. Cache pools are an administrative grouping for managing cache permissions and resource usage.
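Using the `hdfs cacheadmin` command, an administrator might create a pool and then add a directive to cache the lookup table used in a join. The pool name and path below are illustrative, and the commands require a running cluster:

```
# create an administrative grouping for cache directives
% hdfs cacheadmin -addPool lookup-tables

# cache the file's blocks for one hour (the TTL is optional)
% hdfs cacheadmin -addDirective -path /user/data/lookup.dat -pool lookup-tables -ttl 1h
```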