design, since in the event of total namenode failure (when there are no recoverable
backups, even from NFS), it allows recovery from a secondary namenode. This can be
achieved either by copying the relevant storage directory to a new namenode or, if the
secondary is taking over as the new primary namenode, by using the -importCheckpoint
option when starting the namenode daemon. The -importCheckpoint option
will load the namenode metadata from the latest checkpoint in the directory defined by the
dfs.namenode.checkpoint.dir property, but only if there is no metadata in the
dfs.namenode.name.dir directory, to ensure that there is no risk of overwriting
precious metadata.
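The safety rule behind -importCheckpoint can be sketched as a small check. This is a minimal illustration in Python, not Hadoop's own implementation: the function name can_import_checkpoint is hypothetical, and "has metadata" is approximated as "the directory exists and contains at least one entry", which is an assumption made for the example.

```python
import os


def can_import_checkpoint(name_dir: str, checkpoint_dir: str) -> bool:
    """Return True only when importing the checkpoint cannot overwrite metadata.

    Hypothetical helper: "has metadata" is approximated here as
    "the directory exists and contains at least one entry".
    """
    def has_entries(path: str) -> bool:
        return os.path.isdir(path) and len(os.listdir(path)) > 0

    # The checkpoint directory must hold something to import ...
    if not has_entries(checkpoint_dir):
        return False
    # ... and the name directory must be empty, so nothing is overwritten.
    return not has_entries(name_dir)
```

Here name_dir stands in for the directory named by dfs.namenode.name.dir and checkpoint_dir for the one named by dfs.namenode.checkpoint.dir. The check refuses the import as soon as the name directory holds anything at all, which mirrors the "no risk of overwriting" guarantee described above.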
Datanode directory structure
Unlike namenodes, datanodes do not need to be explicitly formatted, because they create
their storage directories automatically on startup. Here are the key files and directories:
${dfs.datanode.data.dir}/
├── current
│   ├── BP-526805057-127.0.0.1-1411980876842
│   │   └── current
│   │       ├── VERSION
│   │       ├── finalized
│   │       │   ├── blk_1073741825
│   │       │   ├── blk_1073741825_1001.meta
│   │       │   ├── blk_1073741826
│   │       │   └── blk_1073741826_1002.meta
│   │       └── rbw
│   └── VERSION
└── in_use.lock
HDFS blocks are stored in files with a blk_ prefix; they consist of the raw bytes of a
portion of the file being stored. Each block has an associated metadata file with a .meta
suffix. The metadata file is made up of a header with version and type information,
followed by a series of checksums for sections of the block.
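The naming convention for block and metadata files can be illustrated with a short Python sketch that pairs each blk_ file with its .meta companion. The function name pair_blocks_with_meta is hypothetical, and the patterns are inferred from the filenames in the listing above; the trailing number in the .meta name is treated as an opaque stamp (it is understood to be the block's generation stamp, but the sketch does not depend on that).

```python
import re
from pathlib import Path

# blk_<blockId> is the data file; blk_<blockId>_<stamp>.meta is its metadata.
# Both patterns are inferred from the example listing, not from Hadoop source.
BLOCK_RE = re.compile(r"^blk_(\d+)$")
META_RE = re.compile(r"^blk_(\d+)_(\d+)\.meta$")


def pair_blocks_with_meta(directory):
    """Map each block ID in `directory` to (block file name, meta file name or None)."""
    blocks, metas = {}, {}
    for entry in Path(directory).iterdir():
        if not entry.is_file():
            continue  # skip subdirectories; only this level is examined
        if (m := BLOCK_RE.match(entry.name)):
            blocks[int(m.group(1))] = entry.name
        elif (m := META_RE.match(entry.name)):
            metas[int(m.group(1))] = entry.name
    # A block without a matching .meta file is reported with None.
    return {bid: (name, metas.get(bid)) for bid, name in sorted(blocks.items())}
```

Run against a finalized directory like the one shown, this would report blk_1073741825 alongside blk_1073741825_1001.meta; a block whose metadata file is missing shows up with None, which makes orphaned blocks easy to spot.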
Each block belongs to a block pool, and each block pool has its own storage directory that
is formed from its ID (it's the same block pool ID from the namenode's VERSION file).
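The link between the namenode's VERSION file and the datanode's block pool directory can be checked with a sketch like the following. It assumes that the VERSION file is a plain key=value properties file and that the block pool ID is stored under the key blockpoolID; both function names are hypothetical and exist only for this illustration.

```python
from pathlib import Path


def read_version_properties(version_file):
    """Parse simple key=value lines, skipping blank lines and '#' comments."""
    props = {}
    for line in Path(version_file).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def find_block_pool_dir(namenode_version_file, datanode_data_dir):
    """Return the datanode's block pool directory for the namenode's pool ID.

    Assumes the ID is stored under the key 'blockpoolID' and that the pool
    directory sits at <data dir>/current/<pool ID>, as in the listing above.
    Returns None if the key or the directory is missing.
    """
    pool_id = read_version_properties(namenode_version_file).get("blockpoolID")
    if pool_id is None:
        return None
    candidate = Path(datanode_data_dir) / "current" / pool_id
    return candidate if candidate.is_dir() else None
```

For the example tree, a namenode VERSION file containing blockpoolID=BP-526805057-127.0.0.1-1411980876842 would resolve to the directory of the same name under ${dfs.datanode.data.dir}/current.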
When the number of blocks in a directory grows to a certain size, the datanode creates a
new subdirectory in which to place new blocks and their accompanying metadata. It creates
a new subdirectory every time the number of blocks in a directory reaches 64 (set by
the dfs.datanode.numblocks configuration property). The effect is to have a tree
with high fan-out, so even for systems with a very large number of blocks, the directories
will be only a few levels deep. By taking this measure, the datanode ensures that there is a