filesystem write operation, because writing out the fsimage file, which can grow to be
gigabytes in size, would be very slow. This does not compromise resilience because if the
namenode fails, then the latest state of its metadata can be reconstructed by loading the
latest fsimage from disk into memory, and then applying each of the transactions from the
relevant point onward in the edit log. In fact, this is precisely what the namenode does
when it starts up (see Safe Mode).
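The recovery described above can be sketched as a snapshot load followed by a log replay. This is a minimal illustration, not HDFS code: the `Edit` record, the operation names, and the `recover_namespace` function are hypothetical, and the namespace is reduced to a set of paths.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edit:
    """One edit log transaction (hypothetical, simplified record)."""
    txid: int   # monotonically increasing transaction ID
    op: str     # "mkdir", "create", or "delete" (illustrative op names)
    path: str


def recover_namespace(fsimage_paths, last_txid_in_image, edit_log):
    """Rebuild the in-memory namespace from a checkpoint plus the edit log.

    fsimage_paths: paths recorded in the latest fsimage (the checkpoint).
    last_txid_in_image: highest transaction ID already reflected in it.
    edit_log: iterable of Edit records, possibly including older ones.
    """
    namespace = set(fsimage_paths)  # load the checkpoint into memory
    for edit in sorted(edit_log, key=lambda e: e.txid):
        if edit.txid <= last_txid_in_image:
            continue  # already captured by the fsimage, so skip it
        if edit.op in ("mkdir", "create"):
            namespace.add(edit.path)
        elif edit.op == "delete":
            namespace.discard(edit.path)
        else:
            raise ValueError(f"unknown op: {edit.op}")
    return namespace
```

Only transactions newer than the checkpoint are replayed, which is why a recent fsimage keeps startup short.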
NOTE
Each fsimage file contains a serialized form of all the directory and file inodes in the filesystem. Each
inode is an internal representation of a file or directory's metadata and contains such information as the
file's replication level, modification and access times, access permissions, block size, and the blocks the
file is made up of. For directories, the modification time, permissions, and quota metadata are stored.
An fsimage file does not record the datanodes on which the blocks are stored. Instead, the namenode
keeps this mapping in memory, which it constructs by asking the datanodes for their block lists when
they join the cluster and periodically afterward to ensure the namenode's block mapping is up to date.
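To illustrate how a block-to-datanode mapping can be held purely in memory and rebuilt from block reports, here is a small sketch. The `BlockMap` class and its method names are assumptions made for this example and do not correspond to the namenode's actual classes.

```python
from collections import defaultdict


class BlockMap:
    """In-memory block -> datanodes mapping, rebuilt from block reports.

    Nothing here is persisted: after a restart the map starts empty and
    is repopulated as datanodes report in (hypothetical sketch).
    """

    def __init__(self):
        self._locations = defaultdict(set)  # block ID -> datanode IDs
        self._reported = defaultdict(set)   # datanode ID -> last reported blocks

    def process_block_report(self, datanode_id, block_ids):
        """Replace what we know about one datanode with its latest report."""
        new = set(block_ids)
        old = self._reported[datanode_id]
        # Forget blocks the datanode no longer holds.
        for block_id in old - new:
            self._locations[block_id].discard(datanode_id)
            if not self._locations[block_id]:
                del self._locations[block_id]
        # Record every block in the current report.
        for block_id in new:
            self._locations[block_id].add(datanode_id)
        self._reported[datanode_id] = new

    def locations(self, block_id):
        """Return the set of datanodes currently known to hold a block."""
        return set(self._locations.get(block_id, ()))
```

Because each report fully replaces the previous one for that datanode, periodic reports also correct stale entries.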
As described, the edit log would grow without bound (even if it were spread across several
physical edits files). Though this state of affairs would have no impact on the system
while the namenode is running, if the namenode were restarted, it would take a long time
to apply each of the transactions in its (very long) edit log. During this time, the
filesystem would be offline, which is generally undesirable.
The solution is to run the secondary namenode, whose purpose is to produce checkpoints
of the primary's in-memory filesystem metadata. [77] The checkpointing process proceeds
as follows (and is shown schematically in Figure 11-1 for the edit log and image files
shown earlier):
1. The secondary asks the primary to roll its in-progress edits file, so new edits go to
a new file. The primary also updates the seen_txid file in all its storage
directories.
2. The secondary retrieves the latest fsimage and edits files from the primary (using
HTTP GET).
3. The secondary loads fsimage into memory, applies each transaction from edits,
then creates a new merged fsimage file.
4. The secondary sends the new fsimage back to the primary (using HTTP PUT),
and the primary saves it as a temporary .ckpt file.
5. The primary renames the temporary fsimage file to make it available.
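The five steps above can be mimicked with a small in-memory simulation. This is a sketch under simplifying assumptions, not the HDFS implementation: the `Primary` class, `merge`, and `checkpoint` are hypothetical names, the HTTP GET and PUT transfers are modelled as plain method calls, the namespace is a set of paths, and `seen_txid` is reduced to "the last transaction ID written before the roll".

```python
class Primary:
    """Toy stand-in for the primary namenode's on-disk state (hypothetical)."""

    def __init__(self):
        self.fsimage = (0, frozenset())  # (last txid included, paths)
        self.finalized_edits = []        # rolled (closed) edits segments
        self.in_progress = []            # current in-progress edits segment
        self.next_txid = 1
        self.seen_txid = 0               # simplified meaning, see lead-in
        self.ckpt = None                 # temporary checkpoint image

    def log_edit(self, op, path):
        self.in_progress.append((self.next_txid, op, path))
        self.next_txid += 1

    def roll_edits(self):
        """Step 1: close the in-progress segment so new edits go to a new one."""
        if self.in_progress:
            self.finalized_edits.append(self.in_progress)
        self.in_progress = []
        self.seen_txid = self.next_txid - 1

    def get_image_and_edits(self):
        """Step 2: what the secondary would fetch (HTTP GET in the text)."""
        return self.fsimage, [list(seg) for seg in self.finalized_edits]

    def put_image(self, image):
        """Step 4: receive the merged image and hold it as a temporary file."""
        self.ckpt = image

    def promote_checkpoint(self):
        """Step 5: 'rename' the temporary image to make it the live fsimage."""
        self.fsimage, self.ckpt = self.ckpt, None


def merge(image, segments):
    """Step 3: apply each transaction from the edits to the loaded image."""
    last_txid, paths = image
    paths = set(paths)
    for segment in segments:
        for txid, op, path in segment:
            if txid <= last_txid:
                continue  # already reflected in the image
            if op == "create":
                paths.add(path)
            elif op == "delete":
                paths.discard(path)
            last_txid = txid
    return (last_txid, frozenset(paths))


def checkpoint(primary):
    """Run one checkpoint cycle from the secondary's point of view."""
    primary.roll_edits()
    image, segments = primary.get_image_and_edits()
    primary.put_image(merge(image, segments))
    primary.promote_checkpoint()
```

In this sketch the merge works on copies of the image and edits, so the primary can keep accepting edits into its new in-progress segment while the secondary does the expensive work.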
At the end of the process, the primary has an up-to-date fsimage file and a short in-progress
edits file (it is not necessarily empty, as it may have received some edits while the