manageable number of files per directory, which avoids the problems that most operating
systems encounter when there are a large number of files (tens or hundreds of thousands)
in a single directory.
If the configuration property dfs.datanode.data.dir specifies multiple directories
on different drives, blocks are written in a round-robin fashion. Note that blocks are not
replicated on each drive on a single datanode; instead, block replication is across distinct
datanodes.
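The round-robin placement described above can be illustrated with a small sketch. This is a toy model, not the datanode's actual code: the class name and the directory paths are hypothetical, and details such as the free space on each drive are ignored.

```python
from itertools import cycle


class RoundRobinVolumeChooser:
    """Toy model: hand out data directories in turn, one per new block."""

    def __init__(self, data_dirs):
        if not data_dirs:
            raise ValueError("at least one data directory is required")
        # cycle() repeats the directories endlessly in their original order
        self._dirs = cycle(list(data_dirs))

    def choose(self):
        """Return the directory that should receive the next block."""
        return next(self._dirs)


# Hypothetical value of dfs.datanode.data.dir, one directory per drive
data_dirs = ["/disk1/hdfs/data", "/disk2/hdfs/data", "/disk3/hdfs/data"]

chooser = RoundRobinVolumeChooser(data_dirs)
placements = [chooser.choose() for _ in range(5)]
for block_number, directory in enumerate(placements):
    print(f"block {block_number} -> {directory}")
```

Each block lands on exactly one drive of the datanode; its other replicas would be placed on other datanodes, not on the remaining drives.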
Safe Mode
When the namenode starts, the first thing it does is load its image file (fsimage) into
memory and apply the edits from the edit log. Once it has reconstructed a consistent in-
memory image of the filesystem metadata, it creates a new fsimage file (effectively doing
the checkpoint itself, without recourse to the secondary namenode) and an empty edit log.
During this process, the namenode is running in safe mode, which means that it offers
only a read-only view of the filesystem to clients.
WARNING
Strictly speaking, in safe mode, only filesystem operations that access the filesystem metadata (such as
producing a directory listing) are guaranteed to work. Reading a file will work only when the blocks are
available on the current set of datanodes in the cluster, and file modifications (writes, deletes, or
renames) will always fail.
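The rules in this warning can be summarized as a small decision function. It is only a sketch of the behavior described here: the operation names and the function are hypothetical and do not correspond to any HDFS API.

```python
from enum import Enum


class Op(Enum):
    """Hypothetical operation kinds, named only for this illustration."""
    LIST = "list"      # metadata-only operation, e.g. a directory listing
    READ = "read"      # reading file contents
    WRITE = "write"
    DELETE = "delete"
    RENAME = "rename"


METADATA_OPS = {Op.LIST}
MODIFYING_OPS = {Op.WRITE, Op.DELETE, Op.RENAME}


def allowed_in_safe_mode(op, blocks_available=False):
    """Return True if the operation can succeed while in safe mode.

    blocks_available says whether the file's blocks are on datanodes
    that have already checked in; it only matters for Op.READ.
    """
    if op in MODIFYING_OPS:
        return False           # modifications always fail
    if op in METADATA_OPS:
        return True            # metadata access is guaranteed to work
    return blocks_available    # a read works only if the blocks are reachable


for op in Op:
    print(op.value, allowed_in_safe_mode(op, blocks_available=True))
```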
Recall that the locations of blocks in the system are not persisted by the namenode; this
information resides with the datanodes, in the form of a list of the blocks each one is storing.
During normal operation of the system, the namenode has a map of block locations
stored in memory. Safe mode is needed to give the datanodes time to check in to the
namenode with their block lists, so the namenode can be informed of enough block locations
to run the filesystem effectively. If the namenode didn't wait for enough datanodes to
check in, it would start the process of replicating blocks to new datanodes, which would
be unnecessary in most cases (because it only needed to wait for the extra datanodes to
check in) and would put a great strain on the cluster's resources. Indeed, while in safe
mode, the namenode does not issue any block-replication or deletion instructions to
datanodes.
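To make the check-in process concrete, the following sketch shows how an in-memory map of block locations could be assembled from the block lists that datanodes report. The function name, datanode names, and block IDs are hypothetical; the real namenode data structures are considerably more involved.

```python
from collections import defaultdict


def build_block_map(block_reports):
    """Invert per-datanode block lists into a block -> datanodes map.

    block_reports maps a datanode name to the list of block IDs it stores.
    """
    block_map = defaultdict(set)
    for datanode, block_ids in block_reports.items():
        for block_id in block_ids:
            block_map[block_id].add(datanode)
    return dict(block_map)


# Only two of three datanodes have checked in so far (hypothetical data)
reports = {
    "datanode1": ["blk_1", "blk_2"],
    "datanode2": ["blk_2", "blk_3"],
}
block_map = build_block_map(reports)
for block_id in sorted(block_map):
    print(block_id, sorted(block_map[block_id]))
```

In this example blk_1 appears to have a single replica only because the third datanode has not yet reported. Triggering replication at this point would be wasted work if that datanode checks in a moment later, which is why the namenode waits.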
Safe mode is exited once the minimal replication condition has been reached and an
extension time of 30 seconds has then passed. The minimal replication condition is met
when 99.9% of the blocks in the whole filesystem meet their minimum replication level
(which defaults to 1 and is set by dfs.namenode.replication.min; see Table 11-1).
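The exit condition can be expressed as a short calculation. The sketch below uses the values quoted in the text (99.9%, a minimum replication level of 1, and a 30-second extension) as defaults; the function and parameter names are hypothetical, and treating an empty filesystem as satisfying the condition is an assumption made for this illustration.

```python
def minimal_replication_reached(replica_counts, min_replication=1, threshold=0.999):
    """Check whether enough blocks meet their minimum replication level.

    replica_counts maps each block ID in the filesystem to the number of
    replicas reported so far by datanodes.
    """
    if not replica_counts:
        return True  # assumption: nothing to wait for in an empty filesystem
    safe_blocks = sum(1 for count in replica_counts.values()
                      if count >= min_replication)
    return safe_blocks / len(replica_counts) >= threshold


def can_leave_safe_mode(replica_counts, seconds_since_condition_met,
                        extension_seconds=30):
    """Safe mode ends only after the condition holds and the extension elapses."""
    return (minimal_replication_reached(replica_counts)
            and seconds_since_condition_met >= extension_seconds)


# 2,000 blocks, one of which has no reported replica yet: 99.95% are safe
counts = {f"blk_{i}": 1 for i in range(2000)}
counts["blk_0"] = 0
print(minimal_replication_reached(counts))
print(can_leave_safe_mode(counts, seconds_since_condition_met=10))
print(can_leave_safe_mode(counts, seconds_since_condition_met=30))
```

With these numbers the replication condition is met, but the namenode would still remain in safe mode until the full 30 seconds have elapsed.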