Database Reference
In-Depth Information
In addition to block verification on client reads, each datanode runs a
DataBlock-
Scanner
in a background thread that periodically verifies all the blocks stored on the
datanode. This is to guard against corruption due to “bit rot” in the physical storage me-
dia. See
Datanode block scanner
for details on how to access the scanner reports.
Because HDFS stores replicas of blocks, it can “heal” corrupted blocks by copying one of
the good replicas to produce a new, uncorrupt replica. The way this works is that if a cli-
ent detects an error when reading a block, it reports the bad block and the datanode it was
trying to read from to the namenode before throwing a
ChecksumException
. The na-
menode marks the block replica as corrupt so it doesn't direct any more clients to it or try
to copy this replica to another datanode. It then schedules a copy of the block to be replic-
ated on another datanode, so its replication factor is back at the expected level. Once this
has happened, the corrupt replica is deleted.
It is possible to disable verification of checksums by passing
false
to the
setVeri-
fyChecksum()
method on
FileSystem
before using the
open()
method to read a
file. The same effect is possible from the shell by using the
-ignoreCrc
option with the
-get
or the equivalent
-copyToLocal
command. This feature is useful if you have a
corrupt file that you want to inspect so you can decide what to do with it. For example,
you might want to see whether it can be salvaged before you delete it.
You can find a file's checksum with
hadoop fs -checksum
. This is useful to check
whether two files in HDFS have the same contents — something that
distcp
does, for ex-
ample (see
Parallel Copying with distcp
).
LocalFileSystem
The Hadoop
LocalFileSystem
performs client-side checksumming. This means that
when you write a file called
filename
, the filesystem client transparently creates a hidden
file,
.filename.crc
, in the same directory containing the checksums for each chunk of the
file. The chunk size is controlled by the
file.bytes-per-checksum
property,
which defaults to 512 bytes. The chunk size is stored as metadata in the
.crc
file, so the
file can be read back correctly even if the setting for the chunk size has changed. Check-
sums are verified when the file is read, and if an error is detected,
LocalFileSystem
throws a
ChecksumException
.
Checksums are fairly cheap to compute (in Java, they are implemented in native code),
typically adding a few percent overhead to the time to read or write a file. For most ap-
plications, this is an acceptable price to pay for data integrity. It is, however, possible to
disable checksums, which is typically done when the underlying filesystem supports
checksums natively. This is accomplished by using
RawLocalFileSystem
in place of