LocalFileSystem. To do this globally in an application, it suffices to remap the implementation for file URIs by setting the property fs.file.impl to the value org.apache.hadoop.fs.RawLocalFileSystem. Alternatively, you can directly create a RawLocalFileSystem instance, which may be useful if you want to disable checksum verification for only some reads, for example:
    Configuration conf = ...
    FileSystem fs = new RawLocalFileSystem();
    fs.initialize(null, conf);
ChecksumFileSystem
LocalFileSystem uses ChecksumFileSystem to do its work, and this class makes it easy to add checksumming to other (nonchecksummed) filesystems, as ChecksumFileSystem is just a wrapper around FileSystem. The general idiom is as follows:
    FileSystem rawFs = ...
    FileSystem checksummedFs = new ChecksumFileSystem(rawFs);
The underlying filesystem is called the raw filesystem, and may be retrieved using the getRawFileSystem() method on ChecksumFileSystem. ChecksumFileSystem has a few more useful methods for working with checksums, such as getChecksumFile() for getting the path of a checksum file for any file. Check the documentation for the others.
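To make the wrapper idea concrete, the following is a minimal sketch that uses only the JDK. It is not Hadoop's implementation. It keeps a single CRC-32 for the whole file in a hidden sidecar file and verifies it on read, whereas the real ChecksumFileSystem checksums data in chunks. The class and method names (ChecksumSidecarSketch, checksumPath, write, readVerified) are hypothetical. The ".name.crc" sidecar naming is an assumption modeled on the hidden checksum files that LocalFileSystem writes.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.CRC32;

// Illustrative only: a whole-file CRC-32 stored in a hidden sidecar file.
final class ChecksumSidecarSketch {

    // Hypothetical analogue of getChecksumFile(): maps "dir/name" to "dir/.name.crc".
    static Path checksumPath(Path file) {
        return file.resolveSibling("." + file.getFileName() + ".crc");
    }

    // Computes the CRC-32 of the given bytes.
    static long crc32(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    // Writes the data to the "raw" file, then its checksum to the sidecar file.
    static void write(Path file, byte[] data) throws IOException {
        Files.write(file, data);
        byte[] sum = ByteBuffer.allocate(Long.BYTES).putLong(crc32(data)).array();
        Files.write(checksumPath(file), sum);
    }

    // Reads the data and fails if it no longer matches the stored checksum.
    static byte[] readVerified(Path file) throws IOException {
        byte[] data = Files.readAllBytes(file);
        long expected = ByteBuffer.wrap(Files.readAllBytes(checksumPath(file))).getLong();
        if (crc32(data) != expected) {
            throw new IOException("Checksum mismatch for " + file);
        }
        return data;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("checksum-sketch");
        Path file = dir.resolve("part-00000");

        write(file, "hello".getBytes(StandardCharsets.UTF_8));
        System.out.println("Verified bytes: " + readVerified(file).length);

        // Simulate corruption by overwriting the data without updating the sidecar.
        Files.write(file, "jello".getBytes(StandardCharsets.UTF_8));
        try {
            readVerified(file);
        } catch (IOException e) {
            System.out.println("Detected: " + e.getMessage());
        }
    }
}
```

Reading the file directly with Files.readAllBytes, bypassing readVerified, corresponds loosely to going through the raw filesystem: the data is returned with no verification at all.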
If an error is detected by ChecksumFileSystem when reading a file, it will call its reportChecksumFailure() method. The default implementation does nothing, but LocalFileSystem moves the offending file and its checksum to a side directory on the same device called bad_files. Administrators should periodically check for these bad files and take action on them.
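The quarantine step described above can be sketched with plain JDK file operations. This is only an illustration of the idea, not the code LocalFileSystem runs. The class and method names are hypothetical. Placing bad_files next to the offending file and using a ".name.crc" checksum file are both assumptions made for the example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Illustrative only: move a corrupt file and its checksum file into bad_files.
final class BadFileQuarantineSketch {

    // Moves the file (and its sidecar, if present) into a sibling "bad_files"
    // directory and returns the new location of the data file.
    static Path quarantine(Path file) throws IOException {
        // Assumption: the side directory lives next to the offending file.
        Path badDir = file.resolveSibling("bad_files");
        Files.createDirectories(badDir);

        Path target = badDir.resolve(file.getFileName());
        Files.move(file, target, StandardCopyOption.REPLACE_EXISTING);

        // Assumption: the checksum file is a hidden ".name.crc" sibling.
        Path checksum = file.resolveSibling("." + file.getFileName() + ".crc");
        if (Files.exists(checksum)) {
            Files.move(checksum, badDir.resolve(checksum.getFileName()),
                    StandardCopyOption.REPLACE_EXISTING);
        }
        return target;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("quarantine-sketch");
        Path file = Files.write(dir.resolve("part-00000"), new byte[] {1, 2, 3});
        Files.write(dir.resolve(".part-00000.crc"), new byte[] {0});

        System.out.println("Moved to: " + quarantine(file));
    }
}
```

Because both moves stay within the same parent directory tree, they remain on the same device, which keeps the operation a cheap rename rather than a copy.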