HDFS Concepts
Blocks
A disk has a block size, which is the minimum amount of data that it can read or write. Filesystems for a single disk build on this by dealing with data in blocks, which are an integral multiple of the disk block size. Filesystem blocks are typically a few kilobytes in size, whereas disk blocks are normally 512 bytes. This is generally transparent to the filesystem user, who is simply reading or writing a file of whatever length. However, there are tools to perform filesystem maintenance, such as df and fsck, that operate on the filesystem block level.
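As a quick illustration, the filesystem block size is visible from ordinary application code. The following is a minimal sketch using the standard java.nio API (it queries the filesystem holding the working directory, which is an arbitrary choice; any path on the filesystem of interest would do, and getBlockSize() requires Java 10 or later):

    import java.nio.file.FileStore;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;

    public class BlockSizeProbe {
        public static void main(String[] args) throws Exception {
            // Query the filesystem that holds the current working directory.
            Path path = Paths.get(".");
            FileStore store = Files.getFileStore(path);
            // getBlockSize() reports the filesystem's allocation block size.
            System.out.println("Filesystem block size: "
                    + store.getBlockSize() + " bytes");
        }
    }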
HDFS, too, has the concept of a block, but it is a much larger unit: 128 MB by default. Like in a filesystem for a single disk, files in HDFS are broken into block-sized chunks, which are stored as independent units. Unlike a filesystem for a single disk, a file in HDFS that is smaller than a single block does not occupy a full block's worth of underlying storage. (For example, a 1 MB file stored with a block size of 128 MB uses 1 MB of disk space, not 128 MB.) When unqualified, the term "block" here refers to a block in HDFS.
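To see this concretely, the block size a file was written with can be read back through Hadoop's public FileSystem API. The following is a minimal sketch; the path /user/tom/data.txt is hypothetical, and the block size reported reflects the dfs.blocksize setting in effect when the file was created:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ShowBlockSize {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);

            // Hypothetical file; substitute any path in your cluster.
            FileStatus status = fs.getFileStatus(new Path("/user/tom/data.txt"));

            // The file's length and the block size it was written with.
            System.out.println("File length: " + status.getLen() + " bytes");
            System.out.println("Block size:  " + status.getBlockSize() + " bytes");
        }
    }

Note that a 1 MB file would report a length of 1 MB here even though its block size is 128 MB; the block size is an upper bound on chunk size, not an allocation unit.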
WHY IS A BLOCK IN HDFS SO LARGE?
HDFS blocks are large compared to disk blocks, and the reason is to minimize the cost of seeks. If the block is large enough, the time it takes to transfer the data from the disk can be significantly longer than the time to seek to the start of the block. Thus, transferring a large file made of multiple blocks operates at the disk transfer rate.
A quick calculation shows that if the seek time is around 10 ms and the transfer rate is 100 MB/s, then to make the seek time 1% of the transfer time, we need a block size of around 100 MB: the transfer time must be 100 × 10 ms = 1 s, and 1 s at 100 MB/s is 100 MB. The default is actually 128 MB, although many HDFS installations use larger block sizes. This figure will continue to be revised upward as transfer speeds grow with new generations of disk drives.
This argument shouldn't be taken too far, however. Map tasks in MapReduce normally operate on one block at a time, so if you have too few tasks (fewer tasks than nodes in the cluster), your jobs will run slower than they could otherwise.
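The same back-of-the-envelope calculation, written out as code (the numbers are the illustrative ones from above, not properties of any particular drive):

    public class BlockSizeEstimate {
        public static void main(String[] args) {
            double seekTimeSec = 0.010;      // ~10 ms average seek
            double transferRateMBps = 100.0; // ~100 MB/s sustained transfer
            double seekFraction = 0.01;      // target: seek is 1% of transfer time

            // Transfer time must be seekTime / seekFraction = 1 s,
            // so the block must hold transferRate * 1 s of data.
            double blockSizeMB = transferRateMBps * (seekTimeSec / seekFraction);
            System.out.println("Suggested block size: " + blockSizeMB + " MB"); // 100.0
        }
    }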
Having a block abstraction for a distributed filesystem brings several benefits. The first benefit is the most obvious: a file can be larger than any single disk in the network. There's nothing that requires the blocks from a file to be stored on the same disk, so they can take advantage of any of the disks in the cluster. In fact, it would be possible, if unusual, to store a single file on an HDFS cluster whose blocks filled all the disks in the cluster.
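The spread of a file's blocks across the cluster can be observed directly. Here is a sketch using the FileSystem API (again with a hypothetical path) that lists each block's offset, length, and the datanodes holding a replica:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.BlockLocation;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ListBlockLocations {
        public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());

            // Hypothetical file; any multi-block file will do.
            FileStatus status =
                    fs.getFileStatus(new Path("/user/tom/large-file.dat"));
            BlockLocation[] blocks =
                    fs.getFileBlockLocations(status, 0, status.getLen());

            // Each entry is one block; its replicas may sit on different nodes.
            for (BlockLocation block : blocks) {
                System.out.println("offset=" + block.getOffset()
                        + " length=" + block.getLength()
                        + " hosts=" + String.join(",", block.getHosts()));
            }
        }
    }

For a file larger than any single disk, the output shows consecutive blocks landing on different hosts, which is exactly what makes such a file storable at all.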