HDFS Concepts
Blocks
A disk has a block size, which is the minimum amount of data that it can read or write. Filesystems for a single disk build on this by dealing with data in blocks, which are an integral multiple of the disk block size. Filesystem blocks are typically a few kilobytes in size, whereas disk blocks are normally 512 bytes. This is generally transparent to the filesystem user, who is simply reading or writing a file of whatever length. However, there are tools to perform filesystem maintenance, such as df and fsck, that operate on the filesystem block level.
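As a quick illustration, the filesystem block size is visible from ordinary application code. The following is a minimal sketch using the standard java.nio API (it queries the filesystem holding the working directory, which is an arbitrary choice; any path on the filesystem of interest would do, and getBlockSize() requires Java 10 or later):

    import java.nio.file.FileStore;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;

    public class BlockSizeProbe {
        public static void main(String[] args) throws Exception {
            // Query the filesystem that holds the current working directory.
            Path path = Paths.get(".");
            FileStore store = Files.getFileStore(path);
            // getBlockSize() reports the filesystem's allocation block size.
            System.out.println("Filesystem block size: "
                    + store.getBlockSize() + " bytes");
        }
    }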
HDFS, too, has the concept of a block, but it is a much larger unit: 128 MB by default. Like in a filesystem for a single disk, files in HDFS are broken into block-sized chunks, which are stored as independent units. Unlike a filesystem for a single disk, a file in HDFS that is smaller than a single block does not occupy a full block's worth of underlying storage. (For example, a 1 MB file stored with a block size of 128 MB uses 1 MB of disk space, not 128 MB.) When unqualified, the term "block" here refers to a block in HDFS.
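To see this concretely, the block size a file was written with can be read back through Hadoop's public FileSystem API. The following is a minimal sketch; the path /user/tom/data.txt is hypothetical, and the block size reported reflects the dfs.blocksize setting in effect when the file was created:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ShowBlockSize {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);

            // Hypothetical file; substitute any path in your cluster.
            FileStatus status = fs.getFileStatus(new Path("/user/tom/data.txt"));

            // The file's length and the block size it was written with.
            System.out.println("File length: " + status.getLen() + " bytes");
            System.out.println("Block size:  " + status.getBlockSize() + " bytes");
        }
    }

Note that a 1 MB file would report a length of 1 MB here even though its block size is 128 MB; the block size is an upper bound on chunk size, not an allocation unit.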
WHY IS A BLOCK IN HDFS SO LARGE?
HDFS blocks are large compared to disk blocks, and the reason is to minimize the cost of seeks. If the block is large enough, the time it takes to transfer the data from the disk can be significantly longer than the time to seek to the start of the block. Thus, transferring a large file made of multiple blocks operates at the disk transfer rate.
A quick calculation shows that if the seek time is around 10 ms and the transfer rate is 100 MB/s, then to make the seek time 1% of the transfer time, we need a block size of around 100 MB: the transfer time must be 100 × 10 ms = 1 s, and 1 s at 100 MB/s is 100 MB. The default is actually 128 MB, although many HDFS installations use larger block sizes. This figure will continue to be revised upward as transfer speeds grow with new generations of disk drives.
This argument shouldn't be taken too far, however. Map tasks in MapReduce normally operate on one block at a time, so if you have too few tasks (fewer tasks than nodes in the cluster), your jobs will run slower than they could otherwise.
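The same back-of-the-envelope calculation, written out as code (the numbers are the illustrative ones from above, not properties of any particular drive):

    public class BlockSizeEstimate {
        public static void main(String[] args) {
            double seekTimeSec = 0.010;      // ~10 ms average seek
            double transferRateMBps = 100.0; // ~100 MB/s sustained transfer
            double seekFraction = 0.01;      // target: seek is 1% of transfer time

            // Transfer time must be seekTime / seekFraction = 1 s,
            // so the block must hold transferRate * 1 s of data.
            double blockSizeMB = transferRateMBps * (seekTimeSec / seekFraction);
            System.out.println("Suggested block size: " + blockSizeMB + " MB"); // 100.0
        }
    }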
Having a block abstraction for a distributed filesystem brings several benefits. The first benefit is the most obvious: a file can be larger than any single disk in the network. There's nothing that requires the blocks from a file to be stored on the same disk, so they can take advantage of any of the disks in the cluster. In fact, it would be possible, if unusual, to store a single file on an HDFS cluster whose blocks filled all the disks in the cluster.
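The spread of a file's blocks across the cluster can be observed directly. Here is a sketch using the FileSystem API (again with a hypothetical path) that lists each block's offset, length, and the datanodes holding a replica:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.BlockLocation;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ListBlockLocations {
        public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());

            // Hypothetical file; any multi-block file will do.
            FileStatus status =
                    fs.getFileStatus(new Path("/user/tom/large-file.dat"));
            BlockLocation[] blocks =
                    fs.getFileBlockLocations(status, 0, status.getLen());

            // Each entry is one block; its replicas may sit on different nodes.
            for (BlockLocation block : blocks) {
                System.out.println("offset=" + block.getOffset()
                        + " length=" + block.getLength()
                        + " hosts=" + String.join(",", block.getHosts()));
            }
        }
    }

For a file larger than any single disk, the output shows consecutive blocks landing on different hosts, which is exactly what makes such a file storable at all.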