used, so there were significant performance savings based on the startup costs
for additional mappers.
A Flexible File System for Hadoop: GPFS-FPO
The General Parallel File System (GPFS) was developed by IBM Research in
the 1990s for high-performance computing (HPC) applications. Since its first
release in 1998, GPFS has been used in many of the world's fastest
supercomputers, including Blue Gene, Watson (the Jeopardy! supercomputer), and,
at the time of writing, the fastest supercomputer in the world, IBM Sequoia.
Beyond HPC, GPFS runs in thousands of other mission-critical installations
worldwide, and it has earned an enterprise-grade reputation for extreme
scalability, high performance, and reliability.
The design principles behind HDFS were defined by use cases that
assumed Hadoop workloads would involve sequential reads of very large
file sets (and no random writes to files already in the cluster—just append
writes). In contrast, GPFS has been designed for a wide variety of workloads
and for a multitude of uses.
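To make the difference in access patterns concrete, here is a minimal sketch that uses an ordinary local file as a stand-in for a cluster file system. The function names are hypothetical and do not come from the HDFS or GPFS APIs. The first two functions mimic the workload HDFS was designed around (append writes and front-to-back reads); the third performs the kind of in-place update that a general-purpose POSIX file system such as GPFS allows.

```python
import os
import tempfile


def append_only_write(path, records):
    """HDFS-style write: data is only ever added at the end of the file."""
    with open(path, "ab") as f:
        for record in records:
            f.write(record)


def sequential_read(path, chunk_size=4):
    """Read the file front to back in fixed-size chunks."""
    chunks = []
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            chunks.append(chunk)
    return chunks


def random_write(path, offset, data):
    """POSIX-style write: overwrite bytes in place at an arbitrary offset."""
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(data)


if __name__ == "__main__":
    # A temporary local file stands in for a file stored in the cluster.
    demo_path = os.path.join(tempfile.mkdtemp(), "demo.dat")
    append_only_write(demo_path, [b"AAAA", b"BBBB"])
    print(sequential_read(demo_path))
    random_write(demo_path, 2, b"zz")  # in-place update in the middle of the file
    print(sequential_read(demo_path))
```

An application that needs the third operation has to work around its absence on an append-only file system, typically by rewriting the whole file.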
Extending GPFS for Hadoop: GPFS File Placement Optimization
In 2009, IBM extended GPFS to work with Hadoop. The extension, first called
GPFS-Shared Nothing Cluster, was eventually renamed GPFS-FPO (File Placement
Optimization). GPFS was originally available only as a storage area network
(SAN) file system, which typically isn't suitable for a Hadoop cluster, because
MapReduce jobs perform better when data is stored on the node where it's
processed (which requires locality awareness for the data). In a SAN, the
location of the data is transparent to the compute nodes, so data must travel
over the network to wherever it is processed. That demands a high degree of
network bandwidth and disk I/O, especially in clusters with many nodes.
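The value of locality awareness can be illustrated with a small sketch. This is not Hadoop's scheduler or any GPFS-FPO interface; the function names, node names, and block size are assumptions made for illustration. Each map task is placed on a node that holds a replica of its block when a slot is free there, and falls back to a remote node otherwise. Passing empty replica lists models the SAN case, where no compute node holds the data locally.

```python
def assign_tasks(block_locations, free_slots):
    """Place one map task per block, preferring nodes that hold the block.

    block_locations: dict mapping block id -> list of nodes holding a replica
    free_slots:      dict mapping node name -> number of free task slots
    Returns a dict mapping block id -> (chosen node, is_local).
    """
    slots = dict(free_slots)  # work on a copy so the caller's dict is untouched
    assignment = {}
    for block, replicas in block_locations.items():
        local = [n for n in replicas if slots.get(n, 0) > 0]
        if local:
            node, is_local = local[0], True
        else:
            remote = [n for n, free in slots.items() if free > 0]
            if not remote:
                raise RuntimeError("no free task slots left in the cluster")
            node, is_local = remote[0], False
        slots[node] -= 1
        assignment[block] = (node, is_local)
    return assignment


def remote_bytes(assignment, block_size):
    """Bytes that must cross the network because a task runs off-node."""
    return sum(block_size for _, is_local in assignment.values() if not is_local)


if __name__ == "__main__":
    slots = {"n1": 1, "n2": 1, "n3": 1}
    block_size = 128  # hypothetical block size in MB

    # Locality-aware layout: the file system reports where each block lives.
    aware = assign_tasks({"b1": ["n1"], "b2": ["n2"], "b3": ["n1"]}, slots)
    print(aware, remote_bytes(aware, block_size))

    # SAN-style layout: no node holds data locally, so every read is remote.
    san = assign_tasks({"b1": [], "b2": [], "b3": []}, slots)
    print(san, remote_bytes(san, block_size))
```

In the locality-aware case only the task that could not get a slot next to its data reads over the network; in the SAN-style case every task does, which is why network traffic grows with the size of the cluster.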
By storing your Hadoop data in GPFS-FPO, you are free from the
design-based restrictions that are inherent in HDFS. You can take advantage of
GPFS-FPO's pedigree as a multipurpose file system in your Hadoop cluster,
which gives you tremendous flexibility. A significant architectural difference
 