Chapter 9
GPFS
Dean Hildebrand and Frank Schmuck
IBM Research
9.1 Motivation
9.2 Design and Architecture
9.2.1 Shared Storage Model
9.2.2 Design Overview
9.2.3 Distributed Locking and Metadata Management
9.2.3.1 The Distributed Lock Manager
9.2.3.2 Metadata Management
9.2.3.3 Concurrent Directory Updates
9.2.4 Advanced Data Management
9.2.4.1 GPFS Native RAID
9.2.4.2 Information Lifecycle Management
9.2.4.3 Wide-Area Caching and Replication
9.3 Deployment and Usage
9.3.1 Usage Examples
9.4 Conclusion
Bibliography
9.1 Motivation
GPFS is IBM's parallel file system for high performance computing
and data-intensive applications [7, 10]. GPFS is designed to enable seamless
and efficient data sharing between nodes within and across small-to-large
clusters using standard file system APIs and standard POSIX [1] semantics. At
the same time, GPFS exploits the parallelism and redundancy available in a
cluster to provide the large capacity, high performance, and high availability
required by the most demanding scientific and commercial applications.
GPFS originates from an IBM research project in the early 1990s called
TigerShark, which focused on high performance lossless streaming of
multimedia video files [6]. It was subsequently extended to serve as a
general-purpose file system for high performance computing applications on IBM's Scalable