The majority of Lustre deployments are single-site; however, a number of prominent Lustre users have deployed Lustre over wide-area networks (WANs) in order to share data between geographically distant locations. The Naval Research Laboratory [8] pioneered the deployment of Lustre using RDMA over the WAN to create a globally accessible storage and compute cloud.
Indiana University, funded by the National Science Foundation, created the
"Data Capacitor," a 535-TB Lustre file system connected to wide-area networks, first across 10 Gigabit connections [13], and most recently across 100 Gigabit connections [7]. The latest Data Capacitor (DC II) is 10 times larger
and roughly 3 times faster than its predecessor and continues to run Lustre
because of its speed and scalability.
8.4 Conclusion
The Lustre file system has grown from a DOE research initiative to be-
come the extreme-scale HPC file system of choice. Large Lustre deployments
have reported tens of thousands of clients, tens of petabytes of capacity, and
sustained write/read speeds measured in terabytes per second. For example,
the "Spider" file system deployed at ORNL has 26,000 clients [2]; the Sequoia
system at LLNL has 55 PB of storage capacity [15]; and the Fujitsu K Sys-
tem has reported 1.2 TB/s sustained write and over 2.0 TB/s sustained read
performance [16].
In response to the increasing momentum behind "big data" and especially
a growing interest in MapReduce as a tool for analysis in the HPC community,
a corresponding interest has developed in providing a Hadoop runtime that can
exploit Lustre storage effectively. In recent developments, a storage abstraction
that allows Hadoop to use Lustre in place of HDFS has been shown to double
I/O performance [9] while retaining the benefits of full POSIX access so that
datasets can be analyzed in situ.
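
The particular storage abstraction evaluated in [9] is not reproduced here, but a minimal sketch illustrates the general idea: Hadoop's pluggable FileSystem layer can be pointed at a POSIX mount point rather than at HDFS, so jobs read and write files that live directly on Lustre. The mount point /mnt/lustre and the file path below are hypothetical, and a production Lustre adapter would differ from the plain local-file-system scheme used here.

```java
// Sketch only: shows Hadoop's FileSystem abstraction operating on a
// POSIX-mounted path (assumed Lustre mount at /mnt/lustre) instead of HDFS.
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class LustreAsHadoopStorage {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Use the local (POSIX) file system scheme rather than hdfs://;
        // Lustre is assumed to be mounted at the same path on every node.
        conf.set("fs.defaultFS", "file:///");

        FileSystem fs = FileSystem.get(conf);
        Path data = new Path("/mnt/lustre/jobs/demo/part-00000"); // hypothetical path

        // Write through the Hadoop FileSystem API; the bytes land on Lustre
        // and remain visible to ordinary POSIX tools.
        try (FSDataOutputStream out = fs.create(data, true)) {
            out.writeBytes("hello from hadoop on lustre\n");
        }

        // Read the same file back through the same abstraction.
        try (FSDataInputStream in = fs.open(data)) {
            byte[] buf = new byte[(int) fs.getFileStatus(data).getLen()];
            in.readFully(buf);
            System.out.print(new String(buf, StandardCharsets.UTF_8));
        }
    }
}
```

Because the data remain ordinary POSIX files on Lustre, the same outputs can be inspected or post-processed with standard tools, without first being copied out of a separate HDFS namespace.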
Lustre is also being used in the DOE Fast Forward Storage and I/O project
to leverage research in Exascale storage systems [1]. These systems will push
I/O scaling limits to the absolute extreme and require a drastic rethink of POSIX storage semantics. On the surface, therefore, these systems will appear to have very little in common with Lustre. However, distributed, shared-nothing object storage will remain a foundational system component, as it is in Lustre today. As the Fast Forward prototype evolves into tomorrow's
production Exascale storage systems, this underlying architecture, includ-
ing some of the actual source code, will be able to trace its roots back to
Lustre.
 