White Paper: High Performance, Open Source, Dell Lustre Storage System
Date: 7/1/2010
The following paper was produced by the Dell™ Cambridge HPC Solution Centre and is based on operational experience gained from using mass storage technologies within a production university high-performance computing (HPC) environment.
The paper describes in detail how to build a commodity Dell Lustre™ storage brick and reports the performance characteristics measured when the brick is integrated into a Gigabit Ethernet storage network. It also discusses the operational characteristics and system administration best practices derived from over two years of production usage. The performance data show good I/O (input/output) throughput using the Lustre file system on top of the Dell storage brick, yielding 80 percent of the bare-metal MD3000 storage array performance. Each Dell Lustre storage brick provides 400 MB/s of read/write I/O bandwidth through the file system layer, and this performance scales linearly with each additional storage brick. Over Gigabit Ethernet, each client achieves an I/O bandwidth of 100 MB/s, and the aggregate bandwidth scales linearly as clients are added until the back-end bandwidth is saturated. We have scaled such a system to hundreds of terabytes, with over 2 GB/s total back-end storage I/O performance and 600 clients.
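The scaling behaviour described above can be illustrated with a small back-of-the-envelope model. This is a minimal sketch, not a measurement from the paper: it assumes perfectly linear scaling on both sides, uses the quoted figures of 100 MB/s per client and 400 MB/s per brick as defaults, and the function names are hypothetical.

```python
import math


def aggregate_bandwidth_mb_s(clients, bricks,
                             per_client_mb_s=100, per_brick_mb_s=400):
    """Idealised aggregate throughput in MB/s.

    Assumption: throughput grows linearly with the number of clients
    until it reaches the combined bandwidth of the storage bricks.
    """
    if clients < 0 or bricks < 0:
        raise ValueError("clients and bricks must be non-negative")
    client_side = clients * per_client_mb_s   # what the clients could drive
    back_end = bricks * per_brick_mb_s        # what the bricks can deliver
    return min(client_side, back_end)


def clients_to_saturate(bricks, per_client_mb_s=100, per_brick_mb_s=400):
    """Smallest client count at which the back end becomes the bottleneck."""
    return math.ceil(bricks * per_brick_mb_s / per_client_mb_s)


if __name__ == "__main__":
    # A 6-brick system, matching the configuration size mentioned in the text.
    for n in (1, 10, 24, 600):
        print(n, "clients:", aggregate_bandwidth_mb_s(n, bricks=6), "MB/s")
    print("saturation at", clients_to_saturate(6), "clients")
```

Under these assumptions, a 6-brick system has a back-end ceiling of 2400 MB/s, which is consistent with the "over 2 GB/s" figure quoted above. Real systems will fall short of this idealised line because of network contention and metadata overhead.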
The Dell Lustre storage brick shows good overall performance and scalability, with a high data security and availability record in production. Several years of usage experience indicate that it is a good fit for departmental and workgroup HPC installations. A large 270 TB (6-brick) configuration within the Cambridge production environment has demonstrated very good operational characteristics, with unscheduled downtime of less than 0.5 percent over two years of 24x7 service.
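To put the availability and capacity figures in concrete terms, the arithmetic can be sketched as follows. This is only an illustration: it assumes 365-day years and an even split of capacity across bricks, neither of which is stated in the paper, and the function names are hypothetical.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: non-leap years


def max_downtime_hours(downtime_fraction, years):
    """Upper bound on downtime in hours for a given downtime fraction."""
    return downtime_fraction * years * HOURS_PER_YEAR


def capacity_per_brick_tb(total_tb, bricks):
    """Average capacity per brick, assuming capacity is split evenly."""
    return total_tb / bricks


if __name__ == "__main__":
    hours = max_downtime_hours(0.005, years=2)
    print(f"0.5% over two years is at most about {hours:.1f} hours")
    print(f"{capacity_per_brick_tb(270, 6):.0f} TB per brick on average")
```

On these assumptions, "less than 0.5 percent" corresponds to under roughly 88 hours of unscheduled downtime in two years, and the 270 TB configuration averages 45 TB per brick.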
Note: Lustre is a registered trademark of Oracle Corporation.
