Dell

Customizing Management of High-Performance Computing Clusters

High-performance computing clusters (HPCCs) consist of complex hardware, firmware and software. In Dell™ HPCCs, the hardware typically includes the latest generation of dual-core Intel® Xeon® processors, the latest chipsets, frontside buses with high data transfer rates, the latest memory architectures such as quad-channel fully buffered Double Data Rate 2 (DDR2) memory and high-performance PCI Express buses. The firmware comprises the Intelligent Platform Management Interface (IPMI), System Management BIOS (SMBIOS) and other industry-standard specifications. The software includes instrumentation services, Advanced Configuration and Power Interface (ACPI) and OpenIPMI modules integrated into the Linux® OS extended kernel, as well as the latest Dell OpenManage suite, which supports hardware management from the circuit level up to cluster-level, out-of-band management.

In addition to these elements, the Platform Open Cluster Stack (OCS) cluster computing software stack (formerly called Platform Rocks) supports cluster software deployment, resource management and monitoring of cluster-level performance, cluster job state and resource utilization. This article addresses frequently used HPCC management components and best practices for large-scale deployments of HPCCs.