Resource-Aware Scientific Computation on a Heterogeneous Cluster
- 7 March 2005
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in Computing in Science & Engineering
- Vol. 7 (2), 40-50
- https://doi.org/10.1109/mcse.2005.38
Abstract
Although researchers can develop software on small, local clusters and move it later to larger clusters and supercomputers, the software must run efficiently in both environments. Two efforts aim to improve the efficiency of scientific computation on clusters through resource-aware dynamic load balancing. The popularity of cost-effective clusters built from commodity hardware has opened up a new platform for the execution of software originally designed for tightly coupled supercomputers. Because these clusters can be built to include any number of processors ranging from fewer than 10 to thousands, researchers in high-performance scientific computation at smaller institutions or in smaller departments can maintain local parallel computing resources to support software development and testing, then move the software to larger clusters and supercomputers. As promising as this ability is, it has also led to the need for local expertise and resources to set up and maintain these clusters. The software must execute efficiently both on smaller local clusters and on larger ones. These computing environments vary in the number of processors, speed of processing and communication resources, and size and speed of memory throughout the memory hierarchy as well as in the availability of support tools and preferred programming paradigms. Software developed and optimized using a particular computing environment might not be as efficient when it's moved to another one. In this article, we describe a small cluster along with two efforts to improve the efficiency of parallel scientific computation on that cluster. Both approaches modify the dynamic load-balancing step of an adaptive solution procedure to tailor the distribution of data across the cooperating processes. This modification helps account for the heterogeneity and hierarchy in various computing environments.
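The abstract does not spell out how the two approaches tailor the data distribution, so the following is only a minimal sketch of the general idea of resource-aware partitioning, not the authors' method. It assumes each cooperating process can be summarized by a single relative capacity number (for example, a measured processing speed); the function name `weighted_partition_sizes` and the capacity values are hypothetical and chosen purely for illustration.

```python
def weighted_partition_sizes(total_items, capacities):
    """Split total_items into integer shares proportional to capacities.

    capacities holds one hypothetical relative capacity per process
    (an assumption of this sketch, not a quantity defined in the article).
    """
    if total_items < 0:
        raise ValueError("total_items must be non-negative")
    if not capacities or any(c <= 0 for c in capacities):
        raise ValueError("capacities must be a non-empty list of positive numbers")

    total_capacity = sum(capacities)
    # Ideal (possibly fractional) share for each process.
    exact = [total_items * c / total_capacity for c in capacities]
    # Round every share down, then hand out the leftover items one at a time
    # to the processes with the largest fractional remainders.
    sizes = [int(x) for x in exact]
    leftover = total_items - sum(sizes)
    order = sorted(range(len(capacities)),
                   key=lambda i: exact[i] - sizes[i],
                   reverse=True)
    for i in order[:leftover]:
        sizes[i] += 1
    return sizes


if __name__ == "__main__":
    # Two slower nodes and one node assumed to be twice as fast.
    print(weighted_partition_sizes(1000, [1.0, 1.0, 2.0]))
```

In a homogeneous setting all capacities are equal and the shares reduce to an even split; on a heterogeneous cluster, the faster process receives proportionally more data. A real load balancer would also have to weigh communication cost and memory limits, which this sketch ignores.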