Utilization, predictability, workloads, and user runtime estimates in scheduling the IBM SP2 with backfilling
- 1 June 2001
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Parallel and Distributed Systems
- Vol. 12 (6), 529-543
- https://doi.org/10.1109/71.932708
Abstract
Scheduling jobs on the IBM SP2 system and many other distributed-memory MPPs is usually done by giving each job a partition of the machine for its exclusive use. Allocating such partitions in the order in which the jobs arrive (FCFS scheduling) is fair and predictable, but suffers from severe fragmentation, leading to low utilization. This situation led to the development of the EASY scheduler, which uses aggressive backfilling: small jobs are moved ahead to fill in holes in the schedule, provided they do not delay the first job in the queue. We compare this approach with a more conservative approach in which small jobs move ahead only if they do not delay any job in the queue, and show that the relative performance of the two schemes depends on the workload. For workloads typical on SP2 systems, the aggressive approach is indeed better, but, for other workloads, both algorithms are similar. In addition, we study the sensitivity of backfilling to the accuracy of the runtime estimates provided by the users and find a very surprising result: backfilling actually works better when users overestimate the runtime by a substantial factor.
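The aggressive backfilling rule described in the abstract can be illustrated with a minimal sketch of a single scheduling decision. This is not the EASY scheduler's actual code: the `Job` class, the `easy_backfill` function, and its parameters are hypothetical names chosen for illustration, and the sketch assumes that running jobs end exactly at their estimated end times. It reserves processors only for the first queued job and lets later jobs start if they either finish before that reservation or use only processors the first job will not need.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    procs: int        # processors requested
    estimate: float   # user-supplied runtime estimate


def easy_backfill(now, free, running, queue):
    """Return the names of queued jobs to start at time `now`.

    running: list of (estimated_end_time, procs) for executing jobs.
    queue:   list of Job in arrival (FCFS) order.
    """
    started = []
    running = list(running)
    queue = list(queue)

    # Step 1: plain FCFS. Start jobs from the head while they fit.
    while queue and queue[0].procs <= free:
        job = queue.pop(0)
        free -= job.procs
        running.append((now + job.estimate, job.procs))
        started.append(job.name)
    if not queue:
        return started

    # Step 2: reservation for the first job that does not fit.
    # `shadow` is the earliest time enough processors are expected to be free.
    head = queue[0]
    avail, shadow = free, None
    for end, procs in sorted(running):
        avail += procs
        if avail >= head.procs:
            shadow = end
            break
    if shadow is None:       # head asks for more than the whole machine
        return started
    extra = avail - head.procs   # processors the head job will not need

    # Step 3: backfill later jobs that cannot delay the head job.
    for job in queue[1:]:
        if job.procs > free:
            continue                         # does not fit right now
        if now + job.estimate <= shadow:
            pass                             # ends before the reservation
        elif job.procs <= extra:
            extra -= job.procs               # runs on spare processors
        else:
            continue                         # would delay the head job
        free -= job.procs
        started.append(job.name)
    return started


queue = [Job("A", 8, 5), Job("B", 4, 20), Job("C", 2, 50), Job("D", 2, 5)]
decision = easy_backfill(now=0, free=4, running=[(10, 6)], queue=queue)
```

In this example, on a 10-processor machine with 6 processors busy until time 10, job A (8 processors) must wait, B would delay A and is skipped, and `decision` is `["C", "D"]`: C runs on the 2 processors A will not need, and D is expected to finish before A's reservation begins. A conservative variant would instead keep a reservation for every queued job, not just the first.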
This publication has 20 references indexed in Scilit:
- The elusive goal of workload characterization. ACM SIGMETRICS Performance Evaluation Review, 1999
- Scheduling of a parallel workload: Implementation and use of the Argonne EASY scheduler at PDC. Published by Springer Nature, 1998
- Predicting application run times using historical information. Published by Springer Nature, 1998
- A comparative study of real workload traces and synthetic workload models for parallel job scheduling. Published by Springer Nature, 1998
- Using queue time predictions for processor allocation. Published by Springer Nature, 1997
- The EASY - LoadLeveler API project. Published by Springer Nature, 1996
- Packing schemes for gang scheduling. Published by Springer Nature, 1996
- Workload evolution on the Cornell Theory Center IBM SP2. Published by Springer Nature, 1996
- Job scheduling is more important than processor allocation for hypercube computers. IEEE Transactions on Parallel and Distributed Systems, 1994
- A dynamic processor allocation policy for multiprogrammed shared-memory multiprocessors. ACM Transactions on Computer Systems, 1993