Space-efficient scheduling of nested parallelism
- 1 January 1999
- journal article
- Published by Association for Computing Machinery (ACM) in ACM Transactions on Programming Languages and Systems
- Vol. 21 (1), 138-173
- https://doi.org/10.1145/314602.314607
Abstract
Many of today's high-level parallel languages support dynamic, fine-grained parallelism. These languages allow the user to expose all the parallelism in the program, which is typically of a much higher degree than the number of processors. Hence an efficient scheduling algorithm is required to assign computations to processors at runtime. Besides having low overheads and good load balancing, it is important for the scheduling algorithm to minimize the space usage of the parallel program. This article presents an on-line scheduling algorithm that is provably space efficient and time efficient for nested-parallel languages. For a computation with depth D and serial space requirement S1, the algorithm generates a schedule that requires at most S1 + O(K·D·p) space (including scheduler space) on p processors. Here, K is a user-adjustable runtime parameter specifying the net amount of memory that a thread may allocate before it is preempted by the scheduler. Adjusting the value of K provides a trade-off between the running time and the memory requirement of a parallel computation. To allow the scheduler to scale with the number of processors, we also parallelize the scheduler and analyze the space and time bounds of the computation to include scheduling costs. In addition to showing that the scheduling algorithm is space and time efficient in theory, we demonstrate that it is effective in practice. We have implemented a runtime system that uses our algorithm to schedule lightweight parallel threads. The results of executing parallel programs on this system show that our scheduling algorithm significantly reduces memory usage compared to previous techniques, without compromising performance.
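The K-threshold preemption idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation; the names (`Thread`, `Scheduler`, `K`) and the single ready queue are illustrative assumptions. Each thread tracks its net allocation since it was last scheduled and is preempted once that exceeds K bytes:

```python
# Hypothetical sketch of K-threshold preemption: a thread yields back to
# the scheduler once it has allocated more than K bytes of net memory
# since it was last scheduled. Illustrative only, not the paper's code.
from collections import deque

K = 1024  # user-adjustable preemption threshold, in bytes


class Thread:
    def __init__(self, steps):
        # steps: net allocation (bytes) performed at each step, in order
        self.steps = deque(steps)

    def run(self, budget):
        """Run until completion or until net allocation exceeds budget."""
        allocated = 0
        while self.steps:
            if allocated + self.steps[0] > budget:
                return False  # preempted: allocation budget exhausted
            allocated += self.steps.popleft()
        return True  # ran to completion


class Scheduler:
    def __init__(self, threads):
        self.ready = deque(threads)
        self.preemptions = 0

    def run_all(self):
        while self.ready:
            t = self.ready.popleft()
            if not t.run(K):
                self.preemptions += 1
                self.ready.append(t)  # requeue; resumes with a fresh budget
        return self.preemptions
```

A smaller K bounds the memory a thread can hold between scheduling points more tightly, at the cost of more preemptions (scheduler overhead); a larger K does the opposite, which is the time/space trade-off the abstract describes.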
This publication has 26 references indexed in Scilit:
- Implementation of a Portable Nested Data-Parallel Language. Journal of Parallel and Distributed Computing, 1994
- Trapezoid self-scheduling: a practical scheduling scheme for parallel compilers. IEEE Transactions on Parallel and Distributed Systems, 1993
- Factoring: a method for scheduling parallel loops. Communications of the ACM, 1992
- Lazy task creation: a technique for increasing the granularity of parallel programs. IEEE Transactions on Parallel and Distributed Systems, 1991
- A report on the SISAL language project. Journal of Parallel and Distributed Computing, 1990
- I-structures: data structures for parallel computing. ACM Transactions on Programming Languages and Systems, 1989
- WorkCrews: An abstraction for controlling parallelism. International Journal of Parallel Programming, 1988
- Storage management in virtual tree machines. IEEE Transactions on Computers, 1988
- MULTILISP: a language for concurrent symbolic computation. ACM Transactions on Programming Languages and Systems, 1985
- A taxonomy of problems with fast parallel algorithms. Information and Control, 1985