Automated learning of load-balancing strategies in multiprogrammed distributed systems
- 1 July 1997
- journal article
- research article
- Published by Taylor & Francis in International Journal of Systems Science
- Vol. 28 (11) , 1077-1099
- https://doi.org/10.1080/00207729708929470
Abstract
Dynamic load-balancing strategies for distributed systems seek to improve average completion time of independent tasks by migrating each incoming task to the site where it is expected to finish the fastest: usually the site having the smallest load index. SMALL is an offline learning system for developing configuration-specific load-balancing strategies; it learns new load indices as well as tunes the parameters of given migration policies. Using a dynamic workload generator, a number of typical systemwide load patterns are first recorded; the completion times of several benchmark jobs are then measured at each site, under each of the recorded load patterns. These measurements are used to train comparator neural networks simultaneously, one per site. The comparators collectively model a set of perfect load indices in that they seek to rank, at arrival time, the possible destinations for an incoming task by their (not yet known) respective completion times. The numerous parameters of the decentralized dynamic load-balancing policy are then tuned using a genetic algorithm. We present experimental results for a mix of scientific and interactive workloads on Sun workstations connected by Ethernet. The policies tuned by SMALL are shown to exploit idle resources intelligently and effectively.
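The ranking idea in the abstract, ordering candidate destinations by comparing them pairwise, can be illustrated with a minimal Python sketch. This is not SMALL's implementation: the function `rank_sites`, the site names, the load feature vectors, and `stand_in_comparator` are all hypothetical, and a single hand-written comparator stands in for the trained per-site comparator networks the abstract describes.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical comparator type: given the load features of two sites, return
# a positive number if the task is predicted to finish sooner at the first.
Comparator = Callable[[Sequence[float], Sequence[float]], float]


def rank_sites(loads: Dict[str, Sequence[float]], compare: Comparator) -> List[str]:
    """Order sites from most to least promising by counting pairwise wins."""
    wins = {site: 0 for site in loads}
    for a in loads:
        for b in loads:
            if a != b and compare(loads[a], loads[b]) > 0:
                wins[a] += 1
    # Most wins first; ties are broken by site name for a stable order.
    return sorted(loads, key=lambda s: (-wins[s], s))


def stand_in_comparator(x: Sequence[float], y: Sequence[float]) -> float:
    # Assumption: a placeholder for a trained network that simply prefers
    # the site with the lower summed load features.
    return sum(y) - sum(x)


# Made-up load measurements for three sites (illustration only).
loads = {
    "site_a": [0.9, 0.4],
    "site_b": [0.2, 0.1],
    "site_c": [0.5, 0.5],
}

ranking = rank_sites(loads, stand_in_comparator)
destination = ranking[0]  # the incoming task would be sent to this site
print(ranking, destination)
```

Counting pairwise wins is only one way to turn comparisons into a ranking; it is used here because it needs nothing beyond the comparator itself.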