Utopia: A load sharing facility for large, heterogeneous distributed computer systems
- 1 December 1993
- journal article
- research article
- Published by Wiley in Software: Practice and Experience
- Vol. 23 (12) , 1305-1336
- https://doi.org/10.1002/spe.4380231203
Abstract
Load sharing in large, heterogeneous distributed systems allows users to access vast amounts of computing resources scattered around the system and may provide substantial performance improvements to applications. We discuss the design and implementation issues in Utopia, a load sharing facility specifically built for large and heterogeneous systems. The system has no restriction on the types of tasks that can be remotely executed, involves few application changes and no operating system change, supports a high degree of transparency for remote task execution, and incurs low overhead. The algorithms for managing resource load information and task placement take advantage of the clustering nature of large‐scale distributed systems; centralized algorithms are used within host clusters, and directed graph algorithms are used among the clusters to make Utopia scalable to thousands of hosts. Task placements in Utopia exploit the heterogeneous hosts and consider varying resource demands of the tasks. A range of mechanisms for remote execution is available in Utopia that provides varying degrees of transparency and efficiency.

A number of applications have been developed for Utopia, ranging from a load sharing command interpreter, to parallel and distributed applications, to a distributed batch facility. For example, an enhanced Unix command interpreter allows arbitrary commands and user jobs to be executed remotely, and a parallel make facility achieves speed‐ups of 15 or more by processing a collection of tasks in parallel on a number of hosts.
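The two-level placement scheme summarized in the abstract (a centralized decision within each host cluster, and a directed graph linking clusters) can be illustrated with a small sketch. This is not Utopia's actual algorithm or interface: the `Host` and `Cluster` classes, the `place_task` function, the load score, and the breadth-first traversal are all hypothetical choices made for illustration, assuming each host reports a relative CPU speed, a run-queue length, and free memory.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    cpu_speed: float   # relative speed factor (hypothetical unit)
    run_queue: float   # current load, e.g. average run-queue length
    free_mem_mb: int   # memory currently available on the host


@dataclass
class Cluster:
    name: str
    hosts: list
    # Directed edges: clusters this cluster may forward tasks to.
    neighbors: list = field(default_factory=list)

    def best_host(self, mem_needed_mb):
        """Centralized placement within one cluster (illustrative only)."""
        eligible = [h for h in self.hosts if h.free_mem_mb >= mem_needed_mb]
        if not eligible:
            return None
        # Lower score is better: queued work normalized by host speed,
        # so a fast, lightly loaded host wins over a slow, idle-ish one.
        return min(eligible, key=lambda h: (h.run_queue + 1) / h.cpu_speed)


def place_task(start, mem_needed_mb):
    """Try the local cluster first, then follow directed edges breadth-first."""
    seen = {start.name}
    queue = deque([start])
    while queue:
        cluster = queue.popleft()
        host = cluster.best_host(mem_needed_mb)
        if host is not None:
            return cluster.name, host.name
        for nxt in cluster.neighbors:
            if nxt.name not in seen:
                seen.add(nxt.name)
                queue.append(nxt)
    return None  # no reachable cluster can satisfy the demand


# Assumed example topology: cluster A may forward to cluster B, not vice versa.
a = Cluster("A", [Host("a1", 1.0, 2.0, 64), Host("a2", 2.0, 1.0, 32)])
b = Cluster("B", [Host("b1", 4.0, 0.0, 256)])
a.neighbors.append(b)

small = place_task(a, 16)    # fits locally in cluster A
large = place_task(a, 128)   # exceeds A's memory, forwarded to B
```

In this sketch a task with a small memory demand stays in its home cluster, while one that no local host can satisfy is forwarded along the directed edge to a neighboring cluster. Only cluster-level queries cross cluster boundaries, which is the property the abstract credits for scalability.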