Overcoming the startup time problem in distributed memory architectures

9 December 2002

conference paper
Published by Institute of Electrical and Electronics Engineers (IEEE)

Vol. i, 551-559
https://doi.org/10.1109/hicss.1991.183927

Abstract

Massively parallel systems are distributed memory architectures consisting of a very large number of autonomously operating, interconnected nodes. The nodes cooperate through message passing over a high-speed interconnection network. The critical issue in such systems is communication latency, and a major component of that latency is the message startup time, i.e., the time it takes to execute the operating system kernel function that handles message passing. Minimizing startup time is therefore one of the necessary system optimizations. The author's approach is to provide different versions of the kernel for the different modes of operation of the system. All versions present the same interface to the other modules of the distributed operating system and are therefore interchangeable. This is accomplished by representing the message-passing kernel as an abstract data type.
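
The idea of interchangeable kernel versions behind one interface can be illustrated with a minimal sketch. This is not the paper's implementation: the class names (MessageKernel, CheckedKernel, FastKernel), the mailbox model, and the two "modes" are all assumptions chosen for illustration. The sketch only shows that a caller written against the abstract interface runs unchanged whichever version is plugged in, and that a trusted-mode version can shorten the send path by skipping checks.

```python
from abc import ABC, abstractmethod
from collections import deque
from typing import Deque, Dict, Optional


class MessageKernel(ABC):
    """Hypothetical abstract data type: the only interface other modules see."""

    def __init__(self, mailboxes: Dict[int, Deque[bytes]]) -> None:
        # Mailboxes are shared so that versions can be swapped without losing messages.
        self.mailboxes = mailboxes

    @abstractmethod
    def send(self, dest: int, payload: bytes) -> None:
        """Deliver payload to the mailbox of node dest."""

    def receive(self, node: int) -> Optional[bytes]:
        """Return the oldest pending message for node, or None if there is none."""
        box = self.mailboxes[node]
        return box.popleft() if box else None


class CheckedKernel(MessageKernel):
    """Assumed general-purpose version: validates destination and length."""

    MAX_LEN = 1024  # illustrative limit, not taken from the paper

    def send(self, dest: int, payload: bytes) -> None:
        if dest not in self.mailboxes:
            raise ValueError(f"unknown node {dest}")
        if len(payload) > self.MAX_LEN:
            raise ValueError("payload too large")
        self.mailboxes[dest].append(payload)


class FastKernel(MessageKernel):
    """Assumed trusted-mode version: skips the checks to shorten the send path."""

    def send(self, dest: int, payload: bytes) -> None:
        self.mailboxes[dest].append(payload)


def run_node(kernel: MessageKernel) -> Optional[bytes]:
    # Caller code depends only on the interface, never on the concrete version.
    kernel.send(1, b"ping")
    return kernel.receive(1)


mailboxes: Dict[int, Deque[bytes]] = {0: deque(), 1: deque()}
results = [run_node(CheckedKernel(mailboxes)), run_node(FastKernel(mailboxes))]
```

Because both versions operate on the same shared mailboxes, swapping one for the other between calls does not disturb pending messages. That is the property the abstract attributes to the common interface.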
