Global combine on mesh architectures with wormhole routing
- 30 December 2002
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- p. 156-162
- https://doi.org/10.1109/ipps.1993.262873
Abstract
Several algorithms are discussed for implementing global combine (summation) on distributed memory computers using a two-dimensional mesh interconnect with wormhole routing. These include algorithms that are asymptotically optimal for short vectors (O(log(p)) for p processing nodes) and for long vectors (O(n) for n data elements per node), as well as hybrid algorithms that are superior for intermediate n. Performance models are developed that include the effects of link conflicts and other characteristics of the underlying communication system. The models are validated using experimental data from the Intel Touchstone DELTA computer. Each of the combine algorithms is shown to be superior under some circumstances.
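The contrast between the short-vector and long-vector regimes can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions: the abstract does not describe the paper's actual algorithms, so the two functions below use well-known textbook schemes (a recursive-doubling butterfly and a ring reduce-scatter followed by an all-gather) as stand-ins. The names `combine_recursive_doubling` and `combine_ring` are hypothetical, and the simulation ignores the mesh topology, wormhole routing, and link conflicts that the paper's performance models account for.

```python
from typing import List

Vector = List[float]


def combine_recursive_doubling(data: List[Vector]) -> List[Vector]:
    """Simulated butterfly global sum (assumed stand-in, not the paper's code).

    Needs log2(p) exchange steps, each moving a full n-element vector,
    so it suits short vectors. p must be a power of two.
    """
    p = len(data)
    assert p > 0 and p & (p - 1) == 0, "p must be a power of two"
    vals = [list(v) for v in data]
    step = 1
    while step < p:
        # Each node exchanges with the node whose rank differs in one bit
        # and adds the partner's partial sums to its own.
        vals = [
            [a + b for a, b in zip(vals[rank], vals[rank ^ step])]
            for rank in range(p)
        ]
        step *= 2
    return vals


def combine_ring(data: List[Vector]) -> List[Vector]:
    """Simulated ring global sum: reduce-scatter, then all-gather.

    Needs 2 * (p - 1) steps, each moving only n / p elements, so the
    data volume per node stays O(n). n must be divisible by p.
    """
    p = len(data)
    n = len(data[0])
    assert n % p == 0, "n must be divisible by p in this sketch"
    chunk = n // p
    vals = [list(v) for v in data]

    def seg(c: int) -> slice:
        return slice(c * chunk, (c + 1) * chunk)

    # Reduce-scatter: at step s, node r sends chunk (r - s) mod p to
    # node r + 1, which adds it into its own copy of that chunk.
    for s in range(p - 1):
        sent = [vals[r][seg((r - s) % p)] for r in range(p)]
        for r in range(p):
            src = (r - 1) % p
            target = seg((src - s) % p)
            vals[r][target] = [a + b for a, b in zip(vals[r][target], sent[src])]

    # Node r now holds the fully summed chunk (r + 1) mod p.
    # All-gather: pass the finished chunks around the ring.
    for s in range(p - 1):
        sent = [vals[r][seg((r + 1 - s) % p)] for r in range(p)]
        for r in range(p):
            src = (r - 1) % p
            vals[r][seg((src + 1 - s) % p)] = sent[src]
    return vals
```

Both functions take one vector per simulated node and return the vector each node holds afterwards; every node should end with the element-wise sum. A hybrid scheme of the kind the abstract mentions would sit between these two extremes, but its form is not given here.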
This publication has 2 references indexed in Scilit:
- The Touchstone 30 Gigaflop DELTA Prototype. Published by Institute of Electrical and Electronics Engineers (IEEE), 2005
- Efficient Global Combine Operations. Published by Institute of Electrical and Electronics Engineers (IEEE), 2005