Assessing the benefits of fine-grain parallelism in dataflow programs
- 6 January 2003
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
Abstract
A method for assessing the benefits of fine-grain parallelism in actual programs is presented. The method is based on parallelism profiles and speedup curves derived by executing dataflow graphs on an interpreter under progressively more realistic assumptions about processor resources and communication costs. It is shown that programs, even using traditional algorithms, exhibit ample parallelism when parallelism is exposed at all levels. Since only dataflow graphs compiled from the high-level language Id are considered, the bias introduced by the language and the compiler is examined. A method of estimating speedup through analysis of the ideal parallelism profile is developed, avoiding repeated execution of programs. It is shown that the fine-grain parallelism can be used to mask large, unpredictable memory latency and synchronization waits in architectures using dataflow instruction execution mechanisms. The effects of grouping portions of dataflow programs, such as function invocations or loop iterations, and requiring that the operators in a group execute on a single processor, are explored.
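The abstract does not spell out how the parallelism profile or the speedup estimate is computed, so the following is only an illustrative sketch and not the paper's method. It assumes unit-time operators, zero communication cost, and a dataflow graph given as a plain edge list; the function names `parallelism_profile` and `estimate_speedup` are hypothetical. The ideal profile counts how many operators can fire at each step with unbounded processors, and the estimate for p processors simply charges ceil(n/p) steps to every profile entry of width n.

```python
from collections import defaultdict, deque
from math import ceil


def parallelism_profile(nodes, edges):
    """Ideal profile: entry t is the number of operators firing at step t.

    Assumes unit-time operators, unbounded processors, and zero latency
    (illustrative assumptions, not taken from the paper).
    """
    if not nodes:
        return []
    succs = defaultdict(list)
    indeg = {n: 0 for n in nodes}
    for src, dst in edges:
        succs[src].append(dst)
        indeg[dst] += 1

    # Operators with no inputs pending fire at step 0.
    level = {n: 0 for n in nodes if indeg[n] == 0}
    ready = deque(level)
    processed = 0
    while ready:
        n = ready.popleft()
        processed += 1
        for m in succs[n]:
            # An operator fires one step after its latest-arriving input.
            level[m] = max(level.get(m, 0), level[n] + 1)
            indeg[m] -= 1
            if indeg[m] == 0:
                ready.append(m)
    if processed != len(nodes):
        raise ValueError("graph contains a cycle")

    profile = [0] * (max(level.values()) + 1)
    for t in level.values():
        profile[t] += 1
    return profile


def estimate_speedup(profile, p):
    """Estimate speedup on p processors from the ideal profile alone.

    A step with n ready operators is assumed to take ceil(n / p) steps,
    so the program is not re-executed for each value of p.
    """
    t1 = sum(profile)                       # total work on one processor
    tp = sum(ceil(n / p) for n in profile)  # stretched profile on p processors
    return t1 / tp


# Hypothetical five-operator graph: (a, b) -> c, (c, d) -> e.
nodes = ["a", "b", "c", "d", "e"]
edges = [("a", "c"), ("b", "c"), ("c", "e"), ("d", "e")]
profile = parallelism_profile(nodes, edges)
print(profile)                       # [3, 1, 1]
print(estimate_speedup(profile, 2))  # 1.25
```

This estimate is pessimistic in one respect: it never lets operators left over from a wide step overlap with a later, narrower step, so a real scheduler could do somewhat better.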
This publication has 9 references indexed in Scilit:
- Resource requirements of dataflow programs. Published by Institute of Electrical and Electronics Engineers (IEEE), 2003
- Toward a dataflow/von Neumann hybrid architecture. Published by Institute of Electrical and Electronics Engineers (IEEE), 2003
- I-Structures: Data structures for parallel computing. Published by Springer Nature, 1987
- Partitioning parallel programs for macro-dataflow. Published by Association for Computing Machinery (ACM), 1986
- Very Long Instruction Word architectures and the ELI-512. Published by Association for Computing Machinery (ACM), 1983
- Performance measurements on HEP - a pipelined MIMD computer. Published by Association for Computing Machinery (ACM), 1983
- An Approach to Scientific Array Processing: The Architectural Design of the AP-120B/FPS-164 Family. Computer, 1981
- The CRAY-1 computer system. Communications of the ACM, 1978
- The parallel execution of DO loops. Communications of the ACM, 1974