Assessing the benefits of fine-grain parallelism in dataflow programs

Abstract
A method for assessing the benefits of fine-grain parallelism in actual programs is presented. The method is based on parallelism profiles and speedup curves derived by executing dataflow graphs on an interpreter under progressively more realistic assumptions about processor resources and communication costs. It is shown that programs, even using traditional algorithms, exhibit ample parallelism when parallelism is exposed at all levels. Since only dataflow graphs compiled from the high-level language Id are considered, the bias introduced by the language and the compiler is examined. A method of estimating speedup through analysis of the ideal parallelism profile is developed, avoiding repeated execution of programs. It is shown that the fine-grain parallelism can be used to mask large, unpredictable memory latency and synchronization waits in architectures using dataflow instruction execution mechanisms. The effects of grouping portions of dataflow programs, such as function invocations or loop iterations, and requiring that the operators in a group execute on a single processor, are explored.
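As an illustration of the two central ideas, a parallelism profile and a speedup estimate derived from it, the following is a minimal sketch and not the method used in the paper. It assumes every operator takes one time step, communication is free, and each operator fires as soon as its inputs are available. The function names (`parallelism_profile`, `estimated_speedup`), the graph encoding, and the example graph are all hypothetical. The estimate simply serializes each step of the ideal profile over the available processors, so it never moves deferred operators into idle slots of later steps and may therefore understate the speedup a better schedule could reach.

```python
from math import ceil


def parallelism_profile(deps):
    """Return profile[t] = number of operators firing at step t.

    `deps` maps every operator name to the list of operators whose
    results it consumes. Assumption (not from the paper): unit-time
    operators, unbounded processors, zero communication cost.
    """
    level = {}

    def depth(op):
        # An operator fires one step after its latest-finishing input;
        # operators with no inputs fire at step 0.
        if op not in level:
            level[op] = 1 + max((depth(d) for d in deps[op]), default=-1)
        return level[op]

    for op in deps:
        depth(op)

    profile = [0] * (max(level.values()) + 1)
    for t in level.values():
        profile[t] += 1
    return profile


def estimated_speedup(profile, processors):
    """Estimate speedup on a fixed number of processors from the ideal profile.

    Each ideal step with n ready operators is assumed to take
    ceil(n / processors) steps; this is an illustrative rule only.
    """
    total_ops = sum(profile)  # time on a single processor
    steps = sum(ceil(n / processors) for n in profile)
    return total_ops / steps


# Hypothetical graph for the expression (a + b) * (c + d) + (e * f).
deps = {
    "add1": [],                # a + b
    "add2": [],                # c + d
    "mul1": [],                # e * f
    "mul2": ["add1", "add2"],  # (a + b) * (c + d)
    "add3": ["mul2", "mul1"],  # final sum
}

profile = parallelism_profile(deps)  # [3, 1, 1]
speedups = {p: estimated_speedup(profile, p) for p in (1, 2, 3)}
```

Evaluating `estimated_speedup` over a range of processor counts yields a speedup curve from a single ideal profile, without re-executing the program for each configuration.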
