Efficient Parallel Graph Exploration on Multi-Core CPU and GPU
Top Cited Papers
- 1 October 2011
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- No. 1089795X,p. 78-88
- https://doi.org/10.1109/pact.2011.14
Abstract
Graphs are a fundamental data representation that has been used extensively in various domains. In graph-based applications, a systematic exploration of the graph such as a breadth-first search (BFS) often serves as a key component in the processing of their massive data sets. In this paper, we present a new method for implementing the parallel BFS algorithm on multi-core CPUs which exploits a fundamental property of randomly shaped real-world graph instances. By utilizing memory bandwidth more efficiently, our method shows improved performance over the current state-of-the-art implementation and increases its advantage as the size of the graph increases. We then propose a hybrid method which, for each level of the BFS algorithm, dynamically chooses the best implementation from: a sequential execution, two different methods of multicore execution, and a GPU execution. Such a hybrid approach provides the best performance for each graph size while avoiding poor worst-case performance on high-diameter graphs. Finally, we study the effects of the underlying architecture on BFS performance by comparing multiple CPU and GPU systems, a high-end GPU system performed as well as a quad-socket high-end CPU system.Keywords
This publication has 23 references indexed in Scilit:
- Accelerating CUDA graph algorithms at maximum warpACM SIGPLAN Notices, 2011
- Better benchmarking for supercomputersIEEE Spectrum, 2010
- Scalable Graph Exploration on Multicore ProcessorsPublished by Institute of Electrical and Electronics Engineers (IEEE) ,2010
- Efficient Breadth-First Search on the Cell/BE ProcessorIEEE Transactions on Parallel and Distributed Systems, 2008
- SNAP, Small-world Network Analysis and Partitioning: An open-source parallel graph framework for the exploration of large-scale networks2008 IEEE International Symposium on Parallel and Distributed Processing, 2008
- Early experience with out-of-core applications on the cray XMT2008 IEEE International Symposium on Parallel and Distributed Processing, 2008
- Accelerating Large Graph Algorithms on the GPU Using CUDAPublished by Springer Nature ,2008
- Designing Multithreaded Algorithms for Breadth-First Search and st-connectivity on the Cray MTA-2Published by Institute of Electrical and Electronics Engineers (IEEE) ,2006
- A Scalable Distributed Parallel Breadth-First Search Algorithm on BlueGene/LPublished by Institute of Electrical and Electronics Engineers (IEEE) ,2005
- Collective dynamics of ‘small-world’ networksNature, 1998