research-article

An effective GPU implementation of breadth-first search

Authors:

Lijuan Luo,

Martin Wong,

Wen-mei HwuAuthors Info & Claims

DAC '10: Proceedings of the 47th Design Automation Conference

Pages 52 - 55

https://doi.org/10.1145/1837274.1837289

Published: 13 June 2010 Publication History

Get Access

Abstract

Breadth-first search (BFS) has wide applications in electronic design automation (EDA) as well as in other fields. Researchers have tried to accelerate BFS on the GPU, but the two published works are both asymptotically slower than the fastest CPU implementation. In this paper, we present a new GPU implementation of BFS that uses a hierarchical queue management technique and a three-layer kernel arrangement strategy. It guarantees the same computational complexity as the fastest sequential version and can achieve up to 10 times speedup.

References

[1]

C. Rodrigues, D. Hardy, J. Stone, K. Schulten, and W. Hwu, "GPU acceleration of cutoff pair potentials for molecular modeling applications," in ACM International Conference on Computing Frontiers, 2008, pp. 273--282.

Digital Library

Google Scholar

[2]

P. Harish and P. J. Narayanan, "Accelerating large graph algorithms on the GPU using CUDA," in IEEE High Performance Computing, 2007, pp 197--208.

Digital Library

Google Scholar

[3]

Y. Deng, B. Wang, and S. Mu, "Taming irregular EDA applications on GPUs," in ICCAD, 2009, pp. 539--546.

Digital Library

Google Scholar

[4]

C. Lauterbach, M. Garland, S. Sengupta, D. Luebke, and D. Manocha, "Fast BVH construction on GPUs," Computer Graphics Forum, vol. 28, no. 2., pp. 375--384, 2009.

Crossref

Google Scholar

[5]

A. Yoo, E. Chow, K. Henderson, W. Mcledon, B. Hendrickson, and Ü Catalyürek, "A scalable distributed parallel breadth-first search algorithm on Bluegene/L," in Proceedings of the ACM/IEEE Conference on Supercomputing, 2005, pp. 25--32.

Digital Library

Google Scholar

[6]

D. A. Bader and K. Madduri, "Designing Multithreaded Algorithms for Breadth-First Search and st-connectivity on the Cray MTA-2," in ICPP, 2006, pp. 523--530.

Digital Library

Google Scholar

[7]

D. P. Scarpazza, O. Villa, and F. Petrini, "Efficient breadth-first search on the Cell/BE processor," IEEE Trans. on Parallel Distributed Systems, vol. 19, no. 10, pp. 1381--1395, 2008.

Digital Library

Google Scholar

[8]

T. Cormen, C. Leiserson, R. Rivest, and C. Stein, Introduction to Algorithms, MIT press, 2001.

Digital Library

Google Scholar

[9]

http://www.dis.uniromal1.it/~challenge9/

Google Scholar

[10]

S. Xiao and W. Feng, "Inter-block GPU communication via fast barrier synchronization," Technical Report TR-09-19, Dept. of Computer Science, Virginia Tech.

Google Scholar

[11]

U. Meyer, "External memory BFS on undirected graphs with bounded degree," in SODA, 2001, pp. 87--88.

Digital Library

Google Scholar

Cited By

View all

Guo WWang QLiu XPeng MYang SLiang YShi YCao LLiu J(2025)PEbfs: Implement High-Performance Breadth-First Search on PEZY-SC3sAlgorithms and Architectures for Parallel Processing10.1007/978-981-96-1551-3_17(249-267)Online publication date: 17-Feb-2025
https://doi.org/10.1007/978-981-96-1551-3_17
Akbarzadeh NDarabi SGheibi-Fetrat AMirzaei ASadrosadati MSarbazi-Azad H(2024)H3DM: A High-bandwidth High-capacity Hybrid 3D Memory Design for GPUsProceedings of the ACM on Measurement and Analysis of Computing Systems10.1145/36390388:1(1-28)Online publication date: 21-Feb-2024
https://dl.acm.org/doi/10.1145/3639038
Yang HLu HSattar NLiu HWang F(2024)Establish the basis for Breadth-First Search on Frontier System: XBFS on AMD GPUsProceedings of the SC '24 Workshops of the International Conference on High Performance Computing, Network, Storage, and Analysis10.1109/SCW63240.2024.00090(650-658)Online publication date: 17-Nov-2024
https://dl.acm.org/doi/10.1109/SCW63240.2024.00090
Show More Cited By

Index Terms

An effective GPU implementation of breadth-first search

Recommendations

Efficient breadth first search on multi-GPU systems

Simple algorithms for the execution of a Breadth First Search on large graphs lead, running on clusters of GPUs, to a situation of load unbalance among threads and un-coalesced memory accesses, resulting in pretty low performances. To obtain a ...
Enhanced GPU-based distributed breadth first search
CF '15: Proceedings of the 12th ACM International Conference on Computing Frontiers

There is growing interest in studying large scale graphs having millions of vertices and billions of edges, up to the point that a specific benchmark, called Graph500, has been defined to measure the performance of graph algorithms on modern computing ...
Breadth first search vectorization on the Intel Xeon Phi
CF '16: Proceedings of the ACM International Conference on Computing Frontiers

Breadth First Search (BFS) is a building block for graph algorithms and has recently been used for large scale analysis of information in a variety of applications including social networks, graph databases and web searching. Due to its importance, a ...

Comments

Information & Contributors

Information

Published In

DAC '10: Proceedings of the 47th Design Automation Conference

June 2010

1036 pages

ISBN:9781450300025

DOI:10.1145/1837274

General Chair:
Sachin Sapatnekar
Univ. of Minnesota, Minneapolis, MN

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 13 June 2010

Permissions

Request permissions for this article.

Request Permissions

Check for updates

Author Tags

Qualifiers

Research-article

Funding Sources

Conference

DAC '10

Sponsor:

EDAC
SIGDA

DAC '10: The 47th Annual Design Automation Conference 2010

June 13 - 18, 2010

California, Anaheim

Acceptance Rates

Overall Acceptance Rate 1,770 of 5,499 submissions, 32%

Upcoming Conference

DAC '25

Sponsor:
sigda

62nd ACM/IEEE Design Automation Conference

June 22 - 26, 2025

San Francisco , CA , USA

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

178
Total Citations
View Citations
2,094
Total Downloads

Downloads (Last 12 months)86
Downloads (Last 6 weeks)15

Reflects downloads up to 19 Feb 2025

Other Metrics

View Author Metrics

Citations

Cited By

View all

Guo WWang QLiu XPeng MYang SLiang YShi YCao LLiu J(2025)PEbfs: Implement High-Performance Breadth-First Search on PEZY-SC3sAlgorithms and Architectures for Parallel Processing10.1007/978-981-96-1551-3_17(249-267)Online publication date: 17-Feb-2025
https://doi.org/10.1007/978-981-96-1551-3_17
Akbarzadeh NDarabi SGheibi-Fetrat AMirzaei ASadrosadati MSarbazi-Azad H(2024)H3DM: A High-bandwidth High-capacity Hybrid 3D Memory Design for GPUsProceedings of the ACM on Measurement and Analysis of Computing Systems10.1145/36390388:1(1-28)Online publication date: 21-Feb-2024
https://dl.acm.org/doi/10.1145/3639038
Yang HLu HSattar NLiu HWang F(2024)Establish the basis for Breadth-First Search on Frontier System: XBFS on AMD GPUsProceedings of the SC '24 Workshops of the International Conference on High Performance Computing, Network, Storage, and Analysis10.1109/SCW63240.2024.00090(650-658)Online publication date: 17-Nov-2024
https://dl.acm.org/doi/10.1109/SCW63240.2024.00090
Zhang YLuo GWang WCao SDong SYu DWang XWang K(2024)TTN: Topological Transformer Network for Automated Coronary Artery Branch Labeling in Cardiac CT AngiographyIEEE Journal of Translational Engineering in Health and Medicine10.1109/JTEHM.2023.332903112(129-139)Online publication date: 2024
https://doi.org/10.1109/JTEHM.2023.3329031
Moori MRocha HLorenzon ABeck A(2024)Allok: a machine learning approach for efficient graph execution on CPU–GPU clustersThe Journal of Supercomputing10.1007/s11227-024-06079-980:13(18838-18865)Online publication date: 23-May-2024
https://doi.org/10.1007/s11227-024-06079-9
Ziche FBombieri NBusato FGiugno R(2024)GPU-Accelerated BFS for Dynamic NetworksEuro-Par 2024: Parallel Processing10.1007/978-3-031-69583-4_6(74-87)Online publication date: 26-Aug-2024
https://doi.org/10.1007/978-3-031-69583-4_6
Đukić JMišić M(2023)An Evaluation of Directive-Based Parallelization on the GPU Using a Parboil BenchmarkElectronics10.3390/electronics1222455512:22(4555)Online publication date: 7-Nov-2023
https://doi.org/10.3390/electronics12224555
Hsieh CCheng PChang CKuo S(2023)A Decentralized Frontier Queue for Improving Scalability of Breadth-First-Search on GPUs2023 Design, Automation & Test in Europe Conference & Exhibition (DATE)10.23919/DATE56975.2023.10137054(1-6)Online publication date: Apr-2023
https://doi.org/10.23919/DATE56975.2023.10137054
Hijma PHeldens SSclocco Avan Werkhoven BBal H(2023)Optimization Techniques for GPU ProgrammingACM Computing Surveys10.1145/357063855:11(1-81)Online publication date: 16-Mar-2023
https://dl.acm.org/doi/10.1145/3570638
Liang PTang YZhang XBai YSu TLai ZQiao LLi D(2023)A Survey on Auto-Parallelism of Large-Scale Deep Learning TrainingIEEE Transactions on Parallel and Distributed Systems10.1109/TPDS.2023.328193134:8(2377-2390)Online publication date: Aug-2023
https://doi.org/10.1109/TPDS.2023.3281931
Show More Cited By

View Options

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Abstract

References

Cited By

Index Terms

Recommendations

Efficient breadth first search on multi-GPU systems

Enhanced GPU-based distributed breadth first search

Breadth first search vectorization on the Intel Xeon Phi

Comments

Information

Published In

Sponsors

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Funding Sources

Conference

Acceptance Rates

Upcoming Conference

Contributors

Other Metrics

Bibliometrics

Article Metrics

Other Metrics

Citations

Cited By

Login options

Full Access

View options

PDF

eReader

Share

Share this Publication link

Share on social media

Affiliations