skip to main content
poster

To 4,000 compute nodes and beyond: network-aware vertex placement in large-scale graph processing systems

Published: 27 August 2013 Publication History

Abstract

The explosive growth of "big data" is giving rise to a new breed of large scale graph systems, such as Pregel. This poster describes our ongoing work in characterizing and minimizing the communication cost of Bulk Synchronous Parallel (BSP) graph mining systems, like Pregel, when scaling to 4,096 compute nodes. Existing implementations generally assume a fixed communication cost. This is sufficient in small deployments as the BSP programming model (i.e., overlapping computation and communication) masks small variations in the underlying network. In large scale deployments, such variations can dominate the overall runtime characteristics. In this poster, we first quantify the impact of network communication on the total compute time of a Pregel system. We then propose an efficient vertex placement strategy that subsamples highly connected vertices and applies the Reverse Cuthill-McKee (RCM) algorithm to efficiently partition the input graph and place partitions closer to each other based on their expected communication patterns. We finally describe a vertex replication strategy to further reduce communication overhead.

References

[1]
Gonzalez, J. E., Low, Y., Gu, H., Bickson, D., and Guestrin, C. PowerGraph: Distributed Graph-Parallel Computation on Natural Graphs. In Proceedings of USENIX OSDI (Hollywood, Oct 2012).
[2]
Hoefler, T., and Snir, M. Generic Topology Mapping Strategies for Large-scale Parallel Architectures. In ACM ICS (Tucson, AZ, June 2011).
[3]
Khayyat, Z., Awara, K., Alonazi, A., Jamjoom, H., Williams, D., and Kalnis, P. Mizan: A System for Dynamic Load Balancing in Large-scale Graph Processing. In ACM EuroSys (Prague, April 2013).
[4]
The Laboratory for Web Algorithmics. http://law.di.unimi.it/index.php, 2012.
[5]
Malewicz, G., Austern, M. H., Bik, A. J., Dehnert, J. C., Horn, I., Leiser, N., and Czajkowski, G. Pregel: A System for Large-scale Graph Processing. In Proceedings of ACM SIGMOD (Indianapolis, IN, 2010).

Cited By

View all
  • (2021)3-D Partitioning for Large-Scale Graph ProcessingIEEE Transactions on Computers10.1109/TC.2020.298673670:1(111-127)Online publication date: 1-Jan-2021
  • (2016)Exploring the hidden dimension in graph processingProceedings of the 12th USENIX conference on Operating Systems Design and Implementation10.5555/3026877.3026900(285-300)Online publication date: 2-Nov-2016

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM SIGCOMM Computer Communication Review
ACM SIGCOMM Computer Communication Review  Volume 43, Issue 4
October 2013
595 pages
ISSN:0146-4833
DOI:10.1145/2534169
Issue’s Table of Contents
  • cover image ACM Conferences
    SIGCOMM '13: Proceedings of the ACM SIGCOMM 2013 conference on SIGCOMM
    August 2013
    580 pages
    ISBN:9781450320566
    DOI:10.1145/2486001
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 27 August 2013
Published in SIGCOMM-CCR Volume 43, Issue 4

Check for updates

Author Tags

  1. bulk synchronous parallel
  2. extreme scaling
  3. graph mining systems
  4. network topology
  5. vertex placement

Qualifiers

  • Poster

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)48
  • Downloads (Last 6 weeks)13
Reflects downloads up to 20 Feb 2025

Other Metrics

Citations

Cited By

View all
  • (2021)3-D Partitioning for Large-Scale Graph ProcessingIEEE Transactions on Computers10.1109/TC.2020.298673670:1(111-127)Online publication date: 1-Jan-2021
  • (2016)Exploring the hidden dimension in graph processingProceedings of the 12th USENIX conference on Operating Systems Design and Implementation10.5555/3026877.3026900(285-300)Online publication date: 2-Nov-2016

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media