research-article

An efficient transactional memory algorithm for computing minimum spanning forest of sparse graphs

Authors:

David A. BaderAuthors Info & Claims

ACM SIGPLAN Notices, Volume 44, Issue 4

Pages 15 - 24

https://doi.org/10.1145/1594835.1504182

Published: 14 February 2009 Publication History

Abstract

Due to power wall, memory wall, and ILP wall, we are facing the end of ever increasing single-threaded performance. For this reason, multicore and manycore processors are arising as a new paradigm to pursue. However, to fully exploit all the cores in a chip, parallel programming is often required, and the complexity of parallel programming raises a significant concern. Data synchronization is a major source of this programming complexity, and Transactional Memory is proposed to reduce the difficulty caused by data synchronization requirements, while providing high scalability and low performance overhead.

The previous literature on Transactional Memory mostly focuses on architectural designs. Its impact on algorithms and applications has not yet been studied thoroughly. In this paper, we investigate Transactional Memory from the algorithm designer's perspective. This paper presents an algorithmic model to assist in the design of efficient Transactional Memory algorithms and a novel Transactional Memory algorithm for computing a minimum spanning forest of sparse graphs. We emphasize multiple Transactional Memory related design issues in presenting our algorithm. We also provide experimental results on an existing software Transactional Memory system. Our algorithm demonstrates excellent scalability in the experiments, but at the same time, the experimental results reveal the clear limitation of software Transactional Memory due to its high performance overhead. Based on our experience, we highlight the necessity of efficient hardware support for Transactional Memory to realize the potential of the technology.

References

[1]

C. S. Ananian, K. Asanovic, B. C. Kuszmaul, C. E. Leiserson, and S. Lie. Unbounded transactional memory. In Proc. 11th Int'l Conf. on High-Performance Computer Architecture (HPCA), San Francisco, CA, Feb. 2005.

Digital Library

[2]

K. Asanovic, R. Bodik, B. C. Catanzaro, J. J. Gebis, P. Husbands, K. Keutzer, D. A. Patterson, W. L. Plishker, J. Shalf, S. W. Williams, and K. A. Yelick. The landscape of parallel computing research: A view from berkeley. Technical Report UCB/EECS-2006-183, Electrical Engineering and Computer Sciences, University of California at Berkeley, Dec. 2006.

[3]

D. A. Bader and G. Cong. Fast shared-memory algorithms for computing the minimum spanning forest of sparse graphs. Journal of Parallel and Distributed Computing, 66(11), 2006.

Digital Library

[4]

D. A. Bader and K. Madduri. SNAP, small-world network analysis and partitioning: an open-source parallel graph framework for the exploration of large-scale networks. In Proc. 22nd Int'l Parallel and Distributed Processing Symp. (IPDPS), Miami, FL, Apr. 2008.

[5]

C. Blundell, J. Devietti, E. C. Lewis, and M. M. K. Martin. Making the fast case common and the uncommon case simple in unbounded transactional memory. In Proc. 34th Ann. Int'l Symp. on Computer Architecture (ISCA), San Diego, CA, Jun. 2007.

Digital Library

[6]

C. Blundell, E. C. Lewis, and M. M. K. Martin. Deconstructing transactional semantics: The subtleties of atomicity. In Proc. 4th Ann. Workshop on Duplicating, Deconstructing, and Debunking (WDDD), Madison, WI, Jun. 2005.

[7]

J. Chung, C. C. Minh, B. D. Carlstrom, and C. Kozyrakis. Parallelizing SPECjbb2000 with transactional memory. In In Proc. Workshop on Transactional Memory Workloads (WTW), Ottawa, Canada, Jun. 2006.

[8]

P. Damron, A. Fedorova, Y. Lev, V. Luchangco, M. Moir, and D. Nussbaum. Hybrid transactional memory. In Proc. 12th Int'l Conf. on Architecture Support for Programming Languages and Operating Systems (ASPLOS), San Jose, CA, Oct. 2006.

Digital Library

[9]

D. Dice, M. Herlihy, D. Lea, Y. Lev, V. Luchangco, W. Mesard, M. Moir, K. Moore, and D. Nussbaum. Applications of the adaptive transactional memory test platform. In Proc. 3rd ACM SIGPLAN Workshop on Transactional Computing (TRANSACT), Salt Lake City, UT, Feb. 2008.

[10]

D. Dice, O. Shalev, and N. Shavit. Transactional locking II. In Proc. 20th Int'l Symp. on Distributed Computing (DISC), Stockholm, Sweden, Sep 2006.

Digital Library

[11]

R. Guerraoui, M. Kapalka, and J. Vitek. STMBench7: a benchmark for software transactional memory. In Proc. 2nd ACM SIGOPS/EuroSys European Conf. on Computer Systems (EuroSys), Lisbon, Portugal, Mar. 2007.

Digital Library

[12]

L. Hammond, V. Wong, M. Chen, B. D. Carlstrom, J. D. Davis, B. Hertzberg, M. K. Prabhu, H. Wijaya, C. Kozyrakis, and K. Olukotun. Transactional memory coherence and consistency. In Proc. 31st Ann. Int'l Symp. on Computer Architecture (ISCA), Munich, Germany, Jun. 2004.

Digital Library

[13]

T. Harris, S. Marlow, S. P. Jones, and M. Herlihy. Composable memory transactions. In Proc. 10th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP), Chicago, IL, Jun. 2005.

Digital Library

[14]

M. Herlihy, J. Eliot, and B. Moss. Transactional memory: Architectural support for lock-free data structures. In Proc. 20th Ann. Int'l Symp. on Computer Architecture (ISCA), New York, NY, May 1993.

Digital Library

[15]

M. Herlihy, V. Luchangco, M. Moir, and W. N. Scherer III. Software transactional memory for dynamic-sized data structures. In Proc. 22nd Ann. Symp. on Principle of Distributed Computing (PODC), Boston, MA, Jul. 2003.

Digital Library

[16]

M. Kulkarni, K. Pingali, G. Ramanarayanan, B. Walter, K. Bala, and L. P. Chew. Optimistic parallelism benefits from data partitioning. In Proc. 13th Int'l Conf. on Architecture Support for Programming Languages and Operating Systems (ASPLOS), Seattle, WA, Mar. 2008.

Digital Library

[17]

V. J. Marathe, W. N. Scherer III, and M. L. Scott. Adaptive software transactional memory. In Proc. 19th Int'l Symp. on Distributed Computing (DISC), Cracow, Poland, Mar. 2005.

Digital Library

[18]

K. Mehlhorn and S. Näher. The LEDA platform of combinatorial and geometric computing. Communications of the ACM, 38(1):96--102, 1995.

Digital Library

[19]

C. C. Minh, J. Chung, C. Kozyrakis, and K. Olukotun. STAMP: Stanford transactional applications for multi-processing. In Proc. 11th IEEE Int'l Symp. on Workload Characterization (IISWC), Seattle, WA, Sep. 2008.

[20]

C. C. Minh, M. Trautmann, J. Chung, A. McDonald, N. Bronson, J. Casper, C. Kozyrakis, and K. Olukotun. An effective hybrid transactional memory system with strong isolation guarantees. In Proc. 34th Ann. Int'l Symp. on Computer Architecture (ISCA), San Diego, CA, Jun. 2007.

Digital Library

[21]

K. E. Moore, J. Bobba, M. J. Moravan, M. D. Hill, and D. A. Wood. Log™: Log-based transactional memory. In Proc. 12th Int'l Conf. on High-Performance Computer Architecture (HPCA), Austin, TX, Feb. 2006.

[22]

B. M. E. Moret and H. D. Shapiro. An empirical assessment of algorithms for constructing a minimal spanning tree. In DIMACS Monographs in Discrete Mathematics and Theoretical Computer Science: Computational Support for Discrete Mathematics 15, pages 99--117. American Mathematical Society, 1994.

[23]

C. Perfumo, N. Sonmez, S. Stipic, O. Unsal, A. Cristal, T. Harris, and M. Valero. The limits of software transactional memory (STM):dissecting Haskell STM applications on a many-core environment. In Proc. 5th ACM Int'l Conf. on Computing Frontiers (CF), Ischia, Italy, May 2008.

Digital Library

[24]

R. Rajwar, M. Herlihy, and K. Lai. Virtualizing transactional memory. In Proc. 32nd Ann. Int'l Symp. on Computer Architecture (ISCA), Madison, WI, Jun. 2005.

Digital Library

[25]

M. L. Scott, M. F. Spear, L. Dalessandro, and V. J. Marathe. Delaunay triangulation with transactions and barriers. In Proc. 10th IEEE Int'l Symp. on Workload Characterization (IISWC), Boston, MA, Sep. 2007.

Digital Library

[26]

A. Shriraman, M. F. Spear, H. H., V. J. Marathe, S. Dwarkadas, and M. L. Scott. An integrated hardware-software approach to flexible transactional memory. In Proc. 34th Ann. Int'l Symp. on Computer Architecture (ISCA), San Diego, CA, Jun. 2007.

Digital Library

[27]

M. Tremblay and S. Chaudhry. A third-generation 65nm 16-core 32-thread plus 32-scout-thread CMT SPARC processor. In Proc. Int'l Solid State Circuits Conf. (ISSCC), San Francisco, CA, Feb. 2008.

[28]

T. Vijaykumar, S. Gopal, J. E. Smith, and G. Sohi. Speculative versioning cache. In Proc. 5th Int'l Conf. on High-Performance Computer Architecture (HPCA), Las Vegas, NV, Jan. 1998.

Digital Library

[29]

C. von Praun, R. Bordawekar, and C. Cascaval. Modeling optimistic concurrency using quantitative dependence analysis. In Proc. 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP), Salt Lake City, UT, Feb. 2008.

Digital Library

[30]

I. Watson, C. Kirkham, and M. Lujan. A study of a transactional parallel routing algorithm. In Proc. 16th Int'l Conf. on Parallel Architectures and Compilation Techniques (PACT), Brasov, Romania, Sep. 2007.

Digital Library

[31]

Y. Zhang, L. Rauchwerger, and J. Torrellas. Hardware for speculative run-time parallelization in distributed shared-memory multiprocessors. In Proc. 5th Int'l Conf. on High-Performance Computer Architecture (HPCA), Las Vegas, NV, Jan. 1998.

Digital Library

Cited By

Yesil SHeidarshenas AMorrison ATorrellas JTaufer MBalaji PPeña A(2019)Understanding priority-based scheduling of graph algorithms on a shared-memory platformProceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis10.1145/3295500.3356160(1-14)Online publication date: 17-Nov-2019
https://dl.acm.org/doi/10.1145/3295500.3356160
Giannoula CGoumas GKoziris N(2018)Combining HTM with RCU to Speed Up Graph Coloring on Multicore PlatformsHigh Performance Computing10.1007/978-3-319-92040-5_18(350-369)Online publication date: 29-May-2018
https://doi.org/10.1007/978-3-319-92040-5_18
Qayum MBadawy ACook JJacob B(2017)DyAdHyTMProceedings of the International Symposium on Memory Systems10.1145/3132402.3132442(327-336)Online publication date: 2-Oct-2017
https://dl.acm.org/doi/10.1145/3132402.3132442
Show More Cited By

Index Terms

An efficient transactional memory algorithm for computing minimum spanning forest of sparse graphs
1. Mathematics of computing
  1. Discrete mathematics
    1. Graph theory
      1. Graph algorithms

Recommendations

An efficient transactional memory algorithm for computing minimum spanning forest of sparse graphs
PPoPP '09: Proceedings of the 14th ACM SIGPLAN symposium on Principles and practice of parallel programming

Due to power wall, memory wall, and ILP wall, we are facing the end of ever increasing single-threaded performance. For this reason, multicore and manycore processors are arising as a new paradigm to pursue. However, to fully exploit all the cores in a ...
Transactional memory: from semantics to silicon
IWMSE '08: Proceedings of the 1st international workshop on Multicore software engineering

Multi-core architectures bring parallel programming into the mainstream. Parallel programming poses many new challenges to the developer, one of which is synchronizing concurrent access to shared memory by multiple threads. Programmers have ...
Unbounded page-based transactional memory
Proceedings of the 2006 ASPLOS Conference

Exploiting thread level parallelism is paramount in the multicore era. Transactions enable programmers to expose such parallelism by greatly simplifying the multi-threaded programming model. Virtualized transactions (unbounded in space and time) are ...

Comments

Information & Contributors

Information

Published In

cover image ACM SIGPLAN Notices

ACM SIGPLAN Notices Volume 44, Issue 4

PPoPP '09

April 2009

294 pages

ISSN:0362-1340

EISSN:1558-1160

DOI:10.1145/1594835

Issue’s Table of Contents

PPoPP '09: Proceedings of the 14th ACM SIGPLAN symposium on Principles and practice of parallel programming
February 2009
322 pages
ISBN:9781605583976
DOI:10.1145/1504176
General Chair:
Daniel Reed
Microsoft Research, USA
,
Program Chair:
Vivek Sarkar
Rice University, USA

Copyright © 2009 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 14 February 2009

Published in SIGPLAN Volume 44, Issue 4

Check for updates

Author Tags

Qualifiers

Research-article

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

27
Total Citations
View Citations
655
Total Downloads

Downloads (Last 12 months)7
Downloads (Last 6 weeks)0

Reflects downloads up to 08 Mar 2025

Other Metrics

View Author Metrics

Citations

Cited By

Yesil SHeidarshenas AMorrison ATorrellas JTaufer MBalaji PPeña A(2019)Understanding priority-based scheduling of graph algorithms on a shared-memory platformProceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis10.1145/3295500.3356160(1-14)Online publication date: 17-Nov-2019
https://dl.acm.org/doi/10.1145/3295500.3356160
Giannoula CGoumas GKoziris N(2018)Combining HTM with RCU to Speed Up Graph Coloring on Multicore PlatformsHigh Performance Computing10.1007/978-3-319-92040-5_18(350-369)Online publication date: 29-May-2018
https://doi.org/10.1007/978-3-319-92040-5_18
Qayum MBadawy ACook JJacob B(2017)DyAdHyTMProceedings of the International Symposium on Memory Systems10.1145/3132402.3132442(327-336)Online publication date: 2-Oct-2017
https://dl.acm.org/doi/10.1145/3132402.3132442
Qayum MBadawy ACook J(2017)DAdHTM: Low overhead dynamically adaptive hardware transactional memory for large graphs a scalability study2017 IEEE SmartWorld, Ubiquitous Intelligence & Computing, Advanced & Trusted Computed, Scalable Computing & Communications, Cloud & Big Data Computing, Internet of People and Smart City Innovation (SmartWorld/SCALCOM/UIC/ATC/CBDCom/IOP/SCI)10.1109/UIC-ATC.2017.8397653(1-8)Online publication date: Aug-2017
https://doi.org/10.1109/UIC-ATC.2017.8397653
Manoochehri SGoodarzi BGoswami D(2017)An Efficient Transaction-Based GPU Implementation of Minimum Spanning Forest Algorithm2017 International Conference on High Performance Computing & Simulation (HPCS)10.1109/HPCS.2017.100(643-650)Online publication date: Jul-2017
https://doi.org/10.1109/HPCS.2017.100
Qayum MBadawy ACook J(2017)StAdHyTM: A Statically Adaptive Hybrid Transactional Memory: A scalability study on large parallel graphs2017 IEEE 7th Annual Computing and Communication Workshop and Conference (CCWC)10.1109/CCWC.2017.7868398(1-7)Online publication date: Jan-2017
https://doi.org/10.1109/CCWC.2017.7868398
Bihari BYang UWong Mde Supinski B(2016)Transactional Memory for Algebraic Multigrid SmoothersOpenMP: Memory, Devices, and Tasks10.1007/978-3-319-45550-1_23(320-335)Online publication date: 21-Sep-2016
https://doi.org/10.1007/978-3-319-45550-1_23
Besta MHoefler TKielmann THildebrand DTaufer M(2015)Accelerating Irregular Computations with Hardware Transactional Memory and Active MessagesProceedings of the 24th International Symposium on High-Performance Parallel and Distributed Computing10.1145/2749246.2749263(161-172)Online publication date: 15-Jun-2015
https://dl.acm.org/doi/10.1145/2749246.2749263
Kobayashi TSato SIwasaki H(2015)Efficient Use of Hardware Transactional Memory for Parallel Mesh GenerationProceedings of the 2015 44th International Conference on Parallel Processing (ICPP)10.1109/ICPP.2015.69(600-609)Online publication date: 1-Sep-2015
https://dl.acm.org/doi/10.1109/ICPP.2015.69
Lv HTan GChen MSun N(2013)Understanding parallelism in graph traversal on multi-core clustersComputer Science - Research and Development10.1007/s00450-012-0207-328:2-3(193-201)Online publication date: 1-May-2013
https://dl.acm.org/doi/10.1007/s00450-012-0207-3
Show More Cited By

View Options

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Figures

Tables

Media

View Issue’s Table of Contents