research-article

Scalable concurrency debugging with distributed graph processing

Authors:
Long Zheng

Huazhong University of Science and Technology, China

Huazhong University of Science and Technology, China
View Profile

,
Xiaofei Liao

Huazhong University of Science and Technology, China

Huazhong University of Science and Technology, China
View Profile

,
Hai Jin

Huazhong University of Science and Technology, China

Huazhong University of Science and Technology, China
View Profile

,
Jieshan Zhao

Huazhong University of Science and Technology, China

Huazhong University of Science and Technology, China
View Profile

,
Qinggang Wang

Huazhong University of Science and Technology, China

Huazhong University of Science and Technology, China
View Profile

CGO 2018: Proceedings of the 2018 International Symposium on Code Generation and OptimizationFebruary 2018Pages 188–199https://doi.org/10.1145/3168817

Published:24 February 2018Publication History

CGO 2018: Proceedings of the 2018 International Symposium on Code Generation and Optimization

Pages 188–199

ABSTRACT

Existing constraint-solving-based technique enables an efficient and high-coverage concurrency debugging. Yet, there remains a significant gap between the state of the art and the state of the programming practices for scaling to handle long-running execution of programs.

In this paper, we revisit the scalability problem of state-of-the-art constraint-solving-based technique. Our key insight is that concurrency debugging for many real-world bugs can be turned into a graph traversal problem. We therefore present GraphDebugger, a novel debugging framework to enable the scalable concurrency analysis on program graphs via a tailored graph-parallel analysis in a distributed environment. It is verified that GraphDebugger is more capable than CLAP in reproducing the real-world bugs that involve a complex concurrency analysis. Our extensive evaluation on 7 real-world programs shows that, GraphDebugger (deployed on an 8-node EC2 like cluster) is significantly efficient to reproduce concurrency bugs within 1∼8 minutes while CLAP does so with 1∼30 hours, or even without returning solutions.

References

J. Ahn, S. Hong, S. Yoo, O. Mutlu, and K. Choi. 2015. A scalable processing-in-memory accelerator for parallel graph processing. In Proceedings of the 42nd ACM/IEEE International Symposium on Computer Architecture (ISCA '15). 105-117. Google ScholarDigital Library
Karim Ali, Marianna Rapoport, Ondrej Lhoták, Julian Dolby, and Frank Tip. 2015. Type-Based Call Graph Construction Algorithms for Scala. ACM Transactions on Software Engneering and Methodology 25, 1 (2015), 9:1-9:43. Google ScholarDigital Library
Dan Alistarh, Justin Kopinsky, Jerry Li, and Nir Shavit. 2015. The SprayList: A Scalable Relaxed Priority Queue. In Proceedings of the 20th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP '15). 11-20. Google ScholarDigital Library
Saddek Bensalem, Marius Bozga, Thanh-Hung Nguyen, and Joseph Sifakis. 2009. D-Finder: A Tool for Compositional Deadlock Detection and Verification. In Proceedings of the 21st International Conference on Computer Aided Verification (CAV '09). 614-619. Google ScholarDigital Library
Sebastian Burckhardt, Pravesh Kothari, Madanlal Musuvathi, and Santosh Nagarakatte. 2010. A Randomized Scheduler with Probabilistic Guarantees of Finding Bugs. In Proceedings of the 15th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS '10). 167-178. Google ScholarDigital Library
Rong Chen, Jiaxin Shi, Yanzhe Chen, and Haibo Chen. 2015. PowerLyra: Differentiated Graph Computation and Partitioning on Skewed Graphs. In Proceedings of the 10th European Conference on Computer Systems (EuroSys '15). 1:1-1:15. Google ScholarDigital Library
Avery Ching, Sergey Edunov, Maja Kabiljo, Dionysios Logothetis, and Sambavi Muthukrishnan. 2015. One Trillion Edges: Graph Processing at Facebook-scale. Proceedings of VLDB Endowment 8, 12 (2015), 1804- 1815. Google ScholarDigital Library
Ankit Choudhary, Shan Lu, and Michael Pradel. 2017. Efficient Detection of Thread Safety Violations via Coverage-guided Generation of Concurrent Tests. In Proceedings of the 39th International Conference on Software Engineering (ICSE '17). 266-277. Google ScholarDigital Library
Leonardo De Moura and Nikolaj Bjørner. 2008. Z3: An Efficient SMT Solver. In Proceedings of the 14th International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS'08). 337-340. Google ScholarDigital Library
Joseph Devietti, Brandon Lucia, Luis Ceze, and Mark Oskin. 2009. DMP: Deterministic Shared Memory Multiprocessing. In Proceedings of the 14th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS '09). 85-96. Google ScholarDigital Library
Dawson Engler and Ken Ashcraft. 2003. RacerX: effective, static detection of race conditions and deadlocks. In Proceedings of the 9th ACM Symposium on Operating Systems Principles (SOSP '03). 237-252. Google ScholarDigital Library
Diego Garbervetsky, Edgardo Zoppi, and Benjamin Livshits. 2017. Toward Full Elasticity in Distributed Static Analysis: The Case of Call-graph Analysis. In Proceedings of the 11th Joint Meeting on Foundations of Software Engineering (FSE '17). 442-453. Google ScholarDigital Library
Joseph E. Gonzalez, Yucheng Low, Haijie Gu, Danny Bickson, and Carlos Guestrin. 2012. PowerGraph: Distributed Graph-parallel Computation on Natural Graphs. In Proceedings of the 10th USENIX Conference on Operating Systems Design and Implementation (OSDI '12). 17-30. Google ScholarDigital Library
Joseph E. Gonzalez, Reynold S. Xin, Ankur Dave, Daniel Crankshaw, Michael J. Franklin, and Ion Stoica. 2014. GraphX: Graph Processing in a Distributed Dataflow Framework. In Proceedings of the 11th USENIX Conference on Operating Systems Design and Implementation (OSDI '14). 599-613. Google ScholarDigital Library
Michelle L. Goodstein, Shimin Chen, Phillip B. Gibbons, Michael A. Kozuch, and Todd C. Mowry. 2012. Chrysalis Analysis: Incorporating Synchronization Arcs in Dataflow-analysis-based Parallel Monitoring. In Proceedings of the 21st International Conference on Parallel Architectures and Compilation Techniques (PACT '12). 201-212. Google ScholarDigital Library
Jeff Huang, Peng Liu, and Charles Zhang. 2010. LEAP: Lightweight Deterministic Multi-processor Replay of Concurrent Java Programs. In Proceedings of the 18th ACM SIGSOFT International Symposium on Foundations of Software Engineering (FSE '10). 207-216. Google ScholarDigital Library
Jeff Huang, Patrick O'Neil Meredith, and Grigore Rosu. 2014. Maximal Sound Predictive Race Detection with Control Flow Abstraction. In Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI '14). 337-348. Google ScholarDigital Library
Jeff Huang and Charles Zhang. 2012. LEAN: Simplifying Concurrency Bug Reproduction via Replay-supported Execution Reduction. In Proceedings of the ACM International Conference on Object Oriented Programming Systems Languages and Applications (OOPSLA '12). 451-466. Google ScholarDigital Library
Jeff Huang, Charles Zhang, and Julian Dolby. 2013. CLAP: Recording Local Executions to Reproduce Concurrency Failures. In Proceedings of the 34th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI '13). 141-152. Google ScholarDigital Library
Shiyou Huang, Bowen Cai, and Jeff Huang. 2017. Towards Production-Run Heisenbugs Reproduction on Commercial Hardware. In Proceedings of the USENIX Annual Technical Conference (UNIENIX ATC '17). 403-115. Google ScholarDigital Library
Richard Johnson and Keshav Pingali. 1993. Dependence-based Program Analysis. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI '93). 78-89. Google ScholarDigital Library
Hang Liu and H. Howie Huang. 2015. Enterprise: breadth-first graph traversal on GPUs. In Proceedings of the IEEE International Conference for High Performance Computing, Networking, Storage and Analysis (SC '15). 1-12. Google ScholarDigital Library
Yucheng Low, Danny Bickson, Joseph Gonzalez, Carlos Guestrin, Aapo Kyrola, and Joseph M. Hellerstein. 2012. Distributed GraphLab: A Framework for Machine Learning and Data Mining in the Cloud. Proceedings of the VLDB Endowment 5, 8 (2012), 716-727. Google ScholarDigital Library
Shan Lu, Soyeon Park, Eunsoo Seo, and Yuanyuan Zhou. 2008. Learning from Mistakes: A Comprehensive Study on Real World Concurrency Bug Characteristics. In Proceedings of the 13th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS '08). ACM, New York, NY, USA, 329-339. Google ScholarDigital Library
Nuno Machado, Brandon Lucia, and Luís Rodrigues. 2015. Concurrency Debugging with Differential Schedule Projections. In Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI '15). 586-595. Google ScholarDigital Library
Nuno Machado, Brandon Lucia, and Luís Rodrigues. 2016. Production-guided Concurrency Debugging. In Proceedings of the 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP '16). 29:1-29:12. Google ScholarDigital Library
Grzegorz Malewicz, Matthew H. Austern, Aart J.C Bik, James C. Dehnert, Ilan Horn, Naty Leiser, and Grzegorz Czajkowski. 2010. Pregel: A System for Large-scale Graph Processing. In Proceedings of the ACM International Conference on Management of Data (SIGMOD '10). 135-146. Google ScholarDigital Library
Madanlal Musuvathi and Shaz Qadeer. 2007. Iterative Context Bounding for Systematic Testing of Multithreaded Programs. In Proceedings of the 28th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI '07). 446-455. Google ScholarDigital Library
Madanlal Musuvathi, Shaz Qadeer, Thomas Ball, Gerard Basler, Piramanayagam Arumuga Nainar, and Iulian Neamtiu. 2008. Finding and Reproducing Heisenbugs in Concurrent Programs. In Proceedings of the 8th USENIX Conference on Operating Systems Design and Implementation (OSDI '08). 267-280. Google ScholarDigital Library
Santosh Nagarakatte, Sebastian Burckhardt, Milo M.K. Martin, and Madanlal Musuvathi. 2012. Multicore Acceleration of Priority-based Schedulers for Concurrency Bug Detection. In Proceedings of the 33rd ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI '12). 543-554. Google ScholarDigital Library
NERC Steering Group. 2004. Technical Analysis of the August 14, 2003, Blackout: What Happened, Why, and What Did We Learn? Technical Report. North American Electric Reliability Council.Google Scholar
Donald Nguyen, Andrew Lenharth, and Keshav Pingali. 2013. A Lightweight Infrastructure for Graph Analytics. In Proceedings of the 24th ACM Symposium on Operating Systems Principles (SOSP '13). 456-471. Google ScholarDigital Library
Marek Olszewski, Jason Ansel, and Saman Amarasinghe. 2009. Kendo: Efficient Deterministic Multithreading in Software. In Proceedings of the 14th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS '09). 97-108. Google ScholarDigital Library
Soyeon Park, Shan Lu, and Yuanyuan Zhou. 2009. CTrigger: Exposing Atomicity Violation Bugs from Their Hiding Places. In Proceedings of the 14th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS '09). 25-36. Google ScholarDigital Library
Soyeon Park, Yuanyuan Zhou, Weiwei Xiong, Zuoning Yin, Rini Kaushik, Kyu H. Lee, and Shan Lu. 2009. PRES: probabilistic replay with execution sketching on multiprocessors. In Proceedings of the ACM International Symposium on Operating Systems Principles (SOSP '09). 177-192. Google ScholarDigital Library
Michael Reif, Michael Eichberg, Ben Hermann, Johannes Lerch, and Mira Mezini. 2016. Call Graph Construction for Java Libraries. In Proceedings of the 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering (FSE '16). 474-486. Google ScholarDigital Library
Torsten Robschink and Gregor Snelting. 2002. Efficient Path Conditions in Dependence Graphs. In Proceedings of the 24th International Conference on Software Engineering (ICSE '02). 478-488. Google ScholarDigital Library
Mahmoud Said, Chao Wang, Zijiang Yang, and Karem Sakallah. 2011. Generating Data Race Witnesses by an SMT-based Analysis. In Proceedings of the 3rd International Conference on NASA Formal Methods (NFM'11). 313-327. Google ScholarDigital Library
Julian Shun and Guy E. Blelloch. 2013. Ligra: A Lightweight Graph Processing Framework for Shared Memory. In Proceedings of the 18th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP '13). 135-146. Google ScholarDigital Library
Stephen Soltesz, Herbert Pötzl, Marc E. Fiuczynski, Andy Bavier, and Larry Peterson. 2007. Container-based Operating System Virtualization: A Scalable, High-performance Alternative to Hypervisors. In Proceedings of the ACM SIGOPS European Conference on Computer Systems (EuroSys '07). 275-287. Google ScholarDigital Library
Yices Solver. 2013. http://yices.csl.sri.com/. (2013).Google Scholar
SPARC Manual V9. 1994. http://pages.cs.wisc.edu/~fischer/cs701.f08/sparc.v9.pdf. (1994).Google Scholar
Yulei Sui, Peng Di, and Jingling Xue. 2016. Sparse Flow-sensitive Pointer Analysis for Multithreaded Programs. In Proceedings of the 2016 International Symposium on Code Generation and Optimization (CGO '16). 160-170. Google ScholarDigital Library
Tian Tan, Yue Li, and Jingling Xue. 2017. Efficient and Precise Points-to Analysis: Modeling the Heap by Merging Equivalent Automata. In Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI '17). 278-291. Google ScholarDigital Library
Carlos H. C. Teixeira, Alexandre J. Fonseca, Marco Serafini, Georgos Siganos, Mohammed J. Zaki, and Ashraf Aboulnaga. 2015. Arabesque: A System for Distributed Graph Mining. In Proceedings of the 25th Symposium on Operating Systems Principles (SOSP '15). 425-440. Google ScholarDigital Library
Aditya Thakur and R. Govindarajan. 2008. Comprehensive Path-sensitive Data-flow Analysis. In Proceedings of the 6th Annual IEEE/ACM International Symposium on Code Generation and Optimization (CGO '08). 55-63. Google ScholarDigital Library
Cheng Wang, Wei-Yu Chen, Youfeng Wu, Bratin Saha, and Ali-Reza Adl-Tabatabai. 2007. Code Generation and Optimization for Transactional Memory Constructs in an Unmanaged Language. In Proceedings of the International Symposium on Code Generation and Optimization (CGO '07). 34-48. Google ScholarDigital Library
Kai Wang, Aftab Hussain, Zhiqiang Zuo, Guoqing Xu, and Ardalan Amiri Sani. 2017. Graspan: A Single-machine Disk-based Graph System for Interprocedural Static Analyses of Large-scale Systems Code. In Proceedings of the 22nd International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS '17). 389-404. Google ScholarDigital Library
Ming Wu, Fan Yang, Jilong Xue, Wencong Xiao, Youshan Miao, Lan Wei, Haoxiang Lin, Yafei Dai, and Lidong Zhou. 2015. GraM: Scaling Graph Computation to the Trillions. In Proceedings of the 6th ACM Symposium on Cloud Computing (SoCC '15). 408-421. Google ScholarDigital Library
Chenning Xie, Rong Chen, Haibing Guan, Binyu Zang, and Haibo Chen. 2015. SYNC or ASYNC: Time to Fuse for Distributed Graph-parallel Computation. In Proceedings of the 20th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP '15). 194-204. Google ScholarDigital Library
Jie Yu, Satish Narayanasamy, Cristiano Pereira, and Gilles Pokam. 2012. Maple: A Coverage-driven Testing Tool for Multithreaded Programs. In Proceedings of the ACM International Conference on Object Oriented Programming Systems Languages and Applications (OOPSLA '12). 485-502. Google ScholarDigital Library
Long Zheng, Xiaofei Liao, Bingsheng He, Song Wu, and Hai Jin. 2015. On performance debugging of unnecessary lock contentions on multicore processors: a replay-based approach. In Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization (CGO '15). 56-67. Google ScholarCross Ref
Jingguo Zhou, Xiao Xiao, and Charles Zhang. 2012. Stride: Search-based deterministic replay in polynomial time via bounded linkage. In Proceedings of the 34th International Conference on Software Engineering (ICSE '12). 892-902. Google ScholarDigital Library
Andy Diwen Zhu, Xiaokui Xiao, Sibo Wang, and Wenqing Lin. 2013. Efficient Single-source Shortest Path and Distance Queries on Large Graphs. In Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '13). 998-1006. Google ScholarDigital Library

Index Terms

Scalable concurrency debugging with distributed graph processing
1. Social and professional topics
  1. Professional topics
    1. History of computing
      1. History of programming languages
2. Software and its engineering
  1. Software notations and tools
    1. General programming languages

Recommendations

D4: fast concurrency debugging with parallel differential analysis
PLDI '18

We present D4, a fast concurrency analysis framework that detects concurrency bugs (e.g., data races and deadlocks) interactively in the programming phase. As developers add, modify, and remove statements, the code changes are sent to D4 to detect ...
Read More
D4: fast concurrency debugging with parallel differential analysis
PLDI 2018: Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation

We present D4, a fast concurrency analysis framework that detects concurrency bugs (e.g., data races and deadlocks) interactively in the programming phase. As developers add, modify, and remove statements, the code changes are sent to D4 to detect ...
Read More
Scalable atomic visibility with RAMP transactions
SIGMOD '14: Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data

Databases can provide scalability by partitioning data across several servers. However, multi-partition, multi-operation transactional access is often expensive, employing coordination-intensive locking, validation, or scheduling mechanisms. Accordingly,...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

Published in
CGO 2018: Proceedings of the 2018 International Symposium on Code Generation and Optimization
February 2018
377 pages
ISBN:9781450356176
DOI:10.1145/3179541
General Chairs:
Jens Knoop
Vienna University of Technology, Austria
,
Markus Schordan
Lawrence Livermore National Laboratory, USA
,
Program Chairs:
Teresa Johnson
Google, USA
,
Michael O'Boyle
University of Edinburgh, UK
Copyright © 2018 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 24 February 2018
Permissions
Request permissions about this article.
Request Permissions

Check for updates
Author Tags
concurrency debugging
graph processing
multi-threaded programs
scalability
Qualifiers
- research-article
Conference

Acceptance Rates
Overall Acceptance Rate312of1,061submissions,29%
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 1
  Total Citations
  View Citations
- 443
  Total Downloads
- Downloads (Last 12 months)18
- Downloads (Last 6 weeks)3
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Scalable concurrency debugging with distributed graph processing

CGO 2018: Proceedings of the 2018 International Symposium on Code Generation and Optimization

ABSTRACT

References

Cited By

Index Terms

Recommendations

D4: fast concurrency debugging with parallel differential analysis

D4: fast concurrency debugging with parallel differential analysis

Scalable atomic visibility with RAMP transactions

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Acceptance Rates

Funding Sources

Other Metrics

Article Metrics

Other Metrics

Cited By

PDF Format

eReader

Digital Edition

Caption

Scalable concurrency debugging with distributed graph processing

CGO 2018: Proceedings of the 2018 International Symposium on Code Generation and Optimization

ABSTRACT

References

Cited By

Index Terms

Recommendations

D4: fast concurrency debugging with parallel differential analysis

D4: fast concurrency debugging with parallel differential analysis

Scalable atomic visibility with RAMP transactions

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Acceptance Rates

Funding Sources

Article Metrics

Other Metrics

PDF Format

eReader

Digital Edition

Share this Publication link

Share on Social Media