Article

On-the-fly maintenance of series-parallel relationships in fork-join multithreaded programs

Authors:

Michael A. Bender,

Jeremy T. Fineman,

Charles E. LeisersonAuthors Info & Claims

SPAA '04: Proceedings of the sixteenth annual ACM symposium on Parallelism in algorithms and architectures

Pages 133 - 144

https://doi.org/10.1145/1007912.1007933

Published: 27 June 2004 Publication History

Abstract

A key capability of data-race detectors is to determine whether one thread executes logically in parallel with another or whether the threads must operate in series. This paper provides two algorithms, one serial and one parallel, to maintain series-parallel (SP) relationships "on the fly" for fork-join multithreaded programs. The serial SP-order algorithm runs in O(1) amortized time per operation. In contrast, the previously best algorithm requires a time per operation that is proportional to Tarjan's functional inverse of Ackermann's function. SP-order employs an order-maintenance data structure that allows us to implement a more efficient "English-Hebrew" labeling scheme than was used in earlier race detectors, which immediately yields an improved determinacy-race detector. In particular, any fork-join program running in T₁ time on a single processor can be checked on the fly for determinacy races in O(T₁) time. Corresponding improved bounds can also be obtained for more sophisticated data-race detectors, for example, those that use locks.By combining SP-order with Feng and Leiserson's serial SP-bags algorithm, we obtain a parallel SP-maintenance algorithm, called SP-hybrid. Suppose that a fork-join program has n threads, T₁ work, and a critical-path length of T_∞. When executed on P processors, we prove that SP-hybrid runs in O((T₁/P +PT,/i>∞)lg <i.n) expected time. To understand this bound, consider that the original program obtains linear speed-up over a 1-processor execution when P=O(T₁T_∞). In contrast, SP-hybrid obtains linear speed-up when P=O(√T₁T_∞), but the work is increased by a factor of O(lg n).

References

[1]

S. Abiteboul, H. Kaplan, and T. Milo. Compact labeling schemes for ancestor queries. In Proceedings of the ACM-SIAM Symposium on Discrete Algorithms, pages 547--556, 2001.

Digital Library

[2]

S. Alstrup, P. Bille, and T. Rauhe. Labeling schemes for small distances in trees. In Proceedings of the ACM-SIAM Symposium on Discrete Algorithms, pages 689--698, 2003.

Digital Library

[3]

S. Alstrup, C. Gavoille, H. Kaplan, and T. Rauhe. Nearest common ancestors: a survey and a new distributed algorithm. In Proceedings of the ACM Symposium on Parallel Algorithms and Architectures, pages 258--264, 2002.

Digital Library

[4]

S. Alstrup, J. Holm, K. de Lichtenberg, and M. Thorup. Direct routing on trees. In Proceedings of the ACM-SIAM Symposium on Discrete Algorithms, pages 342--349, 1998.

Digital Library

[5]

S. Alstrup and T. Rauhe. Improved labeling scheme for ancestor queries. In Proceedings of the ACM-SIAM Symposium on Discrete Algorithms, pages 947--953, 2002.

Digital Library

[6]

R. J. Anderson and H. Woll. Wait-free parallel algorithms for the union-find problem. In Proceedings of the ACM Symposium on the Theory of Computing, pages 370--380, 1991.

Digital Library

[7]

A. Andersson and O. Petersson. Approximate indexed lists. Journal of Algorithms, 29:256--276, 1998.

Digital Library

[8]

M. Arias, L. J. Cowen, and K. A. Laing. Compact roundtrip routing with topology-independent node names. In Proceedings of the ACM SIGACT-SIGOPS Symposium on Principles of Distributed Computing, pages 43--52, 2003.

Digital Library

[9]

N. Arora, R. Blumofe, and G. Plaxton. Thread scheduling for multi-programmed multiprocessors. In Proceedings of the ACM Symposium on Parallel Algorithms and Architectures, pages 119--129, 1998.

Digital Library

[10]

M. A. Bender, R. Cole, E. D. Demaine, M. Farach-Colton, and J. Zito. Two simplified algorithms for maintaining order in a list. In Proceedings of the European Syposium on Algorithms, pages 152--164, 2002.

Digital Library

[11]

R. D. Blumofe, C. F. Joerg, B. C. Kuszmaul, C. E. Leiserson, K. H. Randall, and Y. Zhou. Cilk: An efficient multithreaded runtime system. In Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pages 207--216, Santa Barbara, California, July 1995.

Digital Library

[12]

R. D. Blumofe and C. E. Leiserson. Scheduling multithreaded computations by work stealing. J. ACM, 46(5):720--748, 1999.

Digital Library

[13]

G.-I. Cheng, M. Feng, C. E. Leiserson, K. H. Randall, and A. F. Stark. Detecting data races in Cilk programs that use locks. In Proceedings of the ACM Symposium on Parallel Algorithms and Architectures, Puerto Vallarta, Mexico, June 1998.

Digital Library

[14]

T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein. Introduction to Algorithms. The MIT Press and McGraw-Hill, second edition, 2001.

Digital Library

[15]

P. F. Dietz. Maintaining order in a linked list. In Proceedings of the ACM Symposium on the Theory of Computing, pages 122--127, May 1982.

Digital Library

[16]

P. F. Dietz, J. I. Seiferas, and J. Zhang. A tight lower bound for online monotonic list labeling. In Proceedings of the Scandinavian Workshop on Algorithm Theory, volume 824 of Lecture Notes in Computer Science, pages 131--142, July 1994.

Digital Library

[17]

P. F. Dietz and D. D. Sleator. Two algorithms for maintaining order in a list. In Proceedings of the ACM Symposium on the Theory of Computing, pages 365--372, May 1987.

Digital Library

[18]

P. F. Dietz and J. Zhang. Lower bounds for monotonic list labeling. In Proceedings of the Scandinavian Workshop on Algorithm Theory, volume 447 of Lecture Notes in Computer Science, July 1990.

Digital Library

[19]

A. Dinning and E. Schonberg. An empirical comparison of monitoring algorithms for access anomaly detection. In Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pages 1--10, 1990.

Digital Library

[20]

M. Feng and C. E. Leiserson. Efficient detection of determinacy races in Cilk programs. In Proceedings of the ACM Symposium on Parallel Algorithms and Architectures, pages 1--11, Newport, Rhode Island, June 1997.

Digital Library

[21]

M. Frigo, C. E. Leiserson, and K. H. Randall. The implementation of the Cilk-5 multithreaded language. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, pages 212--223, 1998.

Digital Library

[22]

C. Gavoille, D. Peleg, S. Pérennes, and R. Raz. Distance labeling in graphs. In Proceedings of the ACM-SIAM Symposium on Discrete Algorithms, pages 210--219, 2001.

Digital Library

[23]

A. Itai, A. G. Konheim, and M. Rodeh. A sparse table implementation of priority queues. In S. Even and O. Kariv, editors, Proceedings of the Colloquium on Automata, Languages, and Programming, volume 115 of Lecture Notes in Computer Science, pages 417--431, Acre (Akko), Israel, July 1981.

Digital Library

[24]

H. Kaplan, T. Milo, and R. Shabo. A comparison of labeling schemes for ancestor queries. In Proceedings of the ACM-SIAM Symposium on Discrete Algorithms, pages 954--963, 2002.

Digital Library

[25]

M. Katz, N. A. Katz, A. Korman, and D. Peleg. Labeling schemes for flow and connectivity. In Proceedings of the ACM-SIAM Symposium on Discrete Algorithms, pages 927--936, 2002.

Digital Library

[26]

J. Mellor-Crummey. On-the-fly detection of data races for programs with nested fork-join parallelism. In Proceedings of Supercomputing, pages 24--33, 1991.

Digital Library

[27]

I. Nudler and L. Rudolph. Tools for the efficient development of efficient parallel programs. In Proceedings of the First Israeli Conference on Computer Systems Engineering, May 1986.

[28]

Supercomputing Technologies Group, MIT Laboratory for Computer Science. Cilk 5.3.2 Reference Manual, November 2001.

[29]

R. E. Tarjan. Efficiency of a good but not linear set union algorithm. Journal of the ACM, 22(2):215--225, April 1975.

Digital Library

[30]

R. E. Tarjan. Applications of path compression on balanced trees. Journal of the ACM, 26(4):690--715, October 1979.

Digital Library

[31]

R. E. Tarjan. Data Structures and Network Algorithms. Society for Industrial and Applied Mathematics, 1983.

Digital Library

[32]

M. Thorup and U. Zwick. Compact routing schemes. In Proceedings of the ACMSymposium on Parallel Algorithms and Architectures, pages 1--10, 2001.

Digital Library

[33]

A. K. Tsakalidis. Maintaining order in a generalized linked list. Acta Informatica, 21(1):101--112, May 1984.

Digital Library

[34]

D. E. Willard. Inserting and deleting records in blocked sequential files. Technical Report TM81-45193-5, Bell Laboratories, 1981.

[35]

D. E. Willard. Maintaining dense sequential files in a dynamic en-vironment. In Proceedings of the ACM Symposium on the Theory of Computing, pages 114--121, San Francisco, California, May 1982.

Digital Library

[36]

D. E. Willard. Good worst-case algorithms for inserting and deleting records in dense sequential files. In Proceedings of the ACM International Conference on Management of Data, pages 251--260, Wash-ington, D.C., May 1986.

Digital Library

[37]

D. E. Willard. A density control algorithm for doing insertions and deletions in a sequentially ordered file in good worst-case time. Information and Computation, 97(2):150--204, April 1992.

Digital Library

Cited By

Muller SLee IChabbi MSteuwer M(2024)Language-Agnostic Static Deadlock Detection for FuturesProceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming10.1145/3627535.3638487(68-79)Online publication date: 2-Mar-2024
https://dl.acm.org/doi/10.1145/3627535.3638487
Xu YZhou AAgrawal KLee I(2022)PINT: Parallel INTerval-Based Race Detector2022 IEEE International Parallel and Distributed Processing Symposium (IPDPS)10.1109/IPDPS53621.2022.00087(850-861)Online publication date: May-2022
https://doi.org/10.1109/IPDPS53621.2022.00087
Goodrich MJacob RSitchinava NMarx D(2021)Atomic power in forksProceedings of the Thirty-Second Annual ACM-SIAM Symposium on Discrete Algorithms10.5555/3458064.3458192(2141-2153)Online publication date: 10-Jan-2021
https://dl.acm.org/doi/10.5555/3458064.3458192
Show More Cited By

Index Terms

Recommendations

Provably Good and Practically Efficient Parallel Race Detection for Fork-Join Programs
SPAA '16: Proceedings of the 28th ACM Symposium on Parallelism in Algorithms and Architectures

If a parallel program has determinacy race(s), different schedules can result in memory accesses that observe different values --- various race-detection tools have been designed to find such bugs. A key component of race detectors is an algorithm for ...
Nested parallelism in transactional memory
PPoPP '08: Proceedings of the 13th ACM SIGPLAN Symposium on Principles and practice of parallel programming

This paper investigates adding transactions with nested parallelism and nested transactions to a dynamically multithreaded parallel programming language that generates only series-parallel programs. We describe XConflict, a data structure that ...
Helper locks for fork-join parallel programming
PPoPP '10

Helper locks allow programs with large parallel critical sections, called parallel regions, to execute more efficiently by enlisting processors that might otherwise be waiting on the helper lock to aid in the execution of the parallel region. Suppose ...

Comments

Information & Contributors

Information

Published In

cover image ACM Conferences

SPAA '04: Proceedings of the sixteenth annual ACM symposium on Parallelism in algorithms and architectures

June 2004

332 pages

ISBN:1581138407

DOI:10.1145/1007912

General Chair:
Phil Gibbons
Intel Research
,
Program Chair:
Micah Adler
University of Massachusetts

Copyright © 2004 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 27 June 2004

Permissions

Request permissions for this article.

Request Permissions

Check for updates

Author Tags

Qualifiers

Article

Conference

SPAA04

Sponsor:

SPAA04: 16th ACM Symposium on Parallelism in Algorithms and Architectures 2004

June 27 - 30, 2004

Barcelona, Spain

Acceptance Rates

Overall Acceptance Rate 447 of 1,461 submissions, 31%

Upcoming Conference

SPAA '25

Sponsor:
sigact
sigact

37th ACM Symposium on Parallelism in Algorithms and Architectures

July 28 - August 1, 2025

Portland , OR , USA

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

71
Total Citations
View Citations
576
Total Downloads

Downloads (Last 12 months)9
Downloads (Last 6 weeks)0

Reflects downloads up to 22 Feb 2025

Other Metrics

View Author Metrics

Citations

Cited By

Muller SLee IChabbi MSteuwer M(2024)Language-Agnostic Static Deadlock Detection for FuturesProceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming10.1145/3627535.3638487(68-79)Online publication date: 2-Mar-2024
https://dl.acm.org/doi/10.1145/3627535.3638487
Xu YZhou AAgrawal KLee I(2022)PINT: Parallel INTerval-Based Race Detector2022 IEEE International Parallel and Distributed Processing Symposium (IPDPS)10.1109/IPDPS53621.2022.00087(850-861)Online publication date: May-2022
https://doi.org/10.1109/IPDPS53621.2022.00087
Goodrich MJacob RSitchinava NMarx D(2021)Atomic power in forksProceedings of the Thirty-Second Annual ACM-SIAM Symposium on Discrete Algorithms10.5555/3458064.3458192(2141-2153)Online publication date: 10-Jan-2021
https://dl.acm.org/doi/10.5555/3458064.3458192
Kamath ABasu A(2021)iGUARDProceedings of the ACM SIGOPS 28th Symposium on Operating Systems Principles10.1145/3477132.3483545(49-65)Online publication date: 26-Oct-2021
https://dl.acm.org/doi/10.1145/3477132.3483545
Kamath AGeorge ABasu AMartínez JDuato JEeckhout L(2020)ScoRDProceedings of the ACM/IEEE 47th Annual International Symposium on Computer Architecture10.1109/ISCA45697.2020.00088(1036-1049)Online publication date: 30-May-2020
https://dl.acm.org/doi/10.1109/ISCA45697.2020.00088
Utterback RAgrawal KLee IKulkarni M(2019)Processor-Oblivious Record and ReplayACM Transactions on Parallel Computing10.1145/33656596:4(1-28)Online publication date: 17-Dec-2019
https://dl.acm.org/doi/10.1145/3365659
Utterback RAgrawal KFineman JLee IHollingsworth JKeidar I(2019)Efficient race detection with futuresProceedings of the 24th Symposium on Principles and Practice of Parallel Programming10.1145/3293883.3295732(340-354)Online publication date: 16-Feb-2019
https://dl.acm.org/doi/10.1145/3293883.3295732
Nakade RMercer EAldous PStorey KOgles BHooker JPowell SMcCarthy J(2019)Model-checking task-parallel programs for data-raceInnovations in Systems and Software Engineering10.1007/s11334-019-00343-5Online publication date: 18-May-2019
https://doi.org/10.1007/s11334-019-00343-5
Storey KPowell JOgles BHooker JAldous PMercer E(2019)Optimized Sound and Complete Data Race Detection in Structured Parallel ProgramsLanguages and Compilers for Parallel Computing10.1007/978-3-030-34627-0_8(94-111)Online publication date: 13-Nov-2019
https://doi.org/10.1007/978-3-030-34627-0_8
Agrawal KDevietti JFineman JLee IUtterback RXu CCzumaj A(2018)Race detection and reachability in nearly series-parallel DAGsProceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms10.5555/3174304.3175277(156-171)Online publication date: 7-Jan-2018
https://dl.acm.org/doi/10.5555/3174304.3175277
Show More Cited By

View Options

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Figures

Tables

Media

View Table of Conten