skip to main content
10.1145/1007912.1007933acmconferencesArticle/Chapter ViewAbstractPublication PagesspaaConference Proceedingsconference-collections
Article

On-the-fly maintenance of series-parallel relationships in fork-join multithreaded programs

Published: 27 June 2004 Publication History

Abstract

A key capability of data-race detectors is to determine whether one thread executes logically in parallel with another or whether the threads must operate in series. This paper provides two algorithms, one serial and one parallel, to maintain series-parallel (SP) relationships "on the fly" for fork-join multithreaded programs. The serial SP-order algorithm runs in O(1) amortized time per operation. In contrast, the previously best algorithm requires a time per operation that is proportional to Tarjan's functional inverse of Ackermann's function. SP-order employs an order-maintenance data structure that allows us to implement a more efficient "English-Hebrew" labeling scheme than was used in earlier race detectors, which immediately yields an improved determinacy-race detector. In particular, any fork-join program running in T1 time on a single processor can be checked on the fly for determinacy races in O(T1) time. Corresponding improved bounds can also be obtained for more sophisticated data-race detectors, for example, those that use locks.By combining SP-order with Feng and Leiserson's serial SP-bags algorithm, we obtain a parallel SP-maintenance algorithm, called SP-hybrid. Suppose that a fork-join program has n threads, T1 work, and a critical-path length of T. When executed on P processors, we prove that SP-hybrid runs in O((T1/P +PT,/i>∞)lg <i.n) expected time. To understand this bound, consider that the original program obtains linear speed-up over a 1-processor execution when P=O(T1T). In contrast, SP-hybrid obtains linear speed-up when P=O(√T1T), but the work is increased by a factor of O(lg n).

References

[1]
S. Abiteboul, H. Kaplan, and T. Milo. Compact labeling schemes for ancestor queries. In Proceedings of the ACM-SIAM Symposium on Discrete Algorithms, pages 547--556, 2001.
[2]
S. Alstrup, P. Bille, and T. Rauhe. Labeling schemes for small distances in trees. In Proceedings of the ACM-SIAM Symposium on Discrete Algorithms, pages 689--698, 2003.
[3]
S. Alstrup, C. Gavoille, H. Kaplan, and T. Rauhe. Nearest common ancestors: a survey and a new distributed algorithm. In Proceedings of the ACM Symposium on Parallel Algorithms and Architectures, pages 258--264, 2002.
[4]
S. Alstrup, J. Holm, K. de Lichtenberg, and M. Thorup. Direct routing on trees. In Proceedings of the ACM-SIAM Symposium on Discrete Algorithms, pages 342--349, 1998.
[5]
S. Alstrup and T. Rauhe. Improved labeling scheme for ancestor queries. In Proceedings of the ACM-SIAM Symposium on Discrete Algorithms, pages 947--953, 2002.
[6]
R. J. Anderson and H. Woll. Wait-free parallel algorithms for the union-find problem. In Proceedings of the ACM Symposium on the Theory of Computing, pages 370--380, 1991.
[7]
A. Andersson and O. Petersson. Approximate indexed lists. Journal of Algorithms, 29:256--276, 1998.
[8]
M. Arias, L. J. Cowen, and K. A. Laing. Compact roundtrip routing with topology-independent node names. In Proceedings of the ACM SIGACT-SIGOPS Symposium on Principles of Distributed Computing, pages 43--52, 2003.
[9]
N. Arora, R. Blumofe, and G. Plaxton. Thread scheduling for multi-programmed multiprocessors. In Proceedings of the ACM Symposium on Parallel Algorithms and Architectures, pages 119--129, 1998.
[10]
M. A. Bender, R. Cole, E. D. Demaine, M. Farach-Colton, and J. Zito. Two simplified algorithms for maintaining order in a list. In Proceedings of the European Syposium on Algorithms, pages 152--164, 2002.
[11]
R. D. Blumofe, C. F. Joerg, B. C. Kuszmaul, C. E. Leiserson, K. H. Randall, and Y. Zhou. Cilk: An efficient multithreaded runtime system. In Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pages 207--216, Santa Barbara, California, July 1995.
[12]
R. D. Blumofe and C. E. Leiserson. Scheduling multithreaded computations by work stealing. J. ACM, 46(5):720--748, 1999.
[13]
G.-I. Cheng, M. Feng, C. E. Leiserson, K. H. Randall, and A. F. Stark. Detecting data races in Cilk programs that use locks. In Proceedings of the ACM Symposium on Parallel Algorithms and Architectures, Puerto Vallarta, Mexico, June 1998.
[14]
T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein. Introduction to Algorithms. The MIT Press and McGraw-Hill, second edition, 2001.
[15]
P. F. Dietz. Maintaining order in a linked list. In Proceedings of the ACM Symposium on the Theory of Computing, pages 122--127, May 1982.
[16]
P. F. Dietz, J. I. Seiferas, and J. Zhang. A tight lower bound for online monotonic list labeling. In Proceedings of the Scandinavian Workshop on Algorithm Theory, volume 824 of Lecture Notes in Computer Science, pages 131--142, July 1994.
[17]
P. F. Dietz and D. D. Sleator. Two algorithms for maintaining order in a list. In Proceedings of the ACM Symposium on the Theory of Computing, pages 365--372, May 1987.
[18]
P. F. Dietz and J. Zhang. Lower bounds for monotonic list labeling. In Proceedings of the Scandinavian Workshop on Algorithm Theory, volume 447 of Lecture Notes in Computer Science, July 1990.
[19]
A. Dinning and E. Schonberg. An empirical comparison of monitoring algorithms for access anomaly detection. In Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pages 1--10, 1990.
[20]
M. Feng and C. E. Leiserson. Efficient detection of determinacy races in Cilk programs. In Proceedings of the ACM Symposium on Parallel Algorithms and Architectures, pages 1--11, Newport, Rhode Island, June 1997.
[21]
M. Frigo, C. E. Leiserson, and K. H. Randall. The implementation of the Cilk-5 multithreaded language. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, pages 212--223, 1998.
[22]
C. Gavoille, D. Peleg, S. Pérennes, and R. Raz. Distance labeling in graphs. In Proceedings of the ACM-SIAM Symposium on Discrete Algorithms, pages 210--219, 2001.
[23]
A. Itai, A. G. Konheim, and M. Rodeh. A sparse table implementation of priority queues. In S. Even and O. Kariv, editors, Proceedings of the Colloquium on Automata, Languages, and Programming, volume 115 of Lecture Notes in Computer Science, pages 417--431, Acre (Akko), Israel, July 1981.
[24]
H. Kaplan, T. Milo, and R. Shabo. A comparison of labeling schemes for ancestor queries. In Proceedings of the ACM-SIAM Symposium on Discrete Algorithms, pages 954--963, 2002.
[25]
M. Katz, N. A. Katz, A. Korman, and D. Peleg. Labeling schemes for flow and connectivity. In Proceedings of the ACM-SIAM Symposium on Discrete Algorithms, pages 927--936, 2002.
[26]
J. Mellor-Crummey. On-the-fly detection of data races for programs with nested fork-join parallelism. In Proceedings of Supercomputing, pages 24--33, 1991.
[27]
I. Nudler and L. Rudolph. Tools for the efficient development of efficient parallel programs. In Proceedings of the First Israeli Conference on Computer Systems Engineering, May 1986.
[28]
Supercomputing Technologies Group, MIT Laboratory for Computer Science. Cilk 5.3.2 Reference Manual, November 2001.
[29]
R. E. Tarjan. Efficiency of a good but not linear set union algorithm. Journal of the ACM, 22(2):215--225, April 1975.
[30]
R. E. Tarjan. Applications of path compression on balanced trees. Journal of the ACM, 26(4):690--715, October 1979.
[31]
R. E. Tarjan. Data Structures and Network Algorithms. Society for Industrial and Applied Mathematics, 1983.
[32]
M. Thorup and U. Zwick. Compact routing schemes. In Proceedings of the ACMSymposium on Parallel Algorithms and Architectures, pages 1--10, 2001.
[33]
A. K. Tsakalidis. Maintaining order in a generalized linked list. Acta Informatica, 21(1):101--112, May 1984.
[34]
D. E. Willard. Inserting and deleting records in blocked sequential files. Technical Report TM81-45193-5, Bell Laboratories, 1981.
[35]
D. E. Willard. Maintaining dense sequential files in a dynamic en-vironment. In Proceedings of the ACM Symposium on the Theory of Computing, pages 114--121, San Francisco, California, May 1982.
[36]
D. E. Willard. Good worst-case algorithms for inserting and deleting records in dense sequential files. In Proceedings of the ACM International Conference on Management of Data, pages 251--260, Wash-ington, D.C., May 1986.
[37]
D. E. Willard. A density control algorithm for doing insertions and deletions in a sequentially ordered file in good worst-case time. Information and Computation, 97(2):150--204, April 1992.

Cited By

View all
  • (2024)Language-Agnostic Static Deadlock Detection for FuturesProceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming10.1145/3627535.3638487(68-79)Online publication date: 2-Mar-2024
  • (2022)PINT: Parallel INTerval-Based Race Detector2022 IEEE International Parallel and Distributed Processing Symposium (IPDPS)10.1109/IPDPS53621.2022.00087(850-861)Online publication date: May-2022
  • (2021)Atomic power in forksProceedings of the Thirty-Second Annual ACM-SIAM Symposium on Discrete Algorithms10.5555/3458064.3458192(2141-2153)Online publication date: 10-Jan-2021
  • Show More Cited By

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Conferences
SPAA '04: Proceedings of the sixteenth annual ACM symposium on Parallelism in algorithms and architectures
June 2004
332 pages
ISBN:1581138407
DOI:10.1145/1007912
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 27 June 2004

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. Cilk
  2. SP-bags
  3. SP-hybrid
  4. SP-order
  5. algorithm
  6. amortized analysis
  7. data race
  8. data structure
  9. dynamic set
  10. fork-join
  11. graph
  12. least common ancestor
  13. locking
  14. multi-threading
  15. mutual exclusion
  16. on the fly
  17. order maintenance
  18. parallel computing
  19. parse tree
  20. race detection
  21. series-parallel
  22. thread
  23. trace
  24. tree
  25. work stealing

Qualifiers

  • Article

Conference

SPAA04

Acceptance Rates

Overall Acceptance Rate 447 of 1,461 submissions, 31%

Upcoming Conference

SPAA '25
37th ACM Symposium on Parallelism in Algorithms and Architectures
July 28 - August 1, 2025
Portland , OR , USA

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)9
  • Downloads (Last 6 weeks)0
Reflects downloads up to 22 Feb 2025

Other Metrics

Citations

Cited By

View all
  • (2024)Language-Agnostic Static Deadlock Detection for FuturesProceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming10.1145/3627535.3638487(68-79)Online publication date: 2-Mar-2024
  • (2022)PINT: Parallel INTerval-Based Race Detector2022 IEEE International Parallel and Distributed Processing Symposium (IPDPS)10.1109/IPDPS53621.2022.00087(850-861)Online publication date: May-2022
  • (2021)Atomic power in forksProceedings of the Thirty-Second Annual ACM-SIAM Symposium on Discrete Algorithms10.5555/3458064.3458192(2141-2153)Online publication date: 10-Jan-2021
  • (2021)iGUARDProceedings of the ACM SIGOPS 28th Symposium on Operating Systems Principles10.1145/3477132.3483545(49-65)Online publication date: 26-Oct-2021
  • (2020)ScoRDProceedings of the ACM/IEEE 47th Annual International Symposium on Computer Architecture10.1109/ISCA45697.2020.00088(1036-1049)Online publication date: 30-May-2020
  • (2019)Processor-Oblivious Record and ReplayACM Transactions on Parallel Computing10.1145/33656596:4(1-28)Online publication date: 17-Dec-2019
  • (2019)Efficient race detection with futuresProceedings of the 24th Symposium on Principles and Practice of Parallel Programming10.1145/3293883.3295732(340-354)Online publication date: 16-Feb-2019
  • (2019)Model-checking task-parallel programs for data-raceInnovations in Systems and Software Engineering10.1007/s11334-019-00343-5Online publication date: 18-May-2019
  • (2019)Optimized Sound and Complete Data Race Detection in Structured Parallel ProgramsLanguages and Compilers for Parallel Computing10.1007/978-3-030-34627-0_8(94-111)Online publication date: 13-Nov-2019
  • (2018)Race detection and reachability in nearly series-parallel DAGsProceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms10.5555/3174304.3175277(156-171)Online publication date: 7-Jan-2018
  • Show More Cited By

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media