research-article

A divide-and-conquer parallel pattern implementation for multicores

Authors:

Marco Danelutto,

Tiziano De Matteis,

Gabriele Mencagli,

Massimo TorquatiAuthors Info & Claims

SEPS 2016: Proceedings of the 3rd International Workshop on Software Engineering for Parallel Systems

Pages 10 - 19

https://doi.org/10.1145/3002125.3002128

Published: 21 October 2016 Publication History

Abstract

Divide-and-Conquer (DaC) is a sequential programming paradigm which models a large class of algorithms used in real-life applications. Although suitable to extract parallelism in a straightforward way, the parallel implementation of DaC algorithms still requires some expertise in parallel programming tools by the programmer.

In this paper we aim at providing to non-expert programmers a high-level solution for fast prototyping parallel DaC programs on multicores with minimal programming effort.

Following the rationale of parallel design pattern methodology, we design a C++11-compliant template interface for developing parallel DaC programs. The interface is implemented using different back-end frameworks (i.e. OpenMP, Intel TBB and FastFlow) supporting source code reuse and a certain amount of performance portability.

Experiments on a 24-core Intel server show the effectiveness of our approach: with a reduced programming effort the programmer easily prototypes parallel versions with performance comparable with hand-made parallelizations.

References

[1]

M. Aldinucci, M. Danelutto, P. Kilpatrick, M. Meneghin, and M. Torquati. An efficient unbounded lock-free queue for multi-core systems. In Proc. of 18th Intl. Euro-Par 2012 Parallel Processing, volume 7484 of LNCS, pages 662–673, Rhodes Island, Greece, Aug. 2012. Springer.

Digital Library

[2]

65.

[3]

E. Ayguadé, N. Copty, A. Duran, J. Hoeflinger, Y. Lin, F. Massaioli, E. Su, P. Unnikrishnan, and G. Zhang. A Proposal for Task Parallelism in OpenMP, pages 1–12. Springer Berlin Heidelberg, Berlin, Heidelberg, 2008. ISBN 978-3- 540-69303-1. 1.

Digital Library

[4]

G. E. Blelloch, R. A. Chowdhury, P. B. Gibbons, V. Ramachandran, S. Chen, and M. Kozuch. Provably good multicore cache performance for divide-and-conquer algorithms. In Proceedings of the Nineteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA ’08, pages 501–510, Philadelphia, PA, USA, 2008. Society for Industrial and Applied Mathematics.

Digital Library

[5]

D. Buono, M. Danelutto, T. D. Matteis, G. Mencagli, and M. Torquati. A lightweight run-time support for fast dense linear algebra on multi-core. In Proc. of the 12th International Conference on Parallel and Distributed Computing and Networks (PDCN 2014). IASTED, ACTA press, Feb. 2014.

[6]

M. Cole. Algorithmic Skeletons: Structured Management of Parallel Computation. MIT Press, Cambridge, MA, USA, 1988. ISBN 0-262-53086-4.

Digital Library

[7]

M. Cole. Bringing skeletons out of the closet: A pragmatic manifesto for skeletal parallel programming. Parallel Comput., 30(3):389–406, Mar. 2004. ISSN 0167-8191.

Digital Library

[8]

T. H. Cormen, C. Stein, R. L. Rivest, and C. E. Leiserson. Introduction to Algorithms. McGraw-Hill Higher Education, 2nd edition, 2001. ISBN 0070131511.

Digital Library

[9]

M. Danelutto and M. Torquati. Structured parallel programming with ”core” fastflow. In V. Zs´ok, Z. Horváth, and L. Csat´o, editors, Central European Functional Programming School, volume 8606 of LNCS, pages 29–75. Springer, 2015. ISBN 978-3-319-15939-3. 978-3-319-15940-9 2.

[10]

D. del Rio Astorga, M. F. Dolz, L. M. Sanchez, and J. D. Garc´ıa. Discovering Pipeline Parallel Patterns in Sequential Legacy C++ Codes. In Procs of the 7th Int’l Workshop on Progr. Models and Applications for Multicores and Manycores, PMAM’16, pages 11–19, NY, USA, 2016. ACM. ISBN 978-1-4503-4196-7.

Digital Library

[11]

I. Dorta, C. Leon, C. Rodriguez, and A. Rojas. Parallel skeletons for divide-and-conquer and branch-and-bound techniques. In Parallel, Distributed and Network-Based Processing, 2003. Proceedings. Eleventh Euromicro Conference on, pages 292–298, Feb 2003.

[12]

1183602.

[13]

A. Duran, J. Corbalán, and E. Ayguadé. An adaptive cut-off for task parallelism. In Proceedings of the 2008 ACM/IEEE Conference on Supercomputing, SC ’08, pages 36:1–36:11, Piscataway, NJ, USA, 2008. IEEE Press. ISBN 978-1-4244- 2835-9.

Digital Library

[14]

A. Duran, J. Corbalán, and E. Ayguadé. Evaluation of openmp task scheduling strategies. In Proceedings of the 4th International Conference on OpenMP in a New Era of Parallelism, IWOMP’08, pages 100–110, Berlin, Heidelberg, 2008. Springer-Verlag. ISBN 3-540-79560-X, 978-3-540- 79560-5.

Digital Library

[15]

A. Fonseca and B. Cabral. Evaluation of runtime cut-off approaches for parallel programs. In Proceedings of the 12th International Meeting on High Performance Computing for Computational Science (VECPAR 2016). Springer, 2016. to appear.

[16]

M. Frigo, C. E. Leiserson, and K. H. Randall. The implementation of the cilk-5 multithreaded language. SIGPLAN Not., 33(5):212–223, May 1998. ISSN 0362-1340.

Digital Library

[17]

C. H. Gonzalez and B. B. Fraguela. A generic algorithm template for divide-and-conquer in multicore systems. In Proceedings of the 2010 IEEE 12th International Conference on High Performance Computing and Communications, HPCC ’10, pages 79–88, Washington, DC, USA, 2010.

Digital Library

[18]

IEEE Computer Society. ISBN 978-0-7695-4214-0.

[19]

C. A. Herrmann and C. Lengauer. A higher-order language for divide-and-conquer. Parallel Processing Letters, 10(02n03): 239–250, 2000.

[20]

Intel R ⃝Threading Building Blocks (Intel R ⃝TBB) Developer Guide. Intel R ⃝, 2016. Available at: https://www.threadingbuildingblocks.org/docs/ help/tbb_userguide/Design_Patterns/Divide_and_ Conquer.html.

[21]

M. Leyton and J. Piquer. Skandium: Multi-core programming with algorithmic skeletons. In Parallel, Distributed and Network-Based Processing (PDP), 2010 18th Euromicro International Conference on, pages 289–296, Feb 2010.

Digital Library

[22]

T. Mattson, B. Sanders, and B. Massingill. Patterns for Parallel Programming. Addison-Wesley Professional, first edition, 2004. ISBN 0321228111.

Digital Library

[23]

G. Mencagli, M. Vanneschi, and E. Vespa. Control-theoretic adaptation strategies for autonomic reconfigurable parallel applications on cloud environments. In High Performance Computing and Simulation (HPCS), 2013 International Conference on, pages 11–18, July 2013.

[24]

6641387.

[25]

G. Mencagli, M. Vanneschi, and E. Vespa. A cooperative predictive control approach to improve the reconfiguration stability of adaptive distributed parallel applications. ACM Trans. Auton. Adapt. Syst., 9(1):2:1–2:27, Mar. 2014. ISSN 1556-4665.

Digital Library

[26]

K. Molitorisz, T. Müller, and W. F. Tichy. Patty: A patternbased parallelization tool for the multicore age. In Proceedings of the Sixth International Workshop on Programming Models and Applications for Multicores and Manycores, PMAM ’15, pages 153–163, New York, NY, USA, 2015.

Digital Library

[27]

ACM. ISBN 978-1-4503-3404-4.

[28]

2712392.

[29]

2712392.

[30]

OpenMP Architecture Review Board. Openmp application program interface, 2011.

[31]

M. Poldner and H. Kuchen. Task parallel skeletons for divide and conquer. In Proceedings of the Workshop of the Working Group Programming Languages and Computing Concepts of the German Computer Science Association GI, Bad Honnef, 2008.

[32]

J. Reinders. Intel Threading Building Blocks. O’Reilly & Associates, Inc., Sebastopol, CA, USA, first edition, 2007. ISBN 9780596514808.

Digital Library

Cited By

Galante Gda Rosa Righi Rde Andrade C(2024)Extending parallel programming patterns with adaptability featuresCluster Computing10.1007/s10586-024-04622-027:9(12547-12568)Online publication date: 1-Dec-2024
https://dl.acm.org/doi/10.1007/s10586-024-04622-0
Niculescu V(2022)On Generalizing Divide and Conquer Parallel Programming PatternMathematics10.3390/math1021392510:21(3925)Online publication date: 23-Oct-2022
https://doi.org/10.3390/math10213925
Martínez MFraguela BCabaleiro J(2022)A highly optimized skeleton for unbalanced and deep divide-and-conquer algorithms on multi-core clustersThe Journal of Supercomputing10.1007/s11227-021-04259-578:8(10434-10454)Online publication date: 24-Jan-2022
https://doi.org/10.1007/s11227-021-04259-5
Show More Cited By

Index Terms

A divide-and-conquer parallel pattern implementation for multicores
1. Computing methodologies
  1. Concurrent computing methodologies
    1. Concurrent programming languages
2. Software and its engineering
  1. Software notations and tools
    1. General programming languages
      1. Language types
        Concurrent programming languages

Recommendations

Grid-enabled parallel divide-and-conquer: theory and practice
SAC '02: Proceedings of the 2002 ACM symposium on Applied computing

This paper presents a general methodology for the communication-efficient parallelization of graph algorithms using the divide-and-conquer approach. The algorithm is communication-free in the conquer stage and uses only a small amount of messages while ...
Divide-and-conquer programming on MIMD computers
IPPS '95: Proceedings of the 9th International Symposium on Parallel Processing

We have developed a programming template to implement divide and conquer algorithms on MIMD computers. The template is based on the parallel divide and conquer function of Z.G. Mou and P. Hudak (1988). We explore the programmability and performance of ...
A Parallel Divide and Conquer Algorithm for the Symmetric Eigenvalue Problem on Distributed Memory Architectures

We present a new parallel implementation of a divide and conquer algorithm for computing the spectral decomposition of a symmetric tridiagonal matrix on distributed memory architectures. The implementation we develop differs from other implementations in ...

Comments

Information & Contributors

Information

Published In

cover image ACM Conferences

SEPS 2016: Proceedings of the 3rd International Workshop on Software Engineering for Parallel Systems

October 2016

34 pages

ISBN:9781450346412

DOI:10.1145/3002125

Copyright © 2016 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGPLAN: ACM Special Interest Group on Programming Languages

In-Cooperation

SIGAda: ACM Special Interest Group on Ada Programming Language

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 21 October 2016

Permissions

Request permissions for this article.

Request Permissions

Check for updates

Author Tags

Qualifiers

Research-article

Conference

SPLASH '16

Sponsor:

SIGPLAN

SPLASH '16: Conference on Systems, Programming, Languages, and Applications: Software for Humanity

November 1, 2016

Amsterdam, Netherlands

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

9
Total Citations
View Citations
273
Total Downloads

Downloads (Last 12 months)6
Downloads (Last 6 weeks)0

Reflects downloads up to 18 Feb 2025

Other Metrics

View Author Metrics

Citations

Cited By

Galante Gda Rosa Righi Rde Andrade C(2024)Extending parallel programming patterns with adaptability featuresCluster Computing10.1007/s10586-024-04622-027:9(12547-12568)Online publication date: 1-Dec-2024
https://dl.acm.org/doi/10.1007/s10586-024-04622-0
Niculescu V(2022)On Generalizing Divide and Conquer Parallel Programming PatternMathematics10.3390/math1021392510:21(3925)Online publication date: 23-Oct-2022
https://doi.org/10.3390/math10213925
Martínez MFraguela BCabaleiro J(2022)A highly optimized skeleton for unbalanced and deep divide-and-conquer algorithms on multi-core clustersThe Journal of Supercomputing10.1007/s11227-021-04259-578:8(10434-10454)Online publication date: 24-Jan-2022
https://doi.org/10.1007/s11227-021-04259-5
Ben Jmaa YDuvivier DAbid M(2022)ARM vs FPGA: Comparative Analysis of Sorting AlgorithmsAdvanced Information Networking and Applications10.1007/978-3-030-99619-2_27(275-287)Online publication date: 31-Mar-2022
https://doi.org/10.1007/978-3-030-99619-2_27
Martínez MFraguela BCabaleiro J(2021)A Parallel Skeleton for Divide-and-conquer Unbalanced and Deep ProblemsInternational Journal of Parallel Programming10.1007/s10766-021-00709-yOnline publication date: 14-May-2021
https://doi.org/10.1007/s10766-021-00709-y
Niculescu VBufnea DSterca ASilimon R(2019)Multi-way Divide and Conquer Parallel Programming based on PLists2019 International Conference on Software, Telecommunications and Computer Networks (SoftCOM)10.23919/SOFTCOM.2019.8903794(1-6)Online publication date: Sep-2019
https://doi.org/10.23919/SOFTCOM.2019.8903794
Drocco MViviani PColonnelli IAldinucci MGrangetto M(2019)Accelerating Spectral Graph Analysis Through Wavefronts of Linear Algebra Operations2019 27th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP)10.1109/EMPDP.2019.8671640(9-16)Online publication date: Feb-2019
https://doi.org/10.1109/EMPDP.2019.8671640
Jmaa YAli KDuvivier DJemaa MAtitallah R(2017)An Efficient Hardware Implementation of TimSort and MergeSort Algorithms Using High Level Synthesis2017 International Conference on High Performance Computing & Simulation (HPCS)10.1109/HPCS.2017.92(580-587)Online publication date: Jul-2017
https://doi.org/10.1109/HPCS.2017.92
González CFraguela B(2017)A general and efficient divide-and-conquer algorithm framework for multi-core clustersCluster Computing10.1007/s10586-017-0766-y20:3(2605-2626)Online publication date: 1-Sep-2017
https://dl.acm.org/doi/10.1007/s10586-017-0766-y

View Options

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Figures

Tables

Media

View Table of Conten