skip to main content
10.1145/1274971.1275008acmconferencesArticle/Chapter ViewAbstractPublication PagesicsConference Proceedingsconference-collections
Article

Sensitivity analysis for automatic parallelization on multi-cores

Published: 17 June 2007 Publication History

Abstract

Sensitivity Analysis (SA) is a novel compiler technique that complements, and integrates with, static automatic parallelization analysis for the cases when relevant program behavior is input sensitive. In this paper we show how SA can extract all the input dependent, statically unavailable, conditions for which loops can be dynamically parallelized. SA generates a sequence of sufficient conditions which, when evaluated dynamically in order of their complexity, can each validate the dynamic parallel execution of the corresponding loop. For example, SA can first attempt to validate parallelization by checking simple conditions related to loop bounds. If such simple conditions cannot be met, then validating dynamic parallelization may require evaluating conditions related to the entire memory reference trace of a loop, thus decreasing the benefits of parallel execution.
We have implemented Sensitivity Analysis in the Polaris compiler and evaluated its performance using 22 industry standard benchmark codes running on two multicore systems. In most cases we have obtained speedups superior to the Intel Ifort compiler because with SA we could complement static analysis with minimum cost dynamic analysis and extract most of the available coarse grained parallelism.

References

[1]
G. Agrawal, J. H. Saltz, and R. Das. Interprocedural partial redundancy elimination and its application to distributed memory compilation. In Proc. of Prog. Lang. Design and Impl., pp. 258--269, La Jolla, CA, June 1995.
[2]
U. Banerjee. Dependence Analysis for Supercomputing. Norwell, Mass.: Kluwer Academic Publishers, 1988.
[3]
W. Blume, et. al. Advanced Program Restructuring for High-Performance Computers with Polaris. IEEE Computer, 29(12):78--82, 1996.
[4]
W. Blume and R. Eigenmann. The Range Test: A Dependence Test for Symbolic, Non-linear Expressions. In Proc. of Supercomputing '94, Washington DC, pp. 528--537, Nov. 1994.
[5]
B. Creusillet and F. Irigoin. Exact vs. approximate array region analyses. In Proc. of Workshop on Lang. and Comp. for Par. Computing, LNCS, vol. 1239, pp. 86--100, Springer 1996.
[6]
P. Feautrier. Parametric integer programming. Operations Research, 22(3):243--268, 1988.
[7]
P. Feautrier. Dataflow analysis of array and scalar references. Int. Journal of Par. Prog., 20(1):23--54, 1991.
[8]
G. Goff, K. Kennedy, and C.-W. Tseng. Practical dependence testing. In Proc. of Prog. Lang. Design and Impl., pp. 15--29, Toronto, ON, June 1991.
[9]
M. Gupta, S. Mukhopadhyay, and N. Sinha. Automatic parallelization of recursive procedures. Int. Journal of Par. Prog., 28(6):537--562, 2000.
[10]
M. R. Haghighat and C. D. Polychronopoulos. Symbolic analysis for parallelizing compilers. ACM TOPLAS, 18(4):477--518, 1996.
[11]
M. W. Hall, S. P. Amarasinghe, B. R. Murphy, S.-W. Liao, and M. S. Lam. Interprocedural parallelization analysis in SUIF. ACM TOPLAS., 27(4):662--731, 2005.
[12]
J. Hoeflinger. Interprocedural Parallelization Using Memory Classification Analysis. PhD thesis, Univ. of Illinois, Urbana-Champaign, Aug. 1998.
[13]
D. Kai Chen, J. Torrellas, and P.-C. Yew. An Efficient Algorithm for the Run-time Parallelization of DOACROSS Loops. In Proc. of Supercomputing '94, Washington DC, Nov. 1994.
[14]
X. Kong, D. Klappholz, and K. Psarris. The I test: An improved dependence test for automatic parallelization and vectorization. IEEE TPDS, 2(3):342--349, July 1991.
[15]
S. Leung and J. Zahorjan. Improving the performance of runtime parallelization. In Proc. of Principles and Practice of Par. Prog., pp. 83--91, ACM Press, May 1993.
[16]
Y. Lin and D. Padua. Analysis of irregular single-indexed array accesses and its application in compiler optimizations. In Proc. of Int. Conf. on Compiler Constr., pp. 202--218, LNCS vol. 1781, 2000.
[17]
D. E. Maydan, J. L. Hennessy, and M. S. Lam. Efficient and exact data dependence analysis. In Proc. of Prog. Lang. Design and Impl., pp. 1--14, Toronto, ON, June 1991.
[18]
S. Moon and M. W. Hall. Evaluation of predicated array data-flow analysis for automatic parallelization. In Proc. of Principles and Practice of Par. Prog., pp. 84--99, ACM Press, 1999.
[19]
S. Moon, M. W. Hall, and B. R. Murphy. Predicated array data-flow analysis for run-time parallelization. in Proc. of Int. Conf. on Supercomputing, Melbourne, Australia, pp. 212--219, July 1998.
[20]
S. Moon, B. So, M. W. Hall, and B. R. Murphy. A case for combining compile-time and run-time parallelization. In Proc. of Workshop on Lang., Comp, and Run-time Support for Scalable Systems, pp. 91--106, London, 1998.
[21]
Y. Paek, J. Hoeflinger, and D. Padua. Efficient and precise array access analysis. ACM TOPLAS, 24(1):65--109, 2002.
[22]
W. Pugh. The Omega test: A fast and practical integer programming algorithm for dependence analysis. In Proc. of Supercomputing '91, Albuquerque, NM, pp. 4--13, Nov. 1991.
[23]
W. Pugh and D. Wonnacott. An exact method for analysis of value-based array data dependences. In Proc. of Workshop on Lang. and Comp. for Par. Computing, in LNCS 768, pp. 546--566, Portland, OR, Aug. 1993.
[24]
W. Pugh and D. Wonnacott. Nonlinear array dependence analysis. In Proc. of Workshop on Lang., Comp. and Run-Time Support for Scalable Systems, Kluwer, Boston 1995.
[25]
C. G. Quiñones, et. al. Mitosis compiler: an infrastructure for speculative threading based on pre-computation slices. In Proc. of Prog. Lang. Design and Impl., pp. 269--279, New York, NY, 2005.
[26]
L. Rauchwerger and D. A. Padua. The LRPD Test: Speculative Run-Time Parallelization of Loops with Privatization and Reduction Parallelization. In Proc. of Prog. Lang. Design and Impl., pp. 218--232, La Jolla, CA, June 1995.
[27]
S. Rus, J. Hoeflinger, and L. Rauchwerger. Hybrid analysis: static & dynamic memory reference analysis. Int. Journal of Par. Prog., 31(3):251--283, 2003.
[28]
S. Rus, D. Zhang, and L. Rauchwerger. The value evolution graph and its use in memory reference analysis. In Proc. of Par. Arch. and Comp. Techniques, pp. 243--254, Antibes, France, Oct. 2004.
[29]
J. H. Saltz, R. Mirchandaney, and K. Crowley. Run-time parallelization and scheduling of loops. IEEE Trans. on Computers, 40(5):603--612, May 1991.
[30]
R. van Engelen, J. Birch, Y. Shou, B. Walsh, and K. Gallivan. A unified framework for nonlinear dependence testing and symbolic analysis. In Proc. of Int. Conf. on Supercomp., pp. 106--115, ACM Press, 2004.
[31]
M. Wolfe and C.-W. Tseng. The Power test for data dependence. IEEE TPDS, 3(5):591--601, Sept. 1992.
[32]
M. J. Wolfe. High Performance Compilers for Parallel Computing. Addison-Wesley Longman Publishing, Inc., Boston, MA, 1995.
[33]
P. Wu, A. Cohen, and D. Padua. Induction variable analysis without idiom recognition: Beyond monotonicity. In Proc. of Workshop on Lang. and Comp. for Par. Computing, pp. 427--441, Cumberland Falls, KY, 2001, LNCS 2624.
[34]
H. Yu and L. Rauchwerger. Run-time parallelization overhead reduction techniques. In Proc. of Int. Conf. on Compiler Construction, Berlin, Germany, pp. 232--248, Springer LNCS 1781, March 2000.

Cited By

View all
  • (2022)Memory Optimizations in an Array LanguageSC22: International Conference for High Performance Computing, Networking, Storage and Analysis10.1109/SC41404.2022.00036(1-15)Online publication date: Nov-2022
  • (2021)Practical Examples of Automated Development of Efficient Parallel ProgramsFormal and Adaptive Methods for Automation of Parallel Programs Construction10.4018/978-1-5225-9384-3.ch006(180-216)Online publication date: 2021
  • (2020)SCAF: a speculation-aware collaborative dependence analysis frameworkProceedings of the 41st ACM SIGPLAN Conference on Programming Language Design and Implementation10.1145/3385412.3386028(638-654)Online publication date: 11-Jun-2020
  • Show More Cited By

Index Terms

  1. Sensitivity analysis for automatic parallelization on multi-cores

    Recommendations

    Comments

    Information & Contributors

    Information

    Published In

    cover image ACM Conferences
    ICS '07: Proceedings of the 21st annual international conference on Supercomputing
    June 2007
    315 pages
    ISBN:9781595937681
    DOI:10.1145/1274971
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Sponsors

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 17 June 2007

    Permissions

    Request permissions for this article.

    Check for updates

    Author Tags

    1. multicore
    2. parallelism
    3. sensitivity analysis

    Qualifiers

    • Article

    Conference

    ICS07
    Sponsor:
    ICS07: International Conference on Supercomputing
    June 17 - 21, 2007
    Washington, Seattle

    Acceptance Rates

    Overall Acceptance Rate 629 of 2,180 submissions, 29%

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)4
    • Downloads (Last 6 weeks)1
    Reflects downloads up to 15 Feb 2025

    Other Metrics

    Citations

    Cited By

    View all
    • (2022)Memory Optimizations in an Array LanguageSC22: International Conference for High Performance Computing, Networking, Storage and Analysis10.1109/SC41404.2022.00036(1-15)Online publication date: Nov-2022
    • (2021)Practical Examples of Automated Development of Efficient Parallel ProgramsFormal and Adaptive Methods for Automation of Parallel Programs Construction10.4018/978-1-5225-9384-3.ch006(180-216)Online publication date: 2021
    • (2020)SCAF: a speculation-aware collaborative dependence analysis frameworkProceedings of the 41st ACM SIGPLAN Conference on Programming Language Design and Implementation10.1145/3385412.3386028(638-654)Online publication date: 11-Jun-2020
    • (2019)Approaching 3/2 for the s-t-path TSPJournal of the ACM10.1145/330971566:2(1-17)Online publication date: 7-Mar-2019
    • (2019)Going Higher in First-Order Quantifier Alternation Hierarchies on WordsJournal of the ACM10.1145/330399166:2(1-65)Online publication date: 13-Mar-2019
    • (2018)Algebra-Algorithmic Models and Methods of Parallel Programing10.15407/akademperiodyka.367.192Online publication date: 2018
    • (2016)A Control-Theoretic Analysis of Low-Priority Congestion Control Reprioritization under AQMACM Transactions on Modeling and Performance Evaluation of Computing Systems10.1145/29346521:4(1-33)Online publication date: 2-Aug-2016
    • (2016)PEASACM Transactions on Modeling and Performance Evaluation of Computing Systems10.1145/29306591:4(1-31)Online publication date: 2-Aug-2016
    • (2015)AutoMO: automatic inference of memory order parameters for C/C++11ACM SIGPLAN Notices10.1145/2858965.281428650:10(221-240)Online publication date: 23-Oct-2015
    • (2015)Distributed memory code generation for mixed Irregular/Regular computationsACM SIGPLAN Notices10.1145/2858788.268851550:8(65-75)Online publication date: 24-Jan-2015
    • Show More Cited By

    View Options

    Login options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Figures

    Tables

    Media

    Share

    Share

    Share this Publication link

    Share on social media