skip to main content
research-article

Sub-polyhedral scheduling using (unit-)two-variable-per-inequality polyhedra

Published: 23 January 2013 Publication History

Abstract

Polyhedral compilation has been successful in the design and implementation of complex loop nest optimizers and parallelizing compilers. The algorithmic complexity and scalability limitations remain one important weakness. We address it using sub-polyhedral under-aproximations of the systems of constraints resulting from affine scheduling problems. We propose a sub-polyhedral scheduling technique using (Unit-)Two-Variable-Per-Inequality or (U)TVPI Polyhedra. This technique relies on simple polynomial time algorithms to under-approximate a general polyhedron into (U)TVPI polyhedra. We modify the state-of-the-art PLuTo compiler using our scheduling technique, and show that for a majority of the Polybench (2.0) kernels, the above under-approximations yield polyhedra that are non-empty. Solving the under-approximated system leads to asymptotic gains in complexity, and shows practically significant improvements when compared to a traditional LP solver. We also verify that code generated by our sub-polyhedral parallelization prototype matches the performance of PLuTo-optimized code when the under-approximation preserves feasibility.

Supplementary Material

JPG File (r1d3_talk9.jpg)
MP4 File (r1d3_talk9.mp4)

References

[1]
R. K. Ahuja, T. L. Magnanti, and J. B. Orlin. Network flows: theory, algorithms, and applications. Prentice-Hall, Inc., NJ, USA, 1993.
[2]
R. Allen and K. Kennedy. Automatic translation of fortran programs to vector form. ACM Trans. Program. Lang. Syst., 9(4):491--542, Oct. 1987.
[3]
A. Andersson and S. Nilsson. Implementing radixsort. J. Exp. Algorithmics, 3, Sept. 1998.
[4]
B. Aspvall and Y. Shiloach. A polynomial time algorithm for solving systems of linear inequalities with two variables per inequality. SIAM J. Comput., 9(4):827--845, 1980.
[5]
R. Bagnara, P. M. Hill, and E. Zaffanella. The Parma Polyhedra Library: Toward a complete set of numerical abstractions for the analysis and verification of hardware and software systems. Science of Computer Programming, 72(1--2):3--21, 2008.
[6]
R. Bagnara, P. M. Hill, and E. Zaffanella. Weakly-relational shapes for numeric abstractions: improved algorithms and proofs of correctness. Formal Methods in System Design, 35(3):279--323, 2009.
[7]
V. Balasundaram and K. Kennedy. A technique for summarizing data access and its use in parallelism enhancing transformations. In PLDI, pages 41--53, 1989.
[8]
U. Banerjee. Loop Transformations for Restructuring Compilers: The Foundations. Kluwer Academic Publishers, Boston, 1992.
[9]
R. Bellman. On a routing problem. Quarterly of Applied Mathematics, 16:87--90, 1958.
[10]
M.-W. Benabderrahmane, L.-N. Pouchet, A. Cohen, and C. Bastoul. The polyhedral model is more widely applicable than you think. In Proceedings of the International Conference on Compiler Construction (ETAPS CC'10), number 6011 in LNCS, Paphos, Cyprus, Mar. 2010. Springer-Verlag.
[11]
R. E. Bixby. Solving real-world linear programs: A decade and more of progress. Oper. Res., 50(1):3--15, Jan. 2002.
[12]
B. Blanchet, P. Cousot, R. Cousot, J. Feret, L. Mauborgne, A. Miné, D. Monniaux, and X. Rival. A static analyzer for large safety-critical software. In PLDI, pages 196--207. ACM, 2003.
[13]
U. Bondhugula, A. Hartono, J. Ramanujam, and P. Sadayappan. A practical automatic polyhedral parallelizer and locality optimizer. In PLDI, pages 101--113, 2008.
[14]
P.-Y. Calland, A. Darte, and Y. Robert. Circuit retiming applied to decomposed software pipelining. IEEE Trans. Parallel Distrib. Syst., 9(1):24--35, Jan. 1998.
[15]
B. V. Cherkassky, L. Georgiadis, A. V. Goldberg, R. E. Tarjan, and R. F. Werneck. Shortest-path feasibility algorithms: An experimental evaluation. J. Exp. Algorithmics, 14:7:2.7--7:2.37, Jan. 2010.
[16]
E. Cohen and N. Megiddo. Improved algorithms for linear inequalities with two variables per inequality. SIAM J. Comput., 23:1313--1350, December 1994.
[17]
P. Cousot, R. Cousot, J. Feret, L. Mauborgne, A. Miné, and X. Rival. Why does Astrée scale up? Formal Methods in System Design, 35(3):229--264, Dec 2009.
[18]
G. B. Dantzig. Linear Programming and Extensions. Princeton University Press, Princeton, NJ, 1963.
[19]
A. Darte and G. Huard. Loop shifting for loop compaction. Int. J. Parallel Program., 28(5):499--534, Oct. 2000.
[20]
A. Darte and G. Huard. Complexity of multi-dimensional loop alignment. In Proceedings of the 19th Annual Symposium on Theoretical Aspects of Computer Science, STACS '02, pages 179--191, London, UK, UK, 2002. Springer-Verlag.
[21]
A. Darte, Y. Robert, and F. Vivien. Scheduling and Automatic Parallelization. Birkhaüser, 2000.
[22]
A. Darte and F. Vivien. Optimal fine and medium grain parallelism detection in polyhedral reduced dependence graphs. Int. J. Parallel Program., 25:447--496, December 1997.
[23]
H. Edelsbrunner, G. Rote, and E. Welzl. Testing the necklace condition for shortest tours and optimal factors in the plane. Theor. Comput. Sci., 66(2):157--180, 1989.
[24]
P. Feautrier. Parametric integer programming. RAIRO Recherche Opérationnelle, 22(3):243--268, 1988. http://www.piplib.org/.
[25]
P. Feautrier. Some efficient solutions to the affine scheduling problem: I. one-dimensional time. IJPP, 21:313--348, October 1992.
[26]
P. Feautrier. Some efficient solutions to the affine scheduling problem: Part ii: Multidimensional time. Int. J. Parallel Program., 21:389--420, December 1992.
[27]
P. Feautrier. Scalable and structured scheduling. International Journal of Parallel Programming, 34(5):459--487, 2006.
[28]
L. R. Ford, Jr. and D. R. Fulkerson. Flows in Networks. Princeton University Press, 1962.
[29]
M. Griebl, P. Feautrier, and A. Größlinger. Forward communication only placements and their use for parallel program construction. In W. Pugh and C.-W. Tseng, editors, LCPC, volume 2481 of Lecture Notes in Computer Science, pages 16--30. Springer, 2002.
[30]
T. Grosser, H. Zheng, A. Raghesh, A. Simbürger, A. Größlinger, and L.-N. Pouchet. Polly - Polyhedral Optimization in LLVM. In IMPACT 2011, in conjunction with CGO 2011, Chamonix, France, Apr 2011.
[31]
N. Halbwachs, D. Merchat, and L. Gonnord. Some ways to reduce the space dimension in polyhedra computations. Form. Methods Syst. Des., 29:79--95, July 2006.
[32]
D. S. Hochbaum and J. Naor. Simple and fast algorithms for linear and integer programs with two variables per inequality. SIAM J. Comput., 23(6):1179--1192, 1994.
[33]
J. Jaffar, M. J. Maher, P. J. Stuckey, and R. H. C. Yap. Beyond finite domains. In A. Borning, editor, PPCP, volume 874 of Lecture Notes in Computer Science, pages 86--94. Springer, 1994.
[34]
B. Jeannet and A. Miné. Apron: A library of numerical abstract domains for static analysis. In CAV, pages 661--667, 2009.
[35]
J. C. Lagarias. The computational complexity of simultaneous diophantine approximation problems. SIAM J. Comput., 14(1):196--209, 1985.
[36]
A. W. Lim and M. S. Lam. Communication-free parallelization via affine transformations. In POPL, pages 201--214, Paris, France, jan 1997.
[37]
D. E. Maydan, J. L. Hennessy, and M. S. Lam. Efficient and exact data dependence analysis. In Proceedings of the ACM SIGPLAN 1991 conference on Programming language design and implementation, PLDI '91, pages 1--14, New York, NY, USA, 1991. ACM.
[38]
A. Miné. The octagon abstract domain. Higher-Order and Symbolic Computation, 19(1):31--100, 2006.
[39]
L.-N. Pouchet, U. Bondhugula, C. Bastoul, A. Cohen, J. Ramanujam, P. Sadayappan, and N. Vasilache. Loop transformations: convexity, pruning and optimization. In POPL, pages 549--562, 2011.
[40]
L.-N. Pouchet, U. Bondhugula, et al. The polybench benchmarks. http://www.cse.ohio-state.edu/ pouchet/software/polybench.
[41]
V. R. Pratt. Two easy theories whose combination is hard. Technical report, Massachusetts Institute of Technology, Cambridge, Mass, 1977. http://boole.stanford.edu/pub/sefnp.pdf.
[42]
W. Pugh. A practical algorithm for exact array dependence analysis. Commun. ACM, 35(8):102--114, Aug. 1992.
[43]
F. Santos. A counterexample to the hirsch conjecture. Annals of Mathematics, 176:383--412, 2012.
[44]
A. Schrijver. Theory of linear and integer programming. John Wiley & Sons, Inc., New York, NY, USA, 1986.
[45]
S. A. Seshia, K. Subramani, and R. E. Bryant. On solving boolean combinations of UTVPI constraints. Journal on Satisfiability, Boolean Modeling and Computation (JSAT), 3(1--2):67--90, 2007.
[46]
R. Shostak. Deciding linear inequalities by computing loop residues. J. ACM, 28:769--779, October 1981.
[47]
A. Simon and A. King. The two variable per inequality abstract domain. Higher Order Symbol. Comput., 23(1):87--143, Mar. 2010.
[48]
D. A. Spielman and S.-H. Teng. Smoothed analysis of algorithms: Why the simplex algorithm usually takes polynomial time. J. ACM, 51:385--463, May 2004.
[49]
A. Tarski. A decision method for elementary algebra and geometry. Univ. of California Press, Berkeley, 2nd edition, 1951.
[50]
M. J. Todd. The many facets of linear programming. Mathematical Programming, 91:417--436, 2002.
[51]
K. Trifunovic, A. Cohen, D. Edelsohn, F. Li, T. Grosser, H. Jagasia, R. Ladelsky, S. Pop, J. Sjödin, and R. Upadrasta. Graphite two years after: First lessons learned from real-world polyhedral compilation. In GCC Research Opportunities Workshop (GROW'10), Pisa, Italy, Jan. 2010.
[52]
R. Upadrasta. Scalability Challenges in the Polyhedral Model: An Algorithmic Approach using (Unit-)Two-variable Per Inequality Sub-Polyhedra. PhD thesis, Université Paris-Sud (11), Orsay, France, January 2013.
[53]
R. Upadrasta and A. Cohen. Potential and Challenges of Two-Variable-Per-Inequality Sub-Polyhedral Compilation. In First International Workshop on Polyhedral Compilation Techniques (IMPACT'11), in conjunction with CGO'11, Chamonix, France, Apr. 2011.
[54]
R. Upadrasta and A. Cohen. A Case for Strongly Polynomial Time Sub-Polyhedral Scheduling Using Two-Variable-Per-Inequality Polyhedra. In Second International Workshop on Polyhedral Compilation Techniques (IMPACT'12), in conjunction with HiPEAC'12, Paris, France, Jan. 2012.
[55]
N. Vasilache. Scalable Program Optimization Techniques In The Polyhedral Model. PhD thesis, Paris-Sud 11 University, Sept. 2007.
[56]
F. Vivien. On the optimality of feautrier's scheduling algorithm. Concurrency and Computation: Practice and Experience, 15(11--12):1047--1068, 2003.
[57]
F. Vivien and N. Wicker. Minimal enclosing parallelepiped in 3d. Comput. Geom. Theory Appl., 29:177--190, November 2004.
[58]
K. D. Wayne. A polynomial combinatorial algorithm for generalized minimum cost flow. In Proceedings of the thirty-first annual ACM symposium on Theory of computing, STOC '99, pages 11--18, New York, NY, USA, 1999. ACM.
[59]
M. E. Wolf and M. S. Lam. A loop transformation theory and an algorithm to maximize parallelism. IEEE Trans. Parallel Distrib. Syst., 2(4):452--471, Oct. 1991.
[60]
Y.-Q. Yang, C. Ancourt, and F. Irigoin. Minimal data dependence abstractions for loop transformations. In K. Pingali, U. Banerjee, D. Gelernter, A. Nicolau, and D. A. Padua, editors, LCPC, volume 892 of Lecture Notes in Computer Science, pages 201--216. Springer, 1994.
[61]
G. Ziegler. Lectures on polytopes. Graduate texts in mathematics. Springer Science, 2006.

Cited By

View all
  • (2020)LLOVACM Transactions on Architecture and Code Optimization10.1145/341859717:4(1-26)Online publication date: 22-Dec-2020
  • (2018)Automatic runtime calculation of communications for data‐parallel expressions with periodic conditionsConcurrency and Computation: Practice and Experience10.1002/cpe.443031:5Online publication date: 31-Jan-2018
  • (2017)Full runtime polyhedral optimizing loop transformations with the generation, instantiation, and scheduling of code-bonesConcurrency and Computation: Practice and Experience10.1002/cpe.419229:15(e4192)Online publication date: 9-Jun-2017
  • Show More Cited By

Index Terms

  1. Sub-polyhedral scheduling using (unit-)two-variable-per-inequality polyhedra

    Recommendations

    Comments

    Information & Contributors

    Information

    Published In

    cover image ACM SIGPLAN Notices
    ACM SIGPLAN Notices  Volume 48, Issue 1
    POPL '13
    January 2013
    561 pages
    ISSN:0362-1340
    EISSN:1558-1160
    DOI:10.1145/2480359
    Issue’s Table of Contents
    • cover image ACM Conferences
      POPL '13: Proceedings of the 40th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages
      January 2013
      586 pages
      ISBN:9781450318327
      DOI:10.1145/2429069
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 23 January 2013
    Published in SIGPLAN Volume 48, Issue 1

    Check for updates

    Author Tags

    1. affine scheduling
    2. approximation algorithms
    3. compiler optimizations
    4. loop transformations
    5. optimization
    6. parallelism

    Qualifiers

    • Research-article

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)17
    • Downloads (Last 6 weeks)0
    Reflects downloads up to 19 Feb 2025

    Other Metrics

    Citations

    Cited By

    View all
    • (2020)LLOVACM Transactions on Architecture and Code Optimization10.1145/341859717:4(1-26)Online publication date: 22-Dec-2020
    • (2018)Automatic runtime calculation of communications for data‐parallel expressions with periodic conditionsConcurrency and Computation: Practice and Experience10.1002/cpe.443031:5Online publication date: 31-Jan-2018
    • (2017)Full runtime polyhedral optimizing loop transformations with the generation, instantiation, and scheduling of code-bonesConcurrency and Computation: Practice and Experience10.1002/cpe.419229:15(e4192)Online publication date: 9-Jun-2017
    • (2015)Exact and Approximated Data-Reuse Optimizations for Tiling with Parametric SizesCompiler Construction10.1007/978-3-662-46663-6_8(151-170)Online publication date: 2015
    • (2014)Switchable Scheduling for Runtime Adaptation of OptimizationEuro-Par 2014 Parallel Processing10.1007/978-3-319-09873-9_19(222-233)Online publication date: 2014
    • (2024)Modeling the Interplay between Loop Tiling and Fusion in Optimizing Compilers Using Affine RelationsACM Transactions on Computer Systems10.1145/363530541:1-4(1-45)Online publication date: 15-Jan-2024
    • (2024)Neighborhood persistency of the linear optimization relaxation of integer linear optimizationMathematical Programming10.1007/s10107-024-02174-0Online publication date: 18-Dec-2024
    • (2024)Strided Difference Bound MatricesComputer Aided Verification10.1007/978-3-031-65627-9_14(279-302)Online publication date: 24-Jul-2024
    • (2022) BullsEye : Scalable and Accurate Approximation Framework for Cache Miss CalculationACM Transactions on Architecture and Code Optimization10.1145/355800320:1(1-28)Online publication date: 17-Nov-2022
    • (2022)Neighborhood Persistency of the Linear Optimization Relaxation of Integer Linear OptimizationCombinatorial Optimization10.1007/978-3-031-18530-4_23(312-323)Online publication date: 21-Nov-2022
    • Show More Cited By

    View Options

    Login options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Figures

    Tables

    Media

    Share

    Share

    Share this Publication link

    Share on social media