skip to main content
10.1145/2429069.2429128acmconferencesArticle/Chapter ViewAbstractPublication PagespoplConference Proceedingsconference-collections
research-article

Optimizing data structures in high-level programs: new directions for extensible compilers based on staging

Published: 23 January 2013 Publication History

Abstract

High level data structures are a cornerstone of modern programming and at the same time stand in the way of compiler optimizations. In order to reason about user- or library-defined data structures compilers need to be extensible. Common mechanisms to extend compilers fall into two categories. Frontend macros, staging or partial evaluation systems can be used to programmatically remove abstraction and specialize programs before they enter the compiler. Alternatively, some compilers allow extending the internal workings by adding new transformation passes at different points in the compile chain or adding new intermediate representation (IR) types. None of these mechanisms alone is sufficient to handle the challenges posed by high level data structures. This paper shows a novel way to combine them to yield benefits that are greater than the sum of the parts.
Instead of using staging merely as a front end, we implement internal compiler passes using staging as well. These internal passes delegate back to program execution to construct the transformed IR. Staging is known to simplify program generation, and in the same way it can simplify program transformation. Defining a transformation as a staged IR interpreter is simpler than implementing a low-level IR to IR transformer. With custom IR nodes, many optimizations that are expressed as rewritings from IR nodes to staged program fragments can be combined into a single pass, mitigating phase ordering problems. Speculative rewriting can preserve optimistic assumptions around loops.
We demonstrate several powerful program optimizations using this architecture that are particularly geared towards data structures: a novel loop fusion and deforestation algorithm, array of struct to struct of array conversion, object flattening and code generation for heterogeneous parallel devices. We validate our approach using several non trivial case studies that exhibit order of magnitude speedups in experiments.

Supplementary Material

JPG File (r1d3_talk10.jpg)
MP4 File (r1d3_talk10.mp4)

References

[1]
S. Ackermann, V. Jovanovic, T. Rompf, and M. Odersky. Jet: An embedded dsl for high performance big data processing. BigData, 2012. http://infoscience.epfl.ch/record/181673/files/paper.pdf.
[2]
M. S. Ager, O. Danvy, and H. K. Rohde. Fast partial evaluation of pattern matching in strings. ACM Trans. Program. Lang. Syst., 28 (4): 696--714, 2006.
[3]
J. Auerbach, D. F. Bacon, P. Cheng, and R. Rabbah. Lime: a java-compatible and synthesizable language for heterogeneous architectures. OOPSLA, 2010.
[4]
M. Bravenboer, A. van Dam, K. Olmos, and E. Visser. Program transformation with scoped dynamic rewrite rules. Fundam. Inf., 69: 123--178, July 2005.
[5]
K. J. Brown, A. K. Sujeeth, H. Lee, T. Rompf, H. Chafi, M. Odersky, and K. Olukotun. A heterogeneous parallel framework for domain-specific languages. PACT, 2011.
[6]
J. A. Brzozowski. Derivatives of regular expressions. J. ACM, 11 (4): 481--494, 1964.
[7]
C. Calcagno, W. Taha, L. Huang, and X. Leroy. Implementing multi-stage languages using asts, gensym, and reflection. In GPCE, 2003.
[8]
J. Carette, O. Kiselyov, and C. chieh Shan. Finally tagless, partially evaluated: Tagless staged interpreters for simpler typed languages. J. Funct. Program., 19 (5): 509--543, 2009.
[9]
C. Click and K. D. Cooper. Combining analyses, combining optimizations. ACM Trans. Program. Lang. Syst., 17: 181--196, March 1995.
[10]
C. Consel and O. Danvy. Partial evaluation of pattern matching in strings. Inf. Process. Lett., 30 (2): 79--86, 1989.
[11]
W. R. Cook, B. Delaware, T. Finsterbusch, A. Ibrahim, and B. Wiedermann. Model transformation by partial evaluation of model interpreters. Technical Report TR-09-09, UT Austin Department of Computer Science, 2008.
[12]
D. Coutts, R. Leshchinskiy, and D. Stewart. Stream fusion: from lists to streams to nothing at all. In ICFP, 2007.
[13]
T. Ekman and G. Hedin. The jastadd system - modular extensible compiler construction. Sci. Comput. Program., 69 (1--3): 14--26, 2007.
[14]
C. Elliott, S. Finne, and O. de Moor. Compiling embedded languages. In W. Taha, editor, phSemantics, Applications, and Implementation of Program Generation, volume 1924 of Lecture Notes in Computer Science, pages 9--26. Springer Berlin / Heidelberg, 2000.
[15]
Y. Futamura. Partial evaluation of computation process - an approach to a compiler-compiler. Higher-Order and Symbolic Computation, 12 (4): 381--391, 1999.
[16]
C. Grelck, K. Hinckfuß, and S.-B. Scholz. With-loop fusion for data locality and parallelism. IFL, 2006.
[17]
D. M. Groenewegen, Z. Hemel, L. C. L. Kats, and E. Visser. WebDSL: a domain-specific language for dynamic web applications. In OOPSLA Companion, 2008.
[18]
C. Hofer, K. Ostermann, T. Rendel, and A. Moors. Polymorphic embedding of DSLs. GPCE, 2008.
[19]
JetBrains. Meta Programming System, 2009. URL http://www.jetbrains.com/mps/.
[20]
N. D. Jones, C. K. Gomard, and P. Sestoft. Partial evaluation and automatic program generation. Prentice-Hall, Inc., Upper Saddle River, NJ, USA, 1993.
[21]
S. L. P. Jones, R. Leshchinskiy, G. Keller, and M. M. T. Chakravarty. Harnessing the Multicores: Nested Data Parallelism in Haskell. In FSTTCS, 2008.
[22]
S. P. Jones, A. Tolmach, and T. Hoare. Playing by the rules: rewriting as a practical optimisation technique in ghc. Haskell, 2001.
[23]
S. Karmesin, J. Crotinger, J. Cummings, S. Haney, W. Humphrey, J. Reynders, S. Smith, and T. J. Williams. Array design and expression evaluation in pooma ii. In ISCOPE, 1998.
[24]
L. C. L. Kats and E. Visser. The Spoofax language workbench. rules for declarative specification of languages and IDEs. In SPLASH/OOPSLA Companion, 2010.
[25]
R. Kelsey and P. Hudak. Realistic compilation by program transformation. In POPL, 1989.
[26]
K. Kennedy, B. Broom, A. Chauhan, R. Fowler, J. Garvin, C. Koelbel, C. McCosh, and J. Mellor-Crummey. Telescoping languages: A system for automatic generation of domain languages. Proceedings of the IEEE, 93 (3): 387--408, 2005.
[27]
G. Kossakowski, N. Amin, T. Rompf, and M. Odersky. Javascript as an embedded dsl. In ECOOP, 2012.
[28]
H. Lee, K. J. Brown, A. K. Sujeeth, H. Chafi, T. Rompf, M. Odersky, and K. Olukotun. Implementing domain-specific languages for heterogeneous parallel computing. IEEE Micro, 31 (5): 42--53, 2011.
[29]
S. Lerner, D. Grove, and C. Chambers. Composing dataflow analyses and transformations. SIGPLAN Not., 37: 270--282, January 2002.
[30]
S. Lerner, T. D. Millstein, and C. Chambers. Automatically proving the correctness of compiler optimizations. In PLDI, 2003.
[31]
A. Møller. dk.brics.automaton -- finite-state automata and regular expressions for Java, 2010.texttthttp://www.brics.dk/automaton/.
[32]
A. Moors, T. Rompf, P. Haller, and M. Odersky. Scala-virtualized. PEPM, 2012.
[33]
N. Nystrom, M. R. Clarkson, and A. C. Myers. Polyglot: An extensible compiler framework for java. In CC, 2003.
[34]
N. Nystrom, D. White, and K. Das. Firepile: run-time compilation for gpus in scala. GPCE, 2011.
[35]
S. Owens, J. Reppy, and A. Turon. Regular-expression derivatives re-examined. J. Funct. Program., 19 (2): 173--190, Mar. 2009.
[36]
D. J. Quinlan, M. Schordan, Q. Yi, and A. Sæbjørnsen. Classification and utilization of abstractions for optimization. In ISoLA (Preliminary proceedings), 2004.
[37]
T. Rompf. phLightweight Modular Staging and Embedded Compilers: Abstraction Without Regret for High-Level High-Performance Programming. PhD thesis, EPFL, 2012.
[38]
T. Rompf and M. Odersky. Lightweight modular staging: a pragmatic approach to runtime code generation and compiled dsls. GPCE, 2010.
[39]
T. Rompf and M. Odersky. Lightweight modular staging: a pragmatic approach to runtime code generation and compiled dsls. Commun. ACM, 55 (6): 121--130, 2012.
[40]
T. Rompf, I. Maier, and M. Odersky. Implementing first-class polymorphic delimited continuations by a type-directed selective cps-transform. In ICFP, 2009.
[41]
T. Rompf, A. K. Sujeeth, H. Lee, K. J. Brown, H. Chafi, M. Odersky, and K. Olukotun. Building-blocks for performance oriented dsls. DSL, 2011.
[42]
A. Shali and W. R. Cook. Hybrid partial evaluation. OOPSLA, 2011.
[43]
M. Sperber and P. Thiemann. Realistic compilation by partial evaluation. In PLDI, 1996.
[44]
A. K. Sujeeth, H. Lee, K. J. Brown, T. Rompf, M. Wu, A. R. Atreya, M. Odersky, and K. Olukotun. OptiML: an implicitly parallel domain-specific language for machine learning. ICML, 2011.
[45]
E. Sumii and N. Kobayashi. A hybrid approach to online and offline partial evaluation. Higher-Order and Symbolic Computation, 14 (2--3): 101--142, 2001.
[46]
W. Taha and T. Sheard. Metaml and multi-stage programming with explicit annotations. Theor. Comput. Sci., 248 (1--2): 211--242, 2000.
[47]
R. Tate, M. Stepp, Z. Tatlock, and S. Lerner. Equality saturation: a new approach to optimization. In POPL, 2009.
[48]
R. Tate, M. Stepp, and S. Lerner. Generating compiler optimizations from proofs. In POPL, 2010.
[49]
P. Thiemann and D. Dussart. Partial evaluation for higher-order languages with state. Technical report, 1999. URL http://www.informatik.uni-freiburg.de/ thiemann/papers/mlpe.ps.gz.
[50]
S. Tobin-Hochstadt, V. St-Amour, R. Culpepper, M. Flatt, and M. Felleisen. Languages as libraries. PLDI'11, 2011.
[51]
T. L. Veldhuizen. Expression templates, C++gems. SIGS Publications, Inc., New York, NY, 1996.
[52]
T. L. Veldhuizen. Arrays in blitz. In ISCOPE, 1998.
[53]
T. L. Veldhuizen and J. G. Siek. Combining optimizations, combining theories. Technical report, Indiana University, 2008.
[54]
P. Wadler. Deforestation: Transforming programs to eliminate trees. Theor. Comput. Sci., 73 (2): 231--248, 1990.
[55]
P. Wadler and S. Blott. How to make ad-hoc polymorphism less ad-hoc. In POPL, 1989.

Cited By

View all
  • (2024)Specializing Data Access in a Distributed File System (Generative Pearl)Proceedings of the 23rd ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences10.1145/3689484.3690736(44-52)Online publication date: 21-Oct-2024
  • (2023)Multi-Stage Vertex-Centric Programming for Agent-Based SimulationsProceedings of the 22nd ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences10.1145/3624007.3624057(100-112)Online publication date: 22-Oct-2023
  • (2023)Graph IRs for Impure Higher-Order Languages: Making Aggressive Optimizations Affordable with Precise Effect DependenciesProceedings of the ACM on Programming Languages10.1145/36228137:OOPSLA2(400-430)Online publication date: 16-Oct-2023
  • Show More Cited By

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Conferences
POPL '13: Proceedings of the 40th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages
January 2013
586 pages
ISBN:9781450318327
DOI:10.1145/2429069
  • cover image ACM SIGPLAN Notices
    ACM SIGPLAN Notices  Volume 48, Issue 1
    POPL '13
    January 2013
    561 pages
    ISSN:0362-1340
    EISSN:1558-1160
    DOI:10.1145/2480359
    Issue’s Table of Contents
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

In-Cooperation

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 23 January 2013

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. code generation
  2. data structures
  3. extensible compilers
  4. staging

Qualifiers

  • Research-article

Conference

POPL '13
Sponsor:

Acceptance Rates

Overall Acceptance Rate 860 of 4,328 submissions, 20%

Upcoming Conference

POPL '26

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)45
  • Downloads (Last 6 weeks)6
Reflects downloads up to 19 Feb 2025

Other Metrics

Citations

Cited By

View all
  • (2024)Specializing Data Access in a Distributed File System (Generative Pearl)Proceedings of the 23rd ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences10.1145/3689484.3690736(44-52)Online publication date: 21-Oct-2024
  • (2023)Multi-Stage Vertex-Centric Programming for Agent-Based SimulationsProceedings of the 22nd ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences10.1145/3624007.3624057(100-112)Online publication date: 22-Oct-2023
  • (2023)Graph IRs for Impure Higher-Order Languages: Making Aggressive Optimizations Affordable with Precise Effect DependenciesProceedings of the ACM on Programming Languages10.1145/36228137:OOPSLA2(400-430)Online publication date: 16-Oct-2023
  • (2021)Reasoning about recursive tree traversalsProceedings of the 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming10.1145/3437801.3441617(47-61)Online publication date: 17-Feb-2021
  • (2020)Multi-stage programming in the large with staged classesProceedings of the 19th ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences10.1145/3425898.3426961(35-49)Online publication date: 16-Nov-2020
  • (2019)Precise reasoning with structured time, structured heaps, and collective operationsProceedings of the ACM on Programming Languages10.1145/33605833:OOPSLA(1-30)Online publication date: 10-Oct-2019
  • (2019)Compiler generation for performance-oriented embedded DSLs (short paper)Proceedings of the 18th ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences10.1145/3357765.3359520(94-101)Online publication date: 21-Oct-2019
  • (2019)Efficient differentiable programming in a functional array-processing languageProceedings of the ACM on Programming Languages10.1145/33417013:ICFP(1-30)Online publication date: 26-Jul-2019
  • (2019)Demystifying differentiable programming: shift/reset the penultimate backpropagatorProceedings of the ACM on Programming Languages10.1145/33417003:ICFP(1-31)Online publication date: 26-Jul-2019
  • (2019)Lambda calculus with algebraic simplification for reduction parallelization by equational reasoningProceedings of the ACM on Programming Languages10.1145/33416443:ICFP(1-25)Online publication date: 26-Jul-2019
  • Show More Cited By

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media