skip to main content
research-article
Open Access
Artifacts Evaluated & Functional

Reconciling high-level optimizations and low-level code in LLVM

Published:24 October 2018Publication History
Skip Abstract Section

Abstract

LLVM miscompiles certain programs in C, C++, and Rust that use low-level language features such as raw pointers in Rust or conversion between integers and pointers in C or C++. The problem is that it is difficult for the compiler to implement aggressive, high-level memory optimizations while also respecting the guarantees made by the programming languages to low-level programs. A deeper problem is that the memory model for LLVM's intermediate representation (IR) is informal and the semantics of corner cases are not always clear to all compiler developers. We developed a novel memory model for LLVM IR and formalized it. The new model requires a handful of problematic IR-level optimizations to be removed, but it also supports the addition of new optimizations that were not previously legal. We have implemented the new model and shown that it fixes known memory-model-related miscompilations without impacting the quality of generated code.

Skip Supplemental Material Section

Supplemental Material

a125-lee.webm

webm

92.9 MB

References

  1. Frédéric Besson, Sandrine Blazy, and Pierre Wilke. 2014. A Precise and Abstract Memory Model for C Using Symbolic Values. In APLAS.Google ScholarGoogle Scholar
  2. Frédéric Besson, Sandrine Blazy, and Pierre Wilke. 2015. A Concrete Memory Model for CompCert. In ITP.Google ScholarGoogle Scholar
  3. Frédéric Besson, Sandrine Blazy, and Pierre Wilke. 2017a. CompCertS: A Memory-Aware Verified C Compiler Using Pointer as Integer Semantics. In ITP.Google ScholarGoogle Scholar
  4. Frédéric Besson, Sandrine Blazy, and Pierre Wilke. 2017b. A Verified CompCert Front-End for a Memory Model Supporting Pointer Arithmetic and Uninitialised Data. Journal of Automated Reasoning (03 Nov 2017).Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. David Chisnall, Justus Matthiesen, Kayvan Memarian, Peter Sewell, and Robert N. M. Watson. 2016. C memory object and value semantics: the space of de facto and ISO standards. https://www.cl.cam.ac.uk/~pes20/cerberus/notes30.pdfGoogle ScholarGoogle Scholar
  6. Arie Gurfinkel and Jorge A. Navas. 2017. A Context-Sensitive Memory Model for Verification of C/C++ Programs. In SAS.Google ScholarGoogle Scholar
  7. Chris Hathhorn, Chucky Ellison, and Grigore Roşu. 2015. Defining the Undefinedness of C. In PLDI. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. Jeehoon Kang, Chung-Kil Hur, William Mansky, Dmitri Garbuzov, Steve Zdancewic, and Viktor Vafeiadis. 2015. A Formal C Memory Model Supporting Integer-pointer Casts. In PLDI. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Robbert Krebbers. 2013. Aliasing Restrictions of C11 Formalized in Coq. In CPP. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. Robbert Krebbers and Freek Wiedijk. 2015. A Typed C11 Semantics for Interactive Theorem Proving. In CPP. 15–27. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Juneyoung Lee, Yoonseung Kim, Youngju Song, Chung-Kil Hur, Sanjoy Das, David Majnemer, John Regehr, and Nuno P. Lopes. 2017. Taming Undefined Behavior in LLVM. In PLDI. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Xavier Leroy. 2009. Formal Verification of a Realistic Compiler. Commun. ACM 52, 7 (July 2009), 107–115. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. Xavier Leroy and Sandrine Blazy. 2008. Formal Verification of a C-like Memory Model and Its Uses for Verifying Program Transformations. Journal of Automated Reasoning 41, 1 (Jul 2008), 1–31. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. Nuno P. Lopes, David Menendez, Santosh Nagarakatte, and John Regehr. 2015. Provably Correct Peephole Optimizations with Alive. In PLDI. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. Kayvan Memarian, Justus Matthiesen, James Lingard, Kyndylan Nienhuis, David Chisnall, Robert N.M. Watson, and Peter Sewell. 2016. Into the depths of C: elaborating the de facto standards. In PLDI. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. Kayvan Memarian and Peter Sewell. 2016a. Clarifying the C memory object model (revised version of WG14 N2012). https://www.cl.cam.ac.uk/~pes20/cerberus/notes64- wg14.htmlGoogle ScholarGoogle Scholar
  17. Kayvan Memarian and Peter Sewell. 2016b. N2090: Clarifying Pointer Provenance (Draft Defect Report or Proposal for C2x). https://www.cl.cam.ac.uk/~pes20/cerberus/n2090.htmlGoogle ScholarGoogle Scholar
  18. The LLVM Project. 2018. LLVM Language Reference Manual. https://llvm.org/docs/LangRef.htmlGoogle ScholarGoogle Scholar
  19. Raimondas Sasnauskas, Yang Chen, Peter Collingbourne, Jeroen Ketema, Jubi Taneja, and John Regehr. 2017. Souper: A Synthesizing Superoptimizer. CoRR abs/1711.04422 (2017). http://arxiv.org/abs/1711.04422Google ScholarGoogle Scholar
  20. Jaroslav Ševčík, Viktor Vafeiadis, Francesco Zappa Nardelli, Suresh Jagannathan, and Peter Sewell. 2013. CompCertTSO: A Verified Compiler for Relaxed-Memory Concurrency. J. ACM 60, 3, Article 22 (June 2013), 22:1–22:50 pages. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. Wei Wang, Clark Barrett, and Thomas Wies. 2017. Partitioned Memory Models for Program Analysis. In VMCAI.Google ScholarGoogle Scholar
  22. Jianzhou Zhao, Santosh Nagarakatte, Milo M.K. Martin, and Steve Zdancewic. 2012. Formalizing the LLVM Intermediate Representation for Verified Program Transformations. In POPL. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Reconciling high-level optimizations and low-level code in LLVM

      Recommendations

      Comments

      Login options

      Check if you have access through your login credentials or your institution to get full access on this article.

      Sign in

      Full Access

      • Published in

        cover image Proceedings of the ACM on Programming Languages
        Proceedings of the ACM on Programming Languages  Volume 2, Issue OOPSLA
        November 2018
        1656 pages
        EISSN:2475-1421
        DOI:10.1145/3288538
        Issue’s Table of Contents

        Copyright © 2018 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 24 October 2018
        Published in pacmpl Volume 2, Issue OOPSLA

        Permissions

        Request permissions about this article.

        Request Permissions

        Check for updates

        Qualifiers

        • research-article

      PDF Format

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader