Reconciling high-level optimizations and low-level code in LLVM

Authors:
Juneyoung Lee

Seoul National University, South Korea

Seoul National University, South Korea
View Profile

,
Chung-Kil Hur

Seoul National University, South Korea

Seoul National University, South Korea
View Profile

,
Ralf Jung

MPI-SWS, Germany

MPI-SWS, Germany
View Profile

,
Zhengyang Liu

University of Utah, USA

University of Utah, USA
View Profile

,
John Regehr

University of Utah, USA

University of Utah, USA
View Profile

,
Nuno P. Lopes

Microsoft Research, UK

Microsoft Research, UK
View Profile

Proceedings of the ACM on Programming Languages Volume 2 Issue OOPSLAArticle No.: 125pp 1–28https://doi.org/10.1145/3276495

Published:24 October 2018Publication History

Proceedings of the ACM on Programming Languages

Abstract

LLVM miscompiles certain programs in C, C++, and Rust that use low-level language features such as raw pointers in Rust or conversion between integers and pointers in C or C++. The problem is that it is difficult for the compiler to implement aggressive, high-level memory optimizations while also respecting the guarantees made by the programming languages to low-level programs. A deeper problem is that the memory model for LLVM's intermediate representation (IR) is informal and the semantics of corner cases are not always clear to all compiler developers. We developed a novel memory model for LLVM IR and formalized it. The new model requires a handful of problematic IR-level optimizations to be removed, but it also supports the addition of new optimizations that were not previously legal. We have implemented the new model and shown that it fixes known memory-model-related miscompilations without impacting the quality of generated code.

Supplemental Material

a125-lee.webm

webm

92.9 MB

Download

References

Frédéric Besson, Sandrine Blazy, and Pierre Wilke. 2014. A Precise and Abstract Memory Model for C Using Symbolic Values. In APLAS.Google Scholar
Frédéric Besson, Sandrine Blazy, and Pierre Wilke. 2015. A Concrete Memory Model for CompCert. In ITP.Google Scholar
Frédéric Besson, Sandrine Blazy, and Pierre Wilke. 2017a. CompCertS: A Memory-Aware Verified C Compiler Using Pointer as Integer Semantics. In ITP.Google Scholar
Frédéric Besson, Sandrine Blazy, and Pierre Wilke. 2017b. A Verified CompCert Front-End for a Memory Model Supporting Pointer Arithmetic and Uninitialised Data. Journal of Automated Reasoning (03 Nov 2017).Google ScholarDigital Library
David Chisnall, Justus Matthiesen, Kayvan Memarian, Peter Sewell, and Robert N. M. Watson. 2016. C memory object and value semantics: the space of de facto and ISO standards. https://www.cl.cam.ac.uk/~pes20/cerberus/notes30.pdfGoogle Scholar
Arie Gurfinkel and Jorge A. Navas. 2017. A Context-Sensitive Memory Model for Verification of C/C++ Programs. In SAS.Google Scholar
Chris Hathhorn, Chucky Ellison, and Grigore Roşu. 2015. Defining the Undefinedness of C. In PLDI. Google ScholarDigital Library
Jeehoon Kang, Chung-Kil Hur, William Mansky, Dmitri Garbuzov, Steve Zdancewic, and Viktor Vafeiadis. 2015. A Formal C Memory Model Supporting Integer-pointer Casts. In PLDI. Google ScholarDigital Library
Robbert Krebbers. 2013. Aliasing Restrictions of C11 Formalized in Coq. In CPP. Google ScholarDigital Library
Robbert Krebbers and Freek Wiedijk. 2015. A Typed C11 Semantics for Interactive Theorem Proving. In CPP. 15–27. Google ScholarDigital Library
Juneyoung Lee, Yoonseung Kim, Youngju Song, Chung-Kil Hur, Sanjoy Das, David Majnemer, John Regehr, and Nuno P. Lopes. 2017. Taming Undefined Behavior in LLVM. In PLDI. Google ScholarDigital Library
Xavier Leroy. 2009. Formal Verification of a Realistic Compiler. Commun. ACM 52, 7 (July 2009), 107–115. Google ScholarDigital Library
Xavier Leroy and Sandrine Blazy. 2008. Formal Verification of a C-like Memory Model and Its Uses for Verifying Program Transformations. Journal of Automated Reasoning 41, 1 (Jul 2008), 1–31. Google ScholarDigital Library
Nuno P. Lopes, David Menendez, Santosh Nagarakatte, and John Regehr. 2015. Provably Correct Peephole Optimizations with Alive. In PLDI. Google ScholarDigital Library
Kayvan Memarian, Justus Matthiesen, James Lingard, Kyndylan Nienhuis, David Chisnall, Robert N.M. Watson, and Peter Sewell. 2016. Into the depths of C: elaborating the de facto standards. In PLDI. Google ScholarDigital Library
Kayvan Memarian and Peter Sewell. 2016a. Clarifying the C memory object model (revised version of WG14 N2012). https://www.cl.cam.ac.uk/~pes20/cerberus/notes64- wg14.htmlGoogle Scholar
Kayvan Memarian and Peter Sewell. 2016b. N2090: Clarifying Pointer Provenance (Draft Defect Report or Proposal for C2x). https://www.cl.cam.ac.uk/~pes20/cerberus/n2090.htmlGoogle Scholar
The LLVM Project. 2018. LLVM Language Reference Manual. https://llvm.org/docs/LangRef.htmlGoogle Scholar
Raimondas Sasnauskas, Yang Chen, Peter Collingbourne, Jeroen Ketema, Jubi Taneja, and John Regehr. 2017. Souper: A Synthesizing Superoptimizer. CoRR abs/1711.04422 (2017). http://arxiv.org/abs/1711.04422Google Scholar
Jaroslav Ševčík, Viktor Vafeiadis, Francesco Zappa Nardelli, Suresh Jagannathan, and Peter Sewell. 2013. CompCertTSO: A Verified Compiler for Relaxed-Memory Concurrency. J. ACM 60, 3, Article 22 (June 2013), 22:1–22:50 pages. Google ScholarDigital Library
Wei Wang, Clark Barrett, and Thomas Wies. 2017. Partitioned Memory Models for Program Analysis. In VMCAI.Google Scholar
Jianzhou Zhao, Santosh Nagarakatte, Milo M.K. Martin, and Steve Zdancewic. 2012. Formalizing the LLVM Intermediate Representation for Verified Program Transformations. In POPL. Google ScholarDigital Library

Index Terms

Reconciling high-level optimizations and low-level code in LLVM
1. Software and its engineering
  1. Software notations and tools
    1. Compilers
    2. Formal language definitions
      1. Semantics

Recommendations

Bringing low-level languages to the JVM: efficient execution of LLVM IR on Truffle
VMIL 2016: Proceedings of the 8th International Workshop on Virtual Machines and Intermediate Languages

Although the Java platform has been used as a multi-language platform, most of the low-level languages (such as C, Fortran, and C++) cannot be executed efficiently on the JVM. We propose Sulong, a system that can execute LLVM-based languages on the ...
Read More
Source-Level Compiler Optimizations for High-Level Synthesis
SEEDA-CECNSM '16: Proceedings of the SouthEast European Design Automation, Computer Engineering, Computer Networks and Social Media Conference

With high-level synthesis becoming the preferred method for hardware design, tools that operate on high-level programming languages and optimize hardware output are crucial for successful synthesis. In high-level synthesis, conventional programming ...
Read More
From ASTs to Machine Code with LLVM
Programming '21: Companion Proceedings of the 5th International Conference on the Art, Science, and Engineering of Programming

A compiler is a program that translates source code written in a particular language into another language. Internally, the whole process is typically split into multiple stages that handle one particular aspect of this translation. One of these ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

Published in

Proceedings of the ACM on Programming Languages Volume 2, Issue OOPSLA
November 2018
1656 pages
EISSN:2475-1421
DOI:10.1145/3288538
Issue’s Table of Contents

Copyright © 2018 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 24 October 2018
Published in pacmpl Volume 2, Issue OOPSLA

Permissions
Request permissions about this article.
Request Permissions

Check for updates
Badges
- Artifacts Evaluated & Functional
Author Tags
IR Memory Model
LLVM
Qualifiers
- research-article
Conference
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 15
  Total Citations
  View Citations
- 1,244
  Total Downloads
- Downloads (Last 12 months)238
- Downloads (Last 6 weeks)41
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Reconciling high-level optimizations and low-level code in LLVM

Proceedings of the ACM on Programming Languages

Abstract

Supplemental Material

References

Cited By

Index Terms

Recommendations

Bringing low-level languages to the JVM: efficient execution of LLVM IR on Truffle

Source-Level Compiler Optimizations for High-Level Synthesis

From ASTs to Machine Code with LLVM