research-article

Compiler validation via equivalence modulo inputs

Authors:
Vu Le

University of California, Davis

University of California, Davis
View Profile

,
Mehrdad Afshari

University of California, Davis

University of California, Davis
View Profile

,
Zhendong Su

University of California, Davis

University of California, Davis
View Profile

PLDI '14: Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and ImplementationJune 2014Pages 216–226https://doi.org/10.1145/2594291.2594334

Published:09 June 2014Publication History

PLDI '14: Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation

Pages 216–226

ABSTRACT

We introduce equivalence modulo inputs (EMI), a simple, widely applicable methodology for validating optimizing compilers. Our key insight is to exploit the close interplay between (1) dynamically executing a program on some test inputs and (2) statically compiling the program to work on all possible inputs. Indeed, the test inputs induce a natural collection of the original program's EMI variants, which can help differentially test any compiler and specifically target the difficult-to-find miscompilations.

To create a practical implementation of EMI for validating C compilers, we profile a program's test executions and stochastically prune its unexecuted code. Our extensive testing in eleven months has led to 147 confirmed, unique bug reports for GCC and LLVM alone. The majority of those bugs are miscompilations, and more than 100 have already been fixed.

Beyond testing compilers, EMI can be adapted to validate program transformation and analysis systems in general. This work opens up this exciting, new direction.

References

ACE. SuperTest compiler test and validation suite. http://www.ace.nl/compiler/supertest.html.Google Scholar
A. Balestrat. CCG: A random C code generator. https://github.com/Merkil/ccg/.Google Scholar
S. Bansal and A. Aiken. Automatic generation of peephole superoptimizers. In Proceedings of the 12th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), pages 394--403, 2006. Google ScholarDigital Library
Y. Chen, A. Groce, C. Zhang, W.-K. Wong, X. Fern, E. Eide, and J. Regehr. Taming compiler fuzzers. In Proceedings of the 2013 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 197--208, 2013. Google ScholarDigital Library
P. Cuoq, B. Monate, A. Pacalet, V. Prevosto, J. Regehr, B. Yakobowski, and X. Yang. Testing static analyzers with randomly generated programs. In A. Goodloe and S. Person, editors, NASA Formal Methods, volume 7226 of Lecture Notes in Computer Science, pages 120--125. Springer Berlin Heidelberg, 2012. Google ScholarDigital Library
C. Ellison and G. Rosu. An executable formal semantics of C with applications. In Proceedings of the 39th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 533--544, 2012. Google ScholarDigital Library
GNU Compiler Collection. gcov. http://gcc.gnu.org/onlinedocs/gcc/Gcov.html.Google Scholar
A. Groce, C. Zhang, E. Eide, Y. Chen, and J. Regehr. Swarm testing. In International Symposium on Software Testing and Analysis (ISSTA), pages 78--88, 2012. Google ScholarDigital Library
T. Hoare. The verifying compiler: A grand challenge for computing research. In Modular Programming Languages, pages 25--35. Springer, 2003.Google ScholarCross Ref
C. Holler, K. Herzig, and A. Zeller. Fuzzing with code fragments. In Proceedings of the 21st USENIX Security Symposium, 2012. Google ScholarDigital Library
R. Joshi, G. Nelson, and K. H. Randall. Denali: A goal-directed superoptimizer. In Proceedings of the 2002 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 304--314, 2002. Google ScholarDigital Library
X. Leroy. Formal certification of a compiler back-end, or: programming a compiler with a proof assistant. In Proceedings of the 33rd ACM Symposium on Principles of Programming Languages (POPL), pages 42--54. ACM Press, 2006. Google ScholarDigital Library
X. Leroy. A formally verified compiler back-end. Journal of Automated Reasoning, 43(4):363--446, 2009. Google ScholarDigital Library
G. Malecha, G. Morrisett, A. Shinnar, and R. Wisnesky. Toward a verified relational database management system. In Proceedings of the 37th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 237--248, 2010. Google ScholarDigital Library
H. Massalin. Superoptimizer -- A look at the smallest program. In Proceedings of the Second International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), pages 122--126, 1987. Google ScholarCross Ref
W. M. McKeeman. Differential testing for software. Digital Technical Journal, 10(1):100--107, 1998.Google Scholar
S. McPeak, D. S. Wilkerson, and S. Goldsmith. Berkeley Delta. http://delta.tigris.org/.Google Scholar
E. Nagai, H. Awazu, N. Ishiura, and N. Takeda. Random testing of C compilers targeting arithmetic optimization. In Workshop on Synthesis And System Integration of Mixed Information Technologies (SASIMI 2012), pages 48--53, 2012.Google Scholar
E. Nagai, A. Hashimoto, and N. Ishiura. Scaling up size and number of expressions in random testing of arithmetic optimization of C compilers. In Workshop on Synthesis And System Integration of Mixed Information Technologies (SASIMI 2013), pages 88--93, 2013.Google Scholar
G. C. Necula. Translation validation for an optimizing compiler. In Proceedings of the 2000 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 83--94, 2000. Google ScholarDigital Library
Plum Hall, Inc. The Plum Hall Validation Suite for C. http://www.plumhall.com/stec.html.Google Scholar
A. Pnueli, M. Siegel, and E. Singerman. Translation validation. In Proceedings of the 4th International Conference on Tools and Algorithms for Construction and Analysis of Systems (TACAS), pages 151--166, London, UK, UK, 1998. Springer-Verlag. Google ScholarDigital Library
J. Regehr. Embedded in academia. http://blog.regehr.org/.Google Scholar
J. Regehr, Y. Chen, P. Cuoq, E. Eide, C. Ellison, and X. Yang. Test-case reduction for C compiler bugs. In Proceedings of the 2012 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 335--346, 2012. Google ScholarDigital Library
R. Tate, M. Stepp, Z. Tatlock, and S. Lerner. Equality saturation: a new approach to optimization. In Proceedings of the 36th ACM SIGPLAN-SIGACT symposium on Principles of Programming Languages, pages 264--276, 2009. Google ScholarDigital Library
The Clang Team. Clang 3.4 documentation: LibTooling. http://clang.llvm.org/docs/LibTooling.html.Google Scholar
J.-B. Tristan, P. Govereau, and G. Morrisett. Evaluating value-graph translation validation for LLVM. In Proceedings of the 2011 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 295--305, 2011. Google ScholarDigital Library
X. Yang, Y. Chen, E. Eide, and J. Regehr. Finding and understanding bugs in C compilers. In Proceedings of the 2011 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 283--294, 2011. Google ScholarDigital Library
C. Zhao, Y. Xue, Q. Tao, L. Guo, and Z. Wang. Automated test program generation for an industrial optimizing compiler. In ICSE Workshop on Automation of Software Test (AST), pages 36--43, 2009.Google Scholar
J. Zhao, S. Nagarakatte, M. M. K. Martin, and S. Zdancewic. Formal verification of SSA-based optimizations for LLVM. In Proceedings of the 2013 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 175--186, 2013. Google ScholarDigital Library

Index Terms

Compiler validation via equivalence modulo inputs
1. Software and its engineering
  1. Software creation and management
    1. Software verification and validation
      1. Software defect analysis
        Software testing and debugging
  2. Software notations and tools
    1. Compilers

Recommendations

Finding compiler bugs via live code mutation
OOPSLA 2016: Proceedings of the 2016 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications

Validating optimizing compilers is challenging because it is hard to generate valid test programs (i.e., those that do not expose any undefined behavior). Equivalence Modulo Inputs (EMI) is an effective, promising methodology to tackle this problem. ...
Read More
Compiler validation via equivalence modulo inputs
PLDI '14

We introduce equivalence modulo inputs (EMI), a simple, widely applicable methodology for validating optimizing compilers. Our key insight is to exploit the close interplay between (1) dynamically executing a program on some test inputs and (2) ...
Read More
Finding and understanding bugs in C compilers
PLDI '11: Proceedings of the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation

Compilers should be correct. To improve the quality of C compilers, we created Csmith, a randomized test-case generation tool, and spent three years using it to find compiler bugs. During this period we reported more than 325 previously unknown bugs to ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

Published in
PLDI '14: Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation
June 2014
619 pages
ISBN:9781450327848
DOI:10.1145/2594291
General Chair:
Michael O'Boyle
University of Edinburgh
,
Program Chair:
Keshav Pingali
University of Texas, Austin
ACM SIGPLAN Notices Volume 49, Issue 6
PLDI '14
June 2014
598 pages
ISSN:0362-1340
EISSN:1558-1160
DOI:10.1145/2666356
Editor:
Andy Gill
University of Kansas, Lawrence, KS
Issue’s Table of Contents
Copyright © 2014 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 9 June 2014
Permissions
Request permissions about this article.
Request Permissions

Check for updates
Author Tags
automated testing
compiler testing
equivalent program variants
miscompilation
Qualifiers
- research-article
Conference

Acceptance Rates
PLDI '14 Paper Acceptance Rate52of287submissions,18%Overall Acceptance Rate406of2,067submissions,20%
More
Upcoming Conference
PLDI '24

Sponsor:

sigplan

ACM SIGPLAN Conference on Programming Language Design and Implementation

June 24 - 28, 2024

Copenhagen , Denmark
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 374
  Total Citations
  View Citations
- 1,947
  Total Downloads
- Downloads (Last 12 months)427
- Downloads (Last 6 weeks)34
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Compiler validation via equivalence modulo inputs

PLDI '14: Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation

ABSTRACT

References

Cited By

Index Terms

Recommendations

Finding compiler bugs via live code mutation

Compiler validation via equivalence modulo inputs

Finding and understanding bugs in C compilers