ABSTRACT
We introduce equivalence modulo inputs (EMI), a simple, widely applicable methodology for validating optimizing compilers. Our key insight is to exploit the close interplay between (1) dynamically executing a program on some test inputs and (2) statically compiling the program to work on all possible inputs. Indeed, the test inputs induce a natural collection of the original program's EMI variants, which can help differentially test any compiler and specifically target the difficult-to-find miscompilations.
To create a practical implementation of EMI for validating C compilers, we profile a program's test executions and stochastically prune its unexecuted code. Our extensive testing in eleven months has led to 147 confirmed, unique bug reports for GCC and LLVM alone. The majority of those bugs are miscompilations, and more than 100 have already been fixed.
Beyond testing compilers, EMI can be adapted to validate program transformation and analysis systems in general. This work opens up this exciting, new direction.
- ACE. SuperTest compiler test and validation suite. http://www.ace.nl/compiler/supertest.html.Google Scholar
- A. Balestrat. CCG: A random C code generator. https://github.com/Merkil/ccg/.Google Scholar
- S. Bansal and A. Aiken. Automatic generation of peephole superoptimizers. In Proceedings of the 12th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), pages 394--403, 2006. Google ScholarDigital Library
- Y. Chen, A. Groce, C. Zhang, W.-K. Wong, X. Fern, E. Eide, and J. Regehr. Taming compiler fuzzers. In Proceedings of the 2013 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 197--208, 2013. Google ScholarDigital Library
- P. Cuoq, B. Monate, A. Pacalet, V. Prevosto, J. Regehr, B. Yakobowski, and X. Yang. Testing static analyzers with randomly generated programs. In A. Goodloe and S. Person, editors, NASA Formal Methods, volume 7226 of Lecture Notes in Computer Science, pages 120--125. Springer Berlin Heidelberg, 2012. Google ScholarDigital Library
- C. Ellison and G. Rosu. An executable formal semantics of C with applications. In Proceedings of the 39th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 533--544, 2012. Google ScholarDigital Library
- GNU Compiler Collection. gcov. http://gcc.gnu.org/onlinedocs/gcc/Gcov.html.Google Scholar
- A. Groce, C. Zhang, E. Eide, Y. Chen, and J. Regehr. Swarm testing. In International Symposium on Software Testing and Analysis (ISSTA), pages 78--88, 2012. Google ScholarDigital Library
- T. Hoare. The verifying compiler: A grand challenge for computing research. In Modular Programming Languages, pages 25--35. Springer, 2003.Google ScholarCross Ref
- C. Holler, K. Herzig, and A. Zeller. Fuzzing with code fragments. In Proceedings of the 21st USENIX Security Symposium, 2012. Google ScholarDigital Library
- R. Joshi, G. Nelson, and K. H. Randall. Denali: A goal-directed superoptimizer. In Proceedings of the 2002 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 304--314, 2002. Google ScholarDigital Library
- X. Leroy. Formal certification of a compiler back-end, or: programming a compiler with a proof assistant. In Proceedings of the 33rd ACM Symposium on Principles of Programming Languages (POPL), pages 42--54. ACM Press, 2006. Google ScholarDigital Library
- X. Leroy. A formally verified compiler back-end. Journal of Automated Reasoning, 43(4):363--446, 2009. Google ScholarDigital Library
- G. Malecha, G. Morrisett, A. Shinnar, and R. Wisnesky. Toward a verified relational database management system. In Proceedings of the 37th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 237--248, 2010. Google ScholarDigital Library
- H. Massalin. Superoptimizer -- A look at the smallest program. In Proceedings of the Second International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), pages 122--126, 1987. Google ScholarCross Ref
- W. M. McKeeman. Differential testing for software. Digital Technical Journal, 10(1):100--107, 1998.Google Scholar
- S. McPeak, D. S. Wilkerson, and S. Goldsmith. Berkeley Delta. http://delta.tigris.org/.Google Scholar
- E. Nagai, H. Awazu, N. Ishiura, and N. Takeda. Random testing of C compilers targeting arithmetic optimization. In Workshop on Synthesis And System Integration of Mixed Information Technologies (SASIMI 2012), pages 48--53, 2012.Google Scholar
- E. Nagai, A. Hashimoto, and N. Ishiura. Scaling up size and number of expressions in random testing of arithmetic optimization of C compilers. In Workshop on Synthesis And System Integration of Mixed Information Technologies (SASIMI 2013), pages 88--93, 2013.Google Scholar
- G. C. Necula. Translation validation for an optimizing compiler. In Proceedings of the 2000 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 83--94, 2000. Google ScholarDigital Library
- Plum Hall, Inc. The Plum Hall Validation Suite for C. http://www.plumhall.com/stec.html.Google Scholar
- A. Pnueli, M. Siegel, and E. Singerman. Translation validation. In Proceedings of the 4th International Conference on Tools and Algorithms for Construction and Analysis of Systems (TACAS), pages 151--166, London, UK, UK, 1998. Springer-Verlag. Google ScholarDigital Library
- J. Regehr. Embedded in academia. http://blog.regehr.org/.Google Scholar
- J. Regehr, Y. Chen, P. Cuoq, E. Eide, C. Ellison, and X. Yang. Test-case reduction for C compiler bugs. In Proceedings of the 2012 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 335--346, 2012. Google ScholarDigital Library
- R. Tate, M. Stepp, Z. Tatlock, and S. Lerner. Equality saturation: a new approach to optimization. In Proceedings of the 36th ACM SIGPLAN-SIGACT symposium on Principles of Programming Languages, pages 264--276, 2009. Google ScholarDigital Library
- The Clang Team. Clang 3.4 documentation: LibTooling. http://clang.llvm.org/docs/LibTooling.html.Google Scholar
- J.-B. Tristan, P. Govereau, and G. Morrisett. Evaluating value-graph translation validation for LLVM. In Proceedings of the 2011 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 295--305, 2011. Google ScholarDigital Library
- X. Yang, Y. Chen, E. Eide, and J. Regehr. Finding and understanding bugs in C compilers. In Proceedings of the 2011 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 283--294, 2011. Google ScholarDigital Library
- C. Zhao, Y. Xue, Q. Tao, L. Guo, and Z. Wang. Automated test program generation for an industrial optimizing compiler. In ICSE Workshop on Automation of Software Test (AST), pages 36--43, 2009.Google Scholar
- J. Zhao, S. Nagarakatte, M. M. K. Martin, and S. Zdancewic. Formal verification of SSA-based optimizations for LLVM. In Proceedings of the 2013 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 175--186, 2013. Google ScholarDigital Library
Index Terms
- Compiler validation via equivalence modulo inputs
Recommendations
Finding compiler bugs via live code mutation
OOPSLA 2016: Proceedings of the 2016 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and ApplicationsValidating optimizing compilers is challenging because it is hard to generate valid test programs (i.e., those that do not expose any undefined behavior). Equivalence Modulo Inputs (EMI) is an effective, promising methodology to tackle this problem. ...
Compiler validation via equivalence modulo inputs
PLDI '14We introduce equivalence modulo inputs (EMI), a simple, widely applicable methodology for validating optimizing compilers. Our key insight is to exploit the close interplay between (1) dynamically executing a program on some test inputs and (2) ...
Finding and understanding bugs in C compilers
PLDI '11: Proceedings of the 32nd ACM SIGPLAN Conference on Programming Language Design and ImplementationCompilers should be correct. To improve the quality of C compilers, we created Csmith, a randomized test-case generation tool, and spent three years using it to find compiler bugs. During this period we reported more than 325 previously unknown bugs to ...
Comments