skip to main content
10.1145/2908080.2908081acmconferencesArticle/Chapter ViewAbstractPublication PagespldiConference Proceedingsconference-collections
research-article
Public Access

Into the depths of C: elaborating the de facto standards

Published:02 June 2016Publication History

ABSTRACT

C remains central to our computing infrastructure. It is notionally defined by ISO standards, but in reality the properties of C assumed by systems code and those implemented by compilers have diverged, both from the ISO standards and from each other, and none of these are clearly understood. We make two contributions to help improve this error-prone situation. First, we describe an in-depth analysis of the design space for the semantics of pointers and memory in C as it is used in practice. We articulate many specific questions, build a suite of semantic test cases, gather experimental data from multiple implementations, and survey what C experts believe about the de facto standards. We identify questions where there is a consensus (either following ISO or differing) and where there are conflicts. We apply all this to an experimental C implemented above capability hardware. Second, we describe a formal model, Cerberus, for large parts of C. Cerberus is parameterised on its memory model; it is linkable either with a candidate de facto memory object model, under construction, or with an operational C11 concurrency model; it is defined by elaboration to a much simpler Core language for accessibility, and it is executable as a test oracle on small examples. This should provide a solid basis for discussion of what mainstream C is now: what programmers and analysis tools can assume and what compilers aim to implement. Ultimately we hope it will be a step towards clear, consistent, and accepted semantics for the various use-cases of C.

References

  1. Programming Languages — C. 2011.Google ScholarGoogle Scholar
  2. ISO/IEC 9899:2011. A non-final but recent version is available at www.open-std.org/jtc1/sc22/wg14/docs/n1539.pdf.Google ScholarGoogle Scholar
  3. ANSI. Programming Languages – C: ANSI X3.159-1989. 1989.Google ScholarGoogle Scholar
  4. M. Batty, S. Owens, S. Sarkar, P. Sewell, and T. Weber. Mathematizing C++ concurrency. In Proc. POPL, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. P. Becker, editor. Programming Languages — C++. 2011. ISO/IEC 14882:2011.Google ScholarGoogle Scholar
  6. A. Bessey, K. Block, B. Chelf, A. Chou, B. Fulton, S. Hallem, C. Henri-Gros, A. Kamsky, S. McPeak, and D. Engler. A few billion lines of code later: using static analysis to find bugs in the real world. Commun. ACM, 53(2), 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. F. Besson, S. Blazy, and P. Wilke. A precise and abstract memory model for C using symbolic values. In APLAS, 2014.Google ScholarGoogle ScholarCross RefCross Ref
  8. F. Besson, S. Blazy, and P. Wilke. A concrete memory model for CompCert. In Proc. ITP, 2015.Google ScholarGoogle ScholarCross RefCross Ref
  9. H.-J. Boehm and S. Adve. Foundations of the C++ concurrency memory model. In Proc. PLDI, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. G. Canet, P. Cuoq, and B. Monate. A value analysis for C programs. In Proc. 9th IEEE Intl. Working Conference on Source Code Analysis and Manipulation, SCAM ’09, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. D. Chisnall, J. Matthiesen, K. Memarian, K. Nienhuis, P., and R.N.M. Watson. C memory object and value semantics: the space of de facto and ISO standards, 2016. http://www.cl.cam.ac.uk/~pes20/cerberus/.Google ScholarGoogle Scholar
  12. D. Chisnall, C. Rothwell, R. N. M. Watson, J. Woodruff, M. Vadera, S. W. Moore, M. Roe, B. Davis, and P. G. Neumann. Beyond the PDP-11: Architectural support for a memory-safe C abstract machine. In Proc. ASPLOS, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. E. Cohen, M. Moskal, S. Tobies, and W. Schulte. A precise yet efficient memory model for C. Electron. Notes Theor. Comput. Sci. (SSV 2009), 254:85–103, October 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. J. Cook and S. Subramanian. A formal semantics for C in Nqthm. Technical Report 517D, Trusted Information Systems, October, 1994.Google ScholarGoogle Scholar
  15. J. Criswell, N. Geoffray, and V. Adve. Memory safety for low-level software/hardware interactions. In Proceedings of the Eighteenth Usenix Security Symposium, August 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. J. Devietti, C. Blundell, M. M. K. Martin, and S. Zdancewic. Hardbound: Architectural support for spatial safety of the C programming language. In Proc. ASPLOS, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. C. Ellison and G. Ro¸su. An executable formal semantics of C with applications. In Proc. POPL, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Y. Gurevich and J. K. Huggins. The semantics of the C programming language. In Proc. CSL ’92, 1993. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. C. Hathhorn, C. Ellison, and G. Rosu. Defining the undefinedness of C. In Proc. PLDI, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. Charles McEwen Ellison III. A Formal Semantics of C with Applications. PhD thesis, University of Illinois at Urbana-Champaign, 2012.Google ScholarGoogle Scholar
  21. Runtime Verification Inc. Rv-match v0.1. https://runtimeverification.com/match/download/, 2016. Downloaded 2016-03-11.Google ScholarGoogle Scholar
  22. Intel Plc. Introduction to Intel memory protection extensions, July 2013.Google ScholarGoogle Scholar
  23. T. Jim, J. G. Morrisett, D. Grossman, M. W. Hicks, J. Cheney, and Y. Wang. Cyclone: A safe dialect of C. In Proc. USENIX ATC, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. J. Kang, C.-K. Hur, W. Mansky, D. Garbuzov, S. Zdancewic, and V. Vafeiadis. A formal C memory model supporting integer-pointer casts. In Proc. PLDI, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. B. W. Kernighan and D. M. Ritchie. The C Programming Language (ANSI C). Prentice Hall, 2nd edition, 1988. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. R. Krebbers. Aliasing restrictions of C11 formalized in Coq. In Proc. CPP, LNCS 8307, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. R. Krebbers. An operational and axiomatic semantics for nondeterminism and sequence points in C. In Proc. POPL, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. R. Krebbers. The C standard formalized in Coq. PhD thesis, Radboud University Nijmegen, December 2015.Google ScholarGoogle Scholar
  29. R. Krebbers and F. Wiedijk. Separation logic for non-local control flow and block scope variables. In FoSSaCS, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. R. Krebbers and F. Wiedijk. A typed C11 semantics for interactive theorem proving. In Proc. CPP, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. X. Leroy, A. W. Appel, S. Blazy, and G. Stewart. The CompCert memory model, version 2. Research report RR-7987, INRIA, June 2012.Google ScholarGoogle Scholar
  32. X. Leroy and S. Blazy. Formal verification of a C-like memory model and its uses for verifying program transformations. Journal of Automated Reasoning, 41(1):1–31, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. K. Memarian, J. Matthiesen, K. Nienhuis, V. B. F. Gomes, J. Lingard, D. Chisnall, R. N. M. Watson, and P. Sewell. Cerberus, 2016.Google ScholarGoogle Scholar
  34. www.cl.cam.ac.uk/~pes20/cerberus, www.repository.cam.ac.uk/handle/1810/255730.Google ScholarGoogle Scholar
  35. D. P. Mulligan, S. Owens, K. E. Gray, T. Ridge, and P. Sewell. Lem: reusable engineering of real-world semantics. In Proc. ICFP, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. S. Nagarakatte, J. Zhao, M. M.K. Martin, and S. Zdancewic. SoftBound: highly compatible and complete spatial memory safety for C. In Proc. PLDI, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. G. C. Necula, S. McPeak, and W. Weimer. CCured: type-safe retrofitting of legacy code. In Proc. POPL, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. N. Nethercote and J. Seward. Valgrind: A framework for heavyweight dynamic binary instrumentation. In PLDI, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. K. Nienhuis, K. Memarian, and P. Sewell. An operational semantics for C/C++11 concurrency, 2016. Draft available at www.cl.cam.ac.uk/~pes20/cerberus.Google ScholarGoogle Scholar
  40. M. Norrish. C formalised in HOL. Technical Report UCAMCL-TR-453, U. Cambridge, Computer Laboratory, 1998.Google ScholarGoogle Scholar
  41. M. Norrish. Deterministic expressions in C. In ESOP, 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  42. N. S Papaspyrou. A formal semantics for the C programming language. PhD thesis, National Technical University of Athens, 1998.Google ScholarGoogle Scholar
  43. F. Pottier and Y. Régis-Gianas. Menhir. gallium.inria.fr/~fpottier/menhir/.Google ScholarGoogle Scholar
  44. J. Regehr. blog.regehr.org/archives/721, 2012.Google ScholarGoogle Scholar
  45. J. Regehr, April 2015. blog.regehr.org/archives/1234.Google ScholarGoogle Scholar
  46. TrustInSoft. trust-in-soft.com/tis-interpreter, 2015.Google ScholarGoogle Scholar
  47. H. Tuch, G. Klein, and M. Norrish. Types, bytes, and separation logic. In Proc. POPL, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  48. J. Ševˇcík, V. Vafeiadis, F. Zappa Nardelli, S. Jagannathan, and P. Sewell. CompCertTSO: A verified compiler for relaxedmemory concurrency. J. ACM, 60(3), June 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  49. X. Wang, H. Chen, A. Cheung, Z. Jia, N. Zeldovich, and M. F. Kaashoek. Undefined behavior: what happened to my code? In Proc. APSYS, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  50. X. Wang, N. Zeldovich, M. F. Kaashoek, and A. Solar-Lezama. Towards optimization-safe systems: Analyzing the impact of undefined behavior. In Proc. SOSP, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  51. R. N. M. Watson, P. G. Neumann, J. Woodruff, M. Roe, J. Anderson, D. Chisnall, B. Davis, A. Joannou, B. Laurie, S. W. Moore, S. J. Murdoch, and R. Norton. Capability hardware enhanced RISC instructions: CHERI instruction-set architecture. Technical Report UCAM-CL-TR-864, University of Cambridge, Computer Laboratory, November 2015.Google ScholarGoogle Scholar
  52. R. N. M. Watson, J. Woodruff, P. G. Neumann, S. W. Moore, J. Anderson, D. Chisnall, N. H. Dave, B. Davis, K. Gudka, B. Laurie, S. J. Murdoch, R. Norton, M. Roe, S. Son, and M. Vadera. CHERI: A hybrid capability-system architecture for scalable software compartmentalization. In IEEE Symposium on Security and Privacy, SP, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  53. WG14. Defect report summary for ISO/IEC 9899:1999.Google ScholarGoogle Scholar
  54. WG14. ISO/IEC 9899:2011.Google ScholarGoogle Scholar
  55. WG14. Defect report 260, September 2004.Google ScholarGoogle Scholar
  56. www.open-std.org/jtc1/sc22/wg14/www/docs/dr_260.htm.Google ScholarGoogle Scholar
  57. J. Woodruff, R. N. M. Watson, D. Chisnall, S. W. Moore, J. Anderson, B. Davis, B. Laurie, P. G. Neumann, R. Norton, and M. Roe. The CHERI capability model: revisiting RISC in an age of risk. In ISCA, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  58. X. Yang, Y. Chen, E. Eide, and J. Regehr. Finding and understanding bugs in C compilers. In Proc. PLDI, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  59. J. Zhao, S. Nagarakatte, M. M.K. Martin, and S. Zdancewic. Formalizing the LLVM intermediate representation for verified program transformations. In Proc. POPL, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Into the depths of C: elaborating the de facto standards

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in
    • Published in

      cover image ACM Conferences
      PLDI '16: Proceedings of the 37th ACM SIGPLAN Conference on Programming Language Design and Implementation
      June 2016
      726 pages
      ISBN:9781450342612
      DOI:10.1145/2908080
      • General Chair:
      • Chandra Krintz,
      • Program Chair:
      • Emery Berger
      • cover image ACM SIGPLAN Notices
        ACM SIGPLAN Notices  Volume 51, Issue 6
        PLDI '16
        June 2016
        726 pages
        ISSN:0362-1340
        EISSN:1558-1160
        DOI:10.1145/2980983
        • Editor:
        • Andy Gill
        Issue’s Table of Contents

      Copyright © 2016 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 2 June 2016

      Permissions

      Request permissions about this article.

      Request Permissions

      Check for updates

      Author Tags

      Qualifiers

      • research-article

      Acceptance Rates

      Overall Acceptance Rate406of2,067submissions,20%

      Upcoming Conference

      PLDI '24

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader