skip to main content
10.1145/1542452.1542461acmconferencesArticle/Chapter ViewAbstractPublication PagescpsweekConference Proceedingsconference-collections
research-article

Eliminating the call stack to save RAM

Published:19 June 2009Publication History

ABSTRACT

Most programming languages support a call stack in the programming model and also in the runtime system.We show that for applications targeting low-power embedded microcontrollers (MCUs), RAM usage can be significantly decreased by partially or completely eliminating the runtime callstack. We present flattening, a transformation that absorbs a function into its caller, replacing function invocations and returns with jumps. Unlike inlining, flattening does not duplicate the bodies of functions that have multiple callsites. Applied aggressively, flattening results in stack elimination. Flattening is most useful in conjunction with a lifting transformation that moves global variables into a local scope.

Flattening and lifting can save RAM. However, even more benefit can be obtained by adapting the compiler to cope with properties of flattened code. First, we show that flattening adds false paths that confuse a standard live variables analysis. The resulting problems can be mitigated by breaking spurious live-range conflicts between variables using information from the unflattened callgraph. Second, we show that the impact of high register pressure due to flattened and lifted code, and consequent spills out of the register allocator, can be mitigated by improving a compiler's stack layout optimizations. We have implemented both of these improvements in GCC, and have implemented flattening and lifting as source-to-source transformations. On a collection of applications for the AVR family of 8-bit MCUs, we show that total RAM usage can be reduced by 20% by compiling flattened and lifted programs with our improved GCC.

References

  1. Martn Abadi, Mihai Budiu, Úlfar Erlingsson, and Jay Ligatti. Control flow integrity: Principles, implementations, and applications. In Proc. of the 12th ACM Conf. on Computer and Communications Security (CCS), Alexandria, VA, November 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. C. Scott Ananian and Martin Rinard. Data size optimizations for Java programs. In Proc. of the 2003 Conf. on Languages, Compilers, and Tools for Embedded Systems (LCTES), pages 59--68, San Diego, CA, June 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. Andrew W. Appel. Compiling with Continuations. Cambridge University Press, Cambridge, MA, 1992. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. Lan S. Bai, Lei Yang, and Robert P. Dick. Automated compile-time and run-time techniques to increase usable memory in MMU-less embedded systems. In Proc. of the Intl. Conf. on Compilers, Architecture, and Synthesis for Embedded Systems (CASES), Seoul, Korea, October 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. Volker Barthelmann. Inter-task register-allocation for static operating systems. In Proc. of the Joint Conf. on Languages, Compilers, and Tools for Embedded Systems (LCTES) and Software and Compilers for Embedded Systems (SCOPES), pages 149--154, Berlin, Germany, June 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. Surupa Biswas, Matthew Simpson, and Rajeev Barua. Memory overflow protection for embedded systems using run-time checks, reuse and compression. In Proc. of the Conf. on Compilers, Architecture, and Synthesis for Embedded Systems (CASES), Washington, DC, September 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. Dennis Brylow, Niels Damgaard, and Jens Palsberg. Static checking of interrupt-driven software. In Proc. of the 23rd Intl. Conf. on Software Engineering (ICSE), pages 47--56, Toronto, Canada, May 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. Dominique Chanet, Bjorn De Sutter, Bruno De Bus, Ludo Van Put, and Koen De Bosschere. System-wide compaction and specialization of the Linux kernel. In Proc. of the 2005 Conf. on Languages, Compilers, and Tools for Embedded Systems (LCTES), Chicago, IL, June 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Trishul M. Chilimbi, Bob Davidson, and James R. Larus. Cache-conscious structure definition. In Proc. of the ACM SIGPLAN 1999 Conf. on Programming Language Design and Implementation (PLDI), pages 13--24, Atlanta, GA, May 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. Keith D. Cooper, Mary W. Hall, and Linda Torczon. An experiment with inline substitution. Software-Practice and Experience, 21(6):581--601, June 1991. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Keith D. Cooper, Mary W. Hall, and Ken Kennedy. A methodology for procedure cloning. Computer Languages, 19(2):105--118, April 1993.Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Nathan Cooprider and John Regehr. Offline compression for on-chip RAM. In Proc. of the ACM SIGPLAN 2007 Conf. on Programming Language Design and Implementation (PLDI), pages 363--372, San Diego, CA, June 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. Jakob Engblom. Static properties of commercial embedded real-time programs, and their implication for worst-case execution time analysis. In Proc. of the 5th IEEE Real-Time Technology and Applications Symp. (RTAS), Vancouver, Canada, June 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. Aurélien Francillon and Claude Castelluccia. Code injection attacks on Harvard--architecture devices. In Proc. of the 15th ACM Conf. on Computer and Communications Security (CCS), Alexandria, VA, October 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. Jack Ganssle. The Art of Designing Embedded Systems. Newnes, 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. David Gay, Phil Levis, Robert von Behren, Matt Welsh, Eric Brewer, and David Culler. The nesC language: A holistic approach to networked embedded systems. In Proc. of the ACM SIGPLAN 2003 Conf. on Programming Language Design and Implementation (PLDI), pages 1--11, San Diego, CA, June 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. Dirk Grunwald and Richard Neves. Whole-program optimization for time and space efficient threads. In Proc. of the 7th Intl. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS), pages 50--59, Cambridge, MA, October 1996. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Håkon A. Hjortland. Laser video projector. http://heim.ifi.uio.no/haakoh/avr/.Google ScholarGoogle Scholar
  19. Chris Lattner and Vikram Adve. Transparent pointer compression for linked data structures. In Proc. of the ACM Workshop on Memory System Performance (MSP), Chicago, IL, June 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. Vladimir N. Makarov. The integrated register allocator for GCC. In Proc. of the GCC Developers' Summit, pages 77--90, Ottawa, Canada, July 2007. URL http://ols.fedoraproject.org/GCC/Reprints-2007/makarov-reprint.pdf.Google ScholarGoogle Scholar
  21. William P. McCartney and Nigamanth Sridhar. Abstractions for safe concurrent programming in networked embedded systems. In Proc. of the 4th ACM Conf. on Embedded Networked Sensor Systems (SenSys 2006), pages 167--180, Boulder, Colorado, October 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. Bhuvan Middha, Matthew Simpson, and Rajeev Barua. MTSS: Multi task stack sharing for embedded systems. In Proc. of the Intl. Conf. on Compilers, Architecture, and Synthesis for Embedded Systems (CASES), San Francisco, CA, September 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. George C. Necula, Scott McPeak, S. P. Rahul, and Westley Weimer. CIL: Intermediate language and tools for analysis and transformation of C programs. In Proc. of the Intl. Conf. on Compiler Construction (CC), pages 213--228, Grenoble, France, April 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. Ozcan Ozturk, Mahmut Kandemir, and Mary Jane Irwin. Increasing on-chip memory space utilization for embedded chip multiprocessors through data compression. In Proc. of the 3rd Intl. Conf. on Hardware/Software Codesign and System Synthesis (CODES+ISSS), pages 87--92, Jersey City, NJ, September 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. Paparazzi. The Paparazzi project, 2006. http://www.nongnu.org/paparazzi.Google ScholarGoogle Scholar
  26. Rodric M. Rabbah and Krishna V. Palem. Data remapping for design space optimization of embedded memory systems. ACM Trans. Embedded Computing Systems (TECS), 2(2):1--32, May 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. John Regehr, Alastair Reid, and Kirk Webb. Eliminating stack overflow by abstract interpretation. In Proc. of the 3rd Intl. Conf. on Embedded Software (EMSOFT), pages 306--322, Philadelphia, PA, October 2003.Google ScholarGoogle ScholarCross RefCross Ref
  28. TinyOS. tinyos.net. http://www.tinyos.net.Google ScholarGoogle Scholar
  29. Ben L. Titzer. Virgil: Objects on the head of a pin. In Proc. of the ACM Conf. on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA), pages 191--208, Portland, OR, October 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. Ben L. Titzer, Daniel Lee, and Jens Palsberg. Avrora: Scalable sensor network simulation with precise timing. In Proc. of the 4th Intl. Conf. on Information Processing in Sensor Networks (IPSN), pages 44--53, Los Angeles, CA, April 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. Christian Wawersich, Michael Stilkerich, and Wolfgang Schröder-Preikschat. An OSEK/VDX-based Multi-JVM for automotive appliances. In Proc. of the Intl. Embedded Systems Symposium, Irvine, CA, 2007.Google ScholarGoogle Scholar
  32. Lei Yang, Robert P. Dick, Haris Lekatsas, and Srimat Chakradhar. On-line memory compression for embedded systems. ACM Trans. Embedded Computing Systems (TECS), 2006.Google ScholarGoogle Scholar
  33. Youtao Zhang and Rajiv Gupta. Compressing heap data for improved memory performance. Software--Practice and Experience, 36(10):1081--1111, August 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Eliminating the call stack to save RAM

        Recommendations

        Comments

        Login options

        Check if you have access through your login credentials or your institution to get full access on this article.

        Sign in
        • Published in

          cover image ACM Conferences
          LCTES '09: Proceedings of the 2009 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems
          June 2009
          188 pages
          ISBN:9781605583563
          DOI:10.1145/1542452
          • cover image ACM SIGPLAN Notices
            ACM SIGPLAN Notices  Volume 44, Issue 7
            LCTES '09
            July 2009
            176 pages
            ISSN:0362-1340
            EISSN:1558-1160
            DOI:10.1145/1543136
            Issue’s Table of Contents

          Copyright © 2009 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 19 June 2009

          Permissions

          Request permissions about this article.

          Request Permissions

          Check for updates

          Qualifiers

          • research-article

          Acceptance Rates

          LCTES '09 Paper Acceptance Rate18of81submissions,22%Overall Acceptance Rate116of438submissions,26%

        PDF Format

        View or Download as a PDF file.

        PDF

        eReader

        View online with eReader.

        eReader