DOI: 10.1145/2889443.2889453
short-paper

Fault tolerance with aspects: a feasibility study

Published: 14 March 2016

ABSTRACT

To enable correct program execution on unreliable hardware, software can be made fault-tolerant by adding program statements or machine instructions for fault detection and recovery. Manually modifying programs does not scale, and extending compilers to emit additional machine instructions lacks flexibility. However, since software-implemented hardware fault tolerance (SIHFT) can be understood as a cross-cutting concern, we propose aspect-oriented programming as a suitable implementation technique. We prove this proposition by implementing an AN encoder based on AspectC++. In terms of performance and fault coverage, we achieve comparable results to existing compiler-based solutions.
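The idea behind an AN code can be illustrated with a minimal sketch. This is not the authors' AspectC++ implementation: the constant `A` and the helper names `an_encode`, `an_valid`, `an_decode`, and `an_add` are hypothetical and chosen for illustration only. Every value `n` is stored as the code word `n * A`, so a valid code word is always a multiple of `A`, and a corrupted word that is no longer a multiple of `A` is detected on decoding.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical encoding constant for this sketch. Any odd A > 1 detects
// every single bit flip, because A never divides a power of two.
constexpr std::int64_t A = 58659;

// Encode a 32-bit value into a 64-bit code word (n * A cannot overflow).
constexpr std::int64_t an_encode(std::int32_t n) {
    return static_cast<std::int64_t>(n) * A;
}

// A code word is valid only if it is a multiple of A.
constexpr bool an_valid(std::int64_t code) {
    return code % A == 0;
}

// Decode a code word; an empty result signals a detected fault.
inline std::optional<std::int32_t> an_decode(std::int64_t code) {
    if (!an_valid(code)) {
        return std::nullopt;
    }
    return static_cast<std::int32_t>(code / A);
}

// Addition works directly on code words: a*A + b*A == (a + b)*A.
constexpr std::int64_t an_add(std::int64_t lhs, std::int64_t rhs) {
    return lhs + rhs;
}
```

In this sketch, addition never leaves the encoded domain, so a fault that strikes an operand or the sum is still visible when the result is finally decoded. An aspect-based encoder would presumably weave such checks around accesses to protected data rather than require calls like these by hand.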

        Reviews

        Scott Arthur Moody

        Writing programs to manage and mitigate faults is a complex and usually custom process. Are there approaches that help one easily create a correct and reliable program that runs on unreliable hardware? This paper reports on how fault tolerance can be supported more seamlessly by adopting aspect-oriented programming (AOP). AOP is applied to software-implemented hardware fault tolerance (SIHFT), a low-cost approach. Since SIHFT code is both declarative and separated from the program code, AOP becomes an attractive approach. The authors describe how AOP has four parts: definitions (core concerns), aspects (cross-cutting concerns), pointcuts, and advice functions. These are used both to add annotations to code and to join (or weave) existing compilation units with specific aspect code. AOP tools like AspectC++ are discussed as examples, with supporting code fragments. By applying AOP to building fault-tolerant programs, the cross-cutting concerns that deal directly with faults can be compiled (or woven) into the resulting program more easily. The paper also shows how automatic metric collection might be one of those concerns, where AOP helps gather the fault coverage data used when evaluating any performance penalties. For example, AOP can be used in large programs by injecting thousands of faults and exploring how "tolerant" the resulting program is. While the authors show results on performance, they acknowledge that more experiments are needed. They also identify some downsides, such as code weaving operating at a coarser level than many would like, which gives up some desired control over the generated code.

        Online Computing Reviews Service
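The fault-injection idea mentioned in the review can be sketched in a few lines. This is a toy model under stated assumptions, not the paper's experimental setup: it flips one random bit in an AN-encoded word and counts how often the corruption is detected. The constant `A`, the `CampaignResult` struct, and the `run_campaign` function are hypothetical names introduced here.

```cpp
#include <cstdint>
#include <random>

// Hypothetical encoding constant: a valid code word is a multiple of A.
constexpr std::int64_t A = 58659;

// Outcome of a fault-injection campaign.
struct CampaignResult {
    int injected = 0;  // number of faults injected
    int detected = 0;  // faults caught by the "multiple of A" check
};

// Inject one single-bit flip per trial into a freshly encoded value.
inline CampaignResult run_campaign(int trials, unsigned seed) {
    std::mt19937 rng(seed);
    std::uniform_int_distribution<int> value(-1000000, 1000000);
    // Bits 0..62 only: the sign bit is left out so the shift below stays
    // well-defined in every C++ version.
    std::uniform_int_distribution<int> bit(0, 62);

    CampaignResult result;
    for (int i = 0; i < trials; ++i) {
        const std::int64_t code = static_cast<std::int64_t>(value(rng)) * A;
        const std::int64_t mask = std::int64_t{1} << bit(rng);
        const std::int64_t faulty = code ^ mask;  // simulated hardware fault
        ++result.injected;
        if (faulty % A != 0) {
            ++result.detected;  // the check exposes the corruption
        }
    }
    return result;
}
```

Because `A` is odd, a single bit flip changes the word by a power of two and can never produce another multiple of `A`, so this toy campaign detects every injected fault. Multi-bit faults can escape detection, which is one reason a real campaign injects faults into a running program and measures coverage empirically.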

        • Published in

          ACM Other conferences
          MODULARITY 2016: Proceedings of the 15th International Conference on Modularity
          March 2016
          145 pages
          ISBN:9781450339957
          DOI:10.1145/2889443

          Copyright © 2016 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Qualifiers

          • short-paper

          Acceptance Rates

          Overall Acceptance Rate: 41 of 139 submissions, 29%
