ABSTRACT
To enable correct program execution on unreliable hardware, software can be made fault-tolerant by adding program statements or machine instructions for fault detection and recovery. Manually modifying programs does not scale, and extending compilers to emit additional machine instructions lacks flexibility. However, because software-implemented hardware fault tolerance (SIHFT) can be understood as a cross-cutting concern, we propose aspect-oriented programming as a suitable implementation technique. We support this proposition by implementing an AN encoder based on AspectC++. In terms of performance and fault coverage, our results are comparable to those of existing compiler-based solutions.
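For readers unfamiliar with AN encoding, the basic arithmetic can be sketched in plain C++. This is a minimal illustration and not the AspectC++ implementation described in the abstract: the constant `kA`, the function names, and the choice of 32-bit values stored in 64-bit code words are all assumptions made here. A value is encoded by multiplying it with a constant A, and any code word that is not a multiple of A indicates a fault.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical encoding constant A, chosen for illustration only.
// It is odd and greater than 1, so no power of two is a multiple of it.
constexpr std::uint64_t kA = 58659;

// Encode a 32-bit value as the code word A * value (fits in 64 bits).
constexpr std::uint64_t an_encode(std::uint32_t value) {
    return static_cast<std::uint64_t>(value) * kA;
}

// A code word is valid only if it is a multiple of A.
constexpr bool an_is_valid(std::uint64_t code) {
    return code % kA == 0;
}

// Decode a code word; returns std::nullopt if the check fails,
// which signals a detected fault.
std::optional<std::uint32_t> an_decode(std::uint64_t code) {
    if (!an_is_valid(code)) {
        return std::nullopt;
    }
    // Assumes the decoded result still fits into 32 bits.
    return static_cast<std::uint32_t>(code / kA);
}

// Addition works directly on code words: A*x + A*y == A*(x + y).
constexpr std::uint64_t an_add(std::uint64_t a, std::uint64_t b) {
    return a + b;
}

// Small self-check: a single flipped bit changes the code word by a
// power of two, which cannot be a multiple of the odd constant kA.
inline void an_self_check() {
    const std::uint64_t code = an_encode(7);
    assert(an_is_valid(code));
    assert(!an_is_valid(code ^ (1ull << 3)));
}
```

In an aspect-oriented setting, the calls to `an_encode` and `an_decode` would be woven around accesses to protected variables instead of being written by hand; that weaving step is not shown in this sketch.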
Index Terms
- Fault tolerance with aspects: a feasibility study