DOI: 10.1145/1529282.1529657
research-article

FlashBox: a system for logging non-deterministic events in deployed embedded systems

Published: 08 March 2009

Abstract

The ability to perform postmortem analysis of failures caused by non-deterministic events in deployed systems is useful in crash investigations. With this goal in mind, we propose FlashBox, a system that acts as a black box for embedded systems by recording non-deterministic events (interrupts). The FlashBox hardware consists of a microcontroller and flash memory. The FlashBox software is an extension to a compiler that enables recording at various granularities. FlashBox requires no source code modifications and makes no assumptions about processor capabilities such as hardware counters. The FlashBox log can be used for faithful replay, with the goal of isolating faults and reasoning about failures.
We present a prototype implementation of FlashBox that logs non-deterministic events on an AVR ATmega169 microcontroller. The FlashBox prototype consists of an 8051 microcontroller with flash memory. The avr-gcc compiler has been extended to log non-deterministic events. In our experiments, FlashBox incurs 10-23% overhead while providing the capability to log non-deterministic events at instruction-level granularity. With the decreasing cost of flash memory, FlashBox provides a low-cost logging mechanism. The use of standard I/O communication protocols enhances portability and eases integration with different classes of embedded systems.
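
The abstract does not describe FlashBox's log format or its compiler instrumentation, so the following is only a minimal sketch of the general record-and-replay idea, simulated in Python on a host rather than on the AVR target. It assumes a software step counter stands in for the missing hardware counter, and that a log entry is a pair of counter value and interrupt number. All names (`record`, `replay`, `main_step`, `timer_isr`, `HANDLERS`) are hypothetical and do not come from the paper.

```python
"""Record/replay of interrupt timing with a software step counter (illustrative only)."""
from typing import Callable, Dict, List, Tuple

# Assumed log entry layout: (step counter value, interrupt number).
LogEntry = Tuple[int, int]


def main_step(state: Dict[str, int]) -> None:
    # Stand-in for one instrumented instruction of the application.
    state["acc"] = (state["acc"] * 3 + 1) % 256


def timer_isr(state: Dict[str, int]) -> None:
    # Stand-in for an interrupt handler that touches shared state.
    state["acc"] ^= 0x55


# Hypothetical interrupt numbering: interrupt 0 is the timer.
HANDLERS: Dict[int, Callable[[Dict[str, int]], None]] = {0: timer_isr}


def record(n_steps: int, arrivals: Dict[int, int]) -> Tuple[int, List[LogEntry]]:
    """Run with 'live' interrupts; arrivals (step -> irq) models the environment."""
    state = {"acc": 0}
    log: List[LogEntry] = []
    for step in range(n_steps):
        if step in arrivals:
            irq = arrivals[step]
            log.append((step, irq))  # only the timing and identity are logged
            HANDLERS[irq](state)
        main_step(state)
    return state["acc"], log


def replay(n_steps: int, log: List[LogEntry]) -> int:
    """Re-run without the environment, injecting each logged interrupt at its step."""
    state = {"acc": 0}
    pending = dict(log)  # step -> irq
    for step in range(n_steps):
        if step in pending:
            HANDLERS[pending[step]](state)
        main_step(state)
    return state["acc"]
```

Calling `record(10, {3: 0, 7: 0})` returns a final value and a two-entry log; passing that log to `replay(10, log)` reproduces the same final value, whereas replaying with an empty log does not. The point of the sketch is that the program itself is deterministic, so logging only when each interrupt arrived is enough to reconstruct the run.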

Published In

SAC '09: Proceedings of the 2009 ACM symposium on Applied Computing
March 2009
2347 pages
ISBN:9781605581668
DOI:10.1145/1529282

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

SAC09: The 2009 ACM Symposium on Applied Computing
March 8 - March 12, 2009
Honolulu, Hawaii

Acceptance Rates

Overall Acceptance Rate 1,650 of 6,669 submissions, 25%

Cited By

  • (2020) Software-Based Monitoring and Analysis of a USB Host Controller Subject to Electrostatic Discharge. 2020 CSI/CPSSI International Symposium on Real-Time and Embedded Systems and Technologies (RTEST), pages 1-7. DOI: 10.1109/RTEST49666.2020.9140117. Online publication date: Jun-2020
  • (2020) New Frontiers in IoT: Networking, Systems, Reliability, and Security Challenges. IEEE Internet of Things Journal, 7:12, pages 11330-11346. DOI: 10.1109/JIOT.2020.3007690. Online publication date: Dec-2020
  • (2018) Secure Data Recording and Bio-Inspired Functional Integrity for Intelligent Robots. 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 8723-8728. DOI: 10.1109/IROS.2018.8593994. Online publication date: Oct-2018
  • (2017) The Case for an Ethical Black Box. Towards Autonomous Robotic Systems, pages 262-273. DOI: 10.1007/978-3-319-64107-2_21. Online publication date: 20-Jul-2017
  • (2016) A Reference Model for Monitoring IoT WSN-Based Applications. Sensors, 16:11, article 1816. DOI: 10.3390/s16111816. Online publication date: 30-Oct-2016
  • (2016) Soft Failure Mechanisms and PCB Design Measures. System Level ESD Co-Design, pages 169-233. DOI: 10.1002/9781118861899.ch6. Online publication date: 13-May-2016
  • (2015) TARDIS. Proceedings of the 14th International Conference on Information Processing in Sensor Networks, pages 286-297. DOI: 10.1145/2737095.2737096. Online publication date: 13-Apr-2015
  • (2012) SPI-SNOOPER. Proceedings of the eighth IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis, pages 53-62. DOI: 10.1145/2380445.2380460. Online publication date: 7-Oct-2012
  • (2012) A Classification of the Debugging Techniques of Wireless Sensor Networks. Proceedings of the 2012 International Conference on Advances in Computing and Communications, pages 51-57. DOI: 10.1109/ICACC.2012.12. Online publication date: 9-Aug-2012
  • (2012) Tracing and recording interrupts in embedded software. Journal of Systems Architecture: the EUROMICRO Journal, 58:9, pages 372-385. DOI: 10.1016/j.sysarc.2012.06.003. Online publication date: 1-Oct-2012
