DOI: 10.1145/1134760.1134771
Article

A new approach to real-time checkpointing

Published: 14 June 2006

Abstract

The progress towards programming methodologies that simplify the work of the programmer involves automating, whenever possible, activities that are secondary to the main task of designing algorithms and developing applications. Automatic memory management, using garbage collection, and automatic persistence, using checkpointing, are both examples of mechanisms that operate behind the scenes, simplifying the work of the programmer. Implementing such mechanisms in the presence of real-time constraints, however, is particularly difficult.

In this paper we review the behavior of traditional copy-on-write implementations of checkpointing in the context of real-time systems, and we show how such implementations may, in pathological cases, seriously impair the ability of the user code to meet its deadlines. We discuss the source of the problem, supply benchmarks, and discuss possible remedies. We subsequently propose a novel approach that does not rely on copy-on-write and that, while more expensive in terms of CPU time overhead, is unaffected by pathological user code. We also describe our implementation of the proposed solution, based on the Ovm RTSJ Java Virtual Machine, and we discuss our experimental results.

Information

Published In

VEE '06: Proceedings of the 2nd international conference on Virtual execution environments
June 2006
194 pages
ISBN:1595933328
DOI:10.1145/1134760
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Java
  2. checkpoint
  3. real-time
  4. virtual machine

Qualifiers

  • Article

Conference

VEE06

Acceptance Rates

Overall Acceptance Rate 80 of 235 submissions, 34%

Cited By

  • (2018) Eliminating object reference checks by escape analysis on real-time Java virtual machine. Cluster Computing. DOI: 10.1007/s10586-018-2145-8. Online publication date: 27-Feb-2018
  • (2016) Leveraging Managed Runtime Systems to Build, Analyze, and Optimize Memory Graphs. ACM SIGPLAN Notices 51:7, pp. 131-143. DOI: 10.1145/3007611.2892253. Online publication date: 25-Mar-2016
  • (2016) Leveraging Managed Runtime Systems to Build, Analyze, and Optimize Memory Graphs. Proceedings of the 12th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments, pp. 131-143. DOI: 10.1145/2892242.2892253. Online publication date: 25-Mar-2016
  • (2016) Evaluating Online Global Recovery with Fenix Using Application-Aware In-Memory Checkpointing Techniques. 2016 45th International Conference on Parallel Processing Workshops (ICPPW), pp. 346-355. DOI: 10.1109/ICPPW.2016.56. Online publication date: Aug-2016
  • (2015) Supporting fault-tolerance in a compositional real-time scheduling framework. ACM SIGBED Review 12:2, pp. 7-15. DOI: 10.1145/2782753.2782754. Online publication date: 20-May-2015
  • (2012) Combining Partial Redundancy and Checkpointing for HPC. Proceedings of the 2012 IEEE 32nd International Conference on Distributed Computing Systems, pp. 615-626. DOI: 10.1109/ICDCS.2012.56. Online publication date: 18-Jun-2012
  • (2008) Anti-DDoS Virtualized Operating System. Proceedings of the 2008 Third International Conference on Availability, Reliability and Security, pp. 667-674. DOI: 10.1109/ARES.2008.120. Online publication date: 4-Mar-2008
  • (2006) Schedulable persistence system for real-time applications in virtual machine. Proceedings of the 6th ACM & IEEE International conference on Embedded software, pp. 195-204. DOI: 10.1145/1176887.1176916. Online publication date: 22-Oct-2006
