DOI: 10.5555/1218112.1218297
Article

Incremental checkpointing with application to distributed discrete event simulation

Published: 03 December 2006

Abstract

Checkpointing is widely used in robust, fault-tolerant applications. We present an efficient incremental checkpointing mechanism that records only state changes rather than the complete state. After a checkpoint is created, state changes are logged incrementally as records in memory, which the application can later use to roll back at any time. This incremental approach makes checkpointing fast: checkpoint creation and state recording each take only a small constant time. Rollback takes time linear in the number of recorded state changes, which is bounded by the number of state variables times the number of checkpoints. We implement a Java source transformer that automatically converts an existing application into a behavior-preserving one with checkpointing functionality. The transformation is application-independent and application-transparent, so a wide range of applications can benefit from this technique. So far, it has been used for distributed discrete event simulation based on the Time Warp technique.
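
The mechanism described in the abstract can be illustrated with a small hand-written sketch. This is not the authors' transformer or its generated code: the names `UndoLog`, `TrackedInt`, and `CheckpointDemo` are hypothetical, and the sketch assumes that each variable logs its old value only on the first write after a checkpoint, which is one way to obtain the stated bound of state variables times checkpoints on the number of records.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical in-memory undo log (illustration only, not the paper's code). */
final class UndoLog {
    /** One logged state change: the checkpoint it belongs to and how to undo it. */
    private static final class Record {
        final long checkpointId;
        final Runnable restore;
        Record(long checkpointId, Runnable restore) {
            this.checkpointId = checkpointId;
            this.restore = restore;
        }
    }

    private final Deque<Record> records = new ArrayDeque<>();
    private long current = 0; // 0 means no checkpoint has been created yet

    /** Constant time: creating a checkpoint only advances a counter. */
    long createCheckpoint() { return ++current; }

    long current() { return current; }

    int size() { return records.size(); }

    /** Constant time: push one record tagged with the current checkpoint. */
    void record(Runnable restore) { records.push(new Record(current, restore)); }

    /** Linear in the number of records undone; restores the state at checkpoint creation. */
    void rollback(long checkpointId) {
        if (checkpointId < 1 || checkpointId > current) {
            throw new IllegalArgumentException("unknown checkpoint " + checkpointId);
        }
        while (!records.isEmpty() && records.peek().checkpointId >= checkpointId) {
            records.pop().restore.run();
        }
        current = checkpointId - 1; // the checkpoint and all later ones are discarded
    }
}

/** A state variable that logs its old value on the first write after each checkpoint. */
final class TrackedInt {
    private final UndoLog log;
    private int value;
    private long lastLogged = -1; // checkpoint under which the old value was last logged

    TrackedInt(UndoLog log, int initial) {
        this.log = log;
        this.value = initial;
    }

    int get() { return value; }

    void set(int newValue) {
        long cp = log.current();
        if (cp > 0 && lastLogged != cp) {
            final int oldValue = value;
            final long oldLogged = lastLogged;
            // The record restores both the value and the logging bookkeeping.
            log.record(() -> { value = oldValue; lastLogged = oldLogged; });
            lastLogged = cp;
        }
        value = newValue;
    }
}

final class CheckpointDemo {
    public static void main(String[] args) {
        UndoLog log = new UndoLog();
        TrackedInt x = new TrackedInt(log, 1);
        long c1 = log.createCheckpoint();
        x.set(2);
        x.set(3);                    // second write under c1 adds no new record
        long c2 = log.createCheckpoint();
        x.set(4);
        log.rollback(c2);
        System.out.println(x.get()); // prints 3
        log.rollback(c1);
        System.out.println(x.get()); // prints 1
    }
}
```

In this sketch the calls to `record` are written by hand inside `set`. A source transformer such as the one the abstract describes would presumably insert equivalent bookkeeping automatically, so the application code itself would not need to change.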


Cited By

  • (2016) Rollback-based simulation for the design of continuous/discrete simulation tools. Proceedings of the Summer Computer Simulation Conference, 1-8. 10.5555/3015574.3015606. Online publication date: 24-Jul-2016
  • (2016) Introducing Triquetrum, A Possible Future for Kepler and Ptolemy II. Procedia Computer Science 80:C, 2449-2454. 10.1016/j.procs.2016.05.546. Online publication date: 1-Jun-2016
  • (2009) A fully distributed data collection method for HLA based distributed simulations. Proceedings of the 2009 Summer Computer Simulation Conference, 337-347. 10.5555/2349508.2349553. Online publication date: 13-Jul-2009

Published In

WSC '06: Proceedings of the 38th conference on Winter simulation
December 2006
2429 pages
ISBN:1424405017

Sponsors

  • IIE: Institute of Industrial Engineers
  • ASA: American Statistical Association
  • IEICE ESS: Institute of Electronics, Information and Communication Engineers, Engineering Sciences Society
  • IEEE-CS\DATC: The IEEE Computer Society
  • SIGSIM: ACM Special Interest Group on Simulation and Modeling
  • NIST: National Institute of Standards and Technology
  • (SCS): The Society for Modeling and Simulation International
  • INFORMS-CS: Institute for Operations Research and the Management Sciences-College on Simulation

Publisher

Winter Simulation Conference

Qualifiers

  • Article

Conference

WSC06: Winter Simulation Conference 2006
December 3 - 6, 2006
Monterey, California

Acceptance Rates

WSC '06 Paper Acceptance Rate 177 of 252 submissions, 70%;
Overall Acceptance Rate 3,413 of 5,075 submissions, 67%
