A framework for fault-tolerance in HLA-based distributed simulations

Published: 04 December 2005

Abstract

The widespread use of simulation in future military systems depends, among other factors, on the degree of reuse and availability of simulation models. Simulation support in such systems must also cope with software or hardware failures. Research in fault-tolerant distributed simulation, especially in the context of the High Level Architecture (HLA), has been quite sparse, and the HLA standard itself does not cover fault tolerance extensively. This paper describes a framework, named Distributed Resource Management System (DRMS), for robust execution of federations. The implementation of the framework is based on Web Services and Semantic Web technology, and provides fundamental services and a consistent mechanism for describing the resources managed by the environment. To evaluate the proposed framework, a federation has been developed that uses a time-warp mechanism for synchronization. In this paper, we describe our approach to fault tolerance and give an example to illustrate how DRMS behaves when it faces faulty federates.

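Cited By

  • (2016) Fault-Tolerant Adaptive Parallel and Distributed Simulation. Proceedings of the 20th International Symposium on Distributed Simulation and Real-Time Applications, 37-44. DOI: 10.1109/DS-RT.2016.11. Online publication date: 21-Sep-2016
  • (2015) Autonomous Orchestration of Distributed Discrete Event Simulations in the Presence of Resource Uncertainty. ACM Transactions on Autonomous and Adaptive Systems 10:3, 1-20. DOI: 10.1145/2746345. Online publication date: 1-Sep-2015
  • (2010) A replication structure for efficient and fault-tolerant parallel and distributed simulations. Proceedings of the 2010 Spring Simulation Multiconference, 1-10. DOI: 10.1145/1878537.1878695. Online publication date: 11-Apr-2010
  • (2010) Federate Fault Tolerance in HLA-Based Simulation. Proceedings of the 2010 IEEE Workshop on Principles of Advanced and Distributed Simulation, 3-12. DOI: 10.1109/PADS.2010.5471663. Online publication date: 17-May-2010
  • (2006) Evaluation of a Fault-Tolerance Mechanism for HLA-Based Distributed Simulations. Proceedings of the 20th Workshop on Principles of Advanced and Distributed Simulation, 175-182. DOI: 10.1109/PADS.2006.18. Online publication date: 24-May-2006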

    Published In

    WSC '05: Proceedings of the 37th conference on Winter simulation
    December 2005
    2769 pages
    ISBN:0780395190

    Publisher

    Winter Simulation Conference

    Qualifiers

    • Article

    Acceptance Rates

    WSC '05 Paper Acceptance Rate 209 of 316 submissions, 66%;
    Overall Acceptance Rate 3,413 of 5,075 submissions, 67%
