research-article

Optimizing time warp simulation with reinforcement learning techniques

Authors:

Carl Tropper

WSC '07: Proceedings of the 39th conference on Winter simulation: 40 years! The best is yet to come

Pages 577 - 584

Published: 09 December 2007

Abstract

Adaptive Time Warp protocols in the literature are usually based on a pre-defined analytic model of the system, expressed as a closed-form function that maps system state to a control parameter. The underlying assumption is that this model is itself optimal. In this paper we present a new approach that uses Reinforcement Learning techniques, also known as simulation-based dynamic programming. Instead of assuming an optimal control strategy, the goal of Reinforcement Learning is to find the optimal strategy through simulation. A value function that captures the history of system feedback is used, and no prior knowledge of the system is required. Our Reinforcement Learning techniques were implemented in a distributed VLSI simulator with the objective of finding the optimal size of a bounded time window. Our experiments using two benchmark circuits indicate that the approach succeeded in doing so.
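
As a rough illustration of the idea described in the abstract, the sketch below learns a value estimate for each candidate window size from repeated feedback and then picks the size with the highest estimate. It is not the authors' implementation: the epsilon-greedy action-value update, the candidate sizes, the parameters `epsilon` and `alpha`, and the `toy_reward` function (a stand-in for feedback from a real Time Warp simulator) are all assumptions made for illustration.

```python
import random


def learn_window_size(candidates, reward_fn, steps=2000, epsilon=0.1,
                      alpha=0.1, seed=0):
    """Epsilon-greedy action-value learning over candidate window sizes.

    Hypothetical sketch: each candidate keeps a running value estimate
    that summarizes the feedback observed so far for that window size.
    """
    rng = random.Random(seed)
    value = {w: 0.0 for w in candidates}
    for _ in range(steps):
        if rng.random() < epsilon:
            w = rng.choice(candidates)            # explore a random size
        else:
            w = max(candidates, key=value.get)    # exploit the best estimate
        r = reward_fn(w, rng)                     # feedback for this choice
        value[w] += alpha * (r - value[w])        # move estimate toward feedback
    return max(candidates, key=value.get), value


def toy_reward(window, rng):
    """Assumed feedback signal: best at window = 40, with small noise."""
    return -abs(window - 40) / 40.0 + rng.gauss(0.0, 0.05)


best, values = learn_window_size([10, 20, 40, 80, 160], toy_reward)
print(best, values)
```

No analytic model of the reward is given to the learner; it only sees sampled feedback, which mirrors the abstract's point that no prior knowledge of the system is required. In a real simulator the reward would have to come from measured quantities, and the choice of that measure is left open here.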

Information

Published In

WSC '07: Proceedings of the 39th conference on Winter simulation: 40 years! The best is yet to come

December 2007

2659 pages

ISBN: 1424413060

General Chair:
Jeff Tew

Sponsors

IIE: Institute of Industrial Engineers
INFORMS-SIM: Institute for Operations Research and the Management Sciences: Simulation Society
ASA: American Statistical Association
IEEE/SMC: Institute of Electrical and Electronics Engineers: Systems, Man, and Cybernetics Society
SIGSIM: ACM Special Interest Group on Simulation and Modeling
NIST: National Institute of Standards and Technology
(SCS): The Society for Modeling and Simulation International

Publisher

IEEE Press

Conference

WSC07

WSC07: Winter Simulation Conference

December 9 - 12, 2007

Washington D.C.

Acceptance Rates

WSC '07 Paper Acceptance Rate: 152 of 244 submissions, 62%

Overall Acceptance Rate: 3,413 of 5,075 submissions, 67%

Bibliometrics & Citations

Article Metrics

Total Citations: 4
Total Downloads: 139
Downloads (Last 12 months): 0
Downloads (Last 6 weeks): 0

Reflects downloads up to 04 Jan 2025

Cited By

Yadav A, Shrivastava S (2010). Evaluation of reinforcement learning techniques. Proceedings of the First International Conference on Intelligent Interactive Technologies and Multimedia, 88-92. DOI: 10.1145/1963564.1963578. Online publication date: 27-Dec-2010.
https://dl.acm.org/doi/10.1145/1963564.1963578

Meraji S, Zhang W, Tropper C (2010). On the scalability and dynamic load-balancing of optimistic gate level simulation. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 29:9, 1368-1380. DOI: 10.1109/TCAD.2010.2049044. Online publication date: 1-Sep-2010.
https://dl.acm.org/doi/10.1109/TCAD.2010.2049044

Wang J, Tropper C, Dunkin A, Ingalls R (2009). Using genetic algorithms to limit the optimism in time warp. Winter Simulation Conference, 1180-1188. DOI: 10.5555/1995456.1995620. Online publication date: 13-Dec-2009.
https://dl.acm.org/doi/10.5555/1995456.1995620

Wang J, Tropper C, Wainer G, Shaffer C, McGraw R, Chinni M (2009). Selecting GVT interval for time-warp-based distributed simulation using reinforcement learning technique. Proceedings of the 2009 Spring Simulation Multiconference, 1-7. DOI: 10.5555/1639809.1639860. Online publication date: 22-Mar-2009.
https://dl.acm.org/doi/10.5555/1639809.1639860
