DOI: 10.5555/1218112.1218432

A reinforcement learning algorithm to minimize the mean tardiness of a single machine with controlled capacity

Published: 03 December 2006

Abstract

In this work, we consider the problem of scheduling arriving jobs on a single machine with the objective of minimizing mean tardiness. The scheduler has the option of halving a job's processing time by employing an extra worker at an additional per-job (setup) cost, and can also choose among a number of dispatching rules. To find a good policy for the scheduler, we implement a λ-SMART algorithm that performs on-line optimization of the studied system. The resulting policy is optimal only with respect to the chosen state representation and the available set of actions; nevertheless, we believe the developed policies are easy to implement and, as the numerical experiments show, would yield considerable savings.
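To make the abstract's approach concrete, the sketch below illustrates a SMART-style average-reward update with eligibility traces (a "λ-SMART" flavour): the temporal-difference error is r − ρτ + max_b R(s′, b) − R(s, a), and the average-reward rate ρ is re-estimated from greedy steps. The environment interface, the toy single-machine model, the reward proxy, and all parameter values here are illustrative assumptions, not the authors' simulator or implementation.

import random
from collections import defaultdict

def lambda_smart(env, episodes=50, steps_per_episode=500,
                 alpha=0.1, lam=0.7, explore=0.1, seed=0):
    """SMART-style average-reward learning with eligibility traces.

    `env` is any object exposing reset() -> state, actions(state) -> tuple,
    and step(state, action) -> (next_state, reward, sojourn_time).
    """
    rng = random.Random(seed)
    Q = defaultdict(float)              # relative action values R(s, a)
    rho = 0.0                           # estimate of the average reward rate
    cum_reward = cum_time = 0.0

    for _ in range(episodes):
        state = env.reset()
        trace = defaultdict(float)      # eligibility traces e(s, a)
        for _ in range(steps_per_episode):
            actions = env.actions(state)
            greedy = rng.random() >= explore
            if greedy:
                action = max(actions, key=lambda a: Q[(state, a)])
            else:
                action = rng.choice(actions)
                trace.clear()           # cut traces on exploratory moves

            next_state, reward, sojourn = env.step(state, action)
            next_best = max(Q[(next_state, a)] for a in env.actions(next_state))

            # Semi-Markov TD error: reward minus the average-reward "rent"
            # for the sojourn time, plus the best continuation value.
            td_error = reward - rho * sojourn + next_best - Q[(state, action)]

            trace[(state, action)] += 1.0
            for key in list(trace):
                Q[key] += alpha * td_error * trace[key]
                trace[key] *= lam

            if greedy:                  # update rho from greedy steps only
                cum_reward += reward
                cum_time += sojourn
                rho = cum_reward / max(cum_time, 1e-9)

            state = next_state
    return Q, rho


class ToySingleMachine:
    """Illustrative stand-in for the single-machine simulator.

    State: number of waiting jobs (capped at 5). Action 0 processes the
    next job at normal speed; action 1 hires the extra worker, halving
    the processing time at a fixed setup cost. The reward is a crude
    negative congestion/tardiness proxy, not the paper's model.
    """

    SETUP_COST = 2.0

    def __init__(self, seed=0):
        self.rng = random.Random(seed)

    def reset(self):
        return 0

    def actions(self, state):
        return (0, 1)

    def step(self, state, action):
        base = self.rng.uniform(1.0, 4.0)          # nominal processing time
        sojourn = base / 2.0 if action == 1 else base
        arrivals = self.rng.randint(0, 2)          # jobs arriving meanwhile
        next_state = min(5, max(0, state - 1) + arrivals)
        reward = -next_state * sojourn - (self.SETUP_COST if action == 1 else 0.0)
        return next_state, reward, sojourn


if __name__ == "__main__":
    Q, rho = lambda_smart(ToySingleMachine())
    print("estimated average reward rate:", round(rho, 3))

Cutting the traces on exploratory moves follows the Watkins-style treatment of Q(λ); the ρτ term, which charges "rent" for each decision's sojourn time, is what distinguishes this average-reward semi-Markov setting from ordinary discounted Q-learning.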



Published In

WSC '06: Proceedings of the 38th conference on Winter simulation
December 2006
2429 pages
ISBN: 1424405017

Sponsors

  • IIE: Institute of Industrial Engineers
  • ASA: American Statistical Association
  • IEICE ESS: Institute of Electronics, Information and Communication Engineers, Engineering Sciences Society
  • IEEE-CS\DATC: The IEEE Computer Society
  • SIGSIM: ACM Special Interest Group on Simulation and Modeling
  • NIST: National Institute of Standards and Technology
  • SCS: The Society for Modeling and Simulation International
  • INFORMS-CS: Institute for Operations Research and the Management Sciences-College on Simulation

Publisher

Winter Simulation Conference



Qualifiers

  • Article

Conference

WSC06
WSC06: Winter Simulation Conference 2006
December 3-6, 2006
Monterey, California

Acceptance Rates

WSC '06 Paper Acceptance Rate: 177 of 252 submissions (70%)
Overall Acceptance Rate: 3,413 of 5,075 submissions (67%)

