research-article

Scaling ant colony optimization with hierarchical reinforcement learning partitioning

Authors:

Gilbert L. PetersonAuthors Info & Claims

GECCO '08: Proceedings of the 10th annual conference on Genetic and evolutionary computation

Pages 25 - 32

https://doi.org/10.1145/1389095.1389100

Published: 12 July 2008 Publication History

Abstract

This paper merges hierarchical reinforcement learning (HRL) with ant colony optimization (ACO) to produce a HRL ACO algorithm capable of generating solutions for large domains. This paper describes two specific implementations of the new algorithm: the first a modification to Dietterich's MAXQ-Q HRL algorithm, the second a hierarchical ant colony system algorithm. These implementations generate faster results, with little to no significant change in the quality of solutions for the tested problem domains. The application of ACO to the MAXQ-Q algorithm replaces the reinforcement learning, Q-learning, with the modified ant colony optimization method, Ant-Q. This algorithm, MAXQ-AntQ, converges to solutions not significantly different from MAXQ-Q in 88% of the time. This paper then transfers HRL techniques to the ACO domain and traveling salesman problem (TSP). To apply HRL to ACO, a hierarchy must be created for the TSP. A data clustering algorithm creates these subtasks, with an ACO algorithm to solve the individual and complete problems. This paper tests two clustering algorithms, k-means and G-means. The results demonstrate the algorithm with data clustering produces solutions 20 times faster with 5-10% decrease in solution quality due to the effects of clustering.

References

[1]

A. G. Barto and S. Mahadevan. Recent advances in hierarchical reinforcement learning. Discrete Event Dynamic Systems, 13(1-2):41--77, 2003.

Digital Library

[2]

E. Bonabeau, M. Dorigo, and G. Theraulaz. Swarm Intelligence: From Natural to Artificial Systems. Oxford University Press, Inc., 198 Madison Avenue, New York, New York 10016, 1999.

Digital Library

[3]

T. G. Dietterich. Hierarchical reinforcement learning with the maxq value function decomposition. Journal of Artificial Intelligence, (13):227--303, Nov. 2000.

Digital Library

[4]

M. Dorigo, M. Birattari, and T. Stutzle. Ant colony optimization. Computational Intellligence Magazine, IEEE, 1(4):28--39, 2006.

Digital Library

[5]

M. Dorigo and L. M. Gambardella. Ant colony system: A cooperative learning approach to the traveling salesman problem. IEEE Transactions on Evolutionary Computation, 1(1):53--66, April 1997.

Digital Library

[6]

L. M. Gambardella and M. Dorigo. Ant-q: A reinforcement learning approach to the traveling salesman problem. In International Conference on Machine Learning, pages 252--260, 1995.

[7]

G. Hamerly and C. Elkan. Learning the k in k-means. In Proceedings of the Seventeenth Annual Conference on Neural Information Processing Systems (NIPS), pages 281--288, 2003.

[8]

T. Jaakkola, M. I. Jordan, and S. P. Singh. On the convergence of stochastic iterative dynamic programming algorithms. Technical report, Cambridge, MA, USA, 1993.

Digital Library

[9]

J. B. MacQueen. Some methods for classification and analysis of multivariate observations. In Proceedings of the Fifth Symposium on Mathematical Statistics and Probabilities, pages 281--297, 1967.

[10]

R. Parr and S. Russell. Reinforcement learning with hierarchies of machines, Jan. 17 1997.

[11]

D. Pelleg and A. Moore. X-means: Extending k-means with e±cient estimation of the number of clusters. ICML 2000, 2000.

Digital Library

[12]

G. Reinelt. Tsplib 95, 2007.

[13]

W. D. Smart. Explicit manifold representations for value-functions in reinforcement learning. In Proceedings of the Eighth International Symposium on Artificial Intelligence and Mathematics, January 2004. Paper number AI&M 25-2004.

[14]

T. Stutzle and H. Hoos. The max-min ant system and local search for combinatorial optimization problems, 1999.

[15]

R. Sutton and A. G. Barto. Introduction to Reinforcement Learning. MIT Press, Cambridge, MA, 1998.

Digital Library

[16]

R. S. Sutton, D. Precup, and S. Singh. Between mdps and semi-mdps: a framework for temporal abstraction in reinforcement learning. Artif. Intell., 112(1-2):181--211, 1999.

Digital Library

[17]

C. Walshaw. A Multilevel Approach to the Travelling Salesman Problem. Tech. Rep. 00/IM/63, Comp. Math. Sci., Univ. Greenwich, London SE10 9LS, UK, August 2000.

[18]

C. Watkins. Learning From Delayed Rewards. PhD thesis, King's College, Oxford, May 1989.

Cited By

Mishra BDehuri SCho S(2015)Swarm Intelligence in Multiple and Many Objectives Optimization: A Survey and Topical Study on EEG Signal AnalysisMulti-objective Swarm Intelligence10.1007/978-3-662-46309-3_2(27-73)Online publication date: 11-Mar-2015
https://doi.org/10.1007/978-3-662-46309-3_2
Peterson GMayer CCousin K(2011)WoLF Ant2011 IEEE Congress of Evolutionary Computation (CEC)10.1109/CEC.2011.5949726(995-1002)Online publication date: Jun-2011
https://doi.org/10.1109/CEC.2011.5949726
Kamei KIshikawa M(2010)Adaptive decision making in ant colony system by reinforcement learningProceedings of the 17th international conference on Neural information processing: theory and algorithms - Volume Part I10.5555/1939659.1939738(609-617)Online publication date: 22-Nov-2010
https://dl.acm.org/doi/10.5555/1939659.1939738
Show More Cited By

Index Terms

Scaling ant colony optimization with hierarchical reinforcement learning partitioning
1. Computing methodologies
  1. Artificial intelligence
    1. Search methodologies
      1. Heuristic function construction
  2. Machine learning

Recommendations

Extended trail reinforcement strategies for ant colony optimization
SEMCCO'11: Proceedings of the Second international conference on Swarm, Evolutionary, and Memetic Computing - Volume Part I

Ant colony optimization (ACO) is a metaheuristic inspired by the foraging behavior of biological ants that was successfully applied for solving computationally hard problems. The fundamental idea that drives the ACO is the usage of pheromone trails for ...
Hybridization strategies for continuous ant colony optimization and particle swarm optimization applied to data clustering

Ant colony optimization (ACO) and particle swarm optimization (PSO) are two popular algorithms in swarm intelligence. Recently, a continuous ACO named ACOR was developed to solve the continuous optimization problems. This study incorporated ACOR with ...
An improved ant colony optimization algorithm with strengthened pheromone updating mechanism for constraint satisfaction problem

Constraint satisfaction problem (CSP) is a fundamental problem in the field of constraint programming. To tackle this problem more efficiently, an improved ant colony optimization algorithm is proposed. In order to further improve the convergence speed ...

Comments

Information & Contributors

Information

Published In

cover image ACM Conferences

GECCO '08: Proceedings of the 10th annual conference on Genetic and evolutionary computation

July 2008

1814 pages

ISBN:9781605581309

DOI:10.1145/1389095

Conference Chair:
Conor Ryan
University of Limerick, Ireland
,
Editor:
Maarten Keijzer
Chordiant Software International B.V., The Netherlands

Copyright © 2008 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 12 July 2008

Permissions

Request permissions for this article.

Request Permissions

Check for updates

Author Tags

Qualifiers

Research-article

Conference

GECCO08

Sponsor:

GECCO08: Genetic and Evolutionary Computation Conference

July 12 - 16, 2008

GA, Atlanta, USA

Acceptance Rates

Overall Acceptance Rate 1,669 of 4,410 submissions, 38%

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

4
Total Citations
View Citations
414
Total Downloads

Downloads (Last 12 months)5
Downloads (Last 6 weeks)1

Reflects downloads up to 15 Feb 2025

Other Metrics

View Author Metrics

Citations

Cited By

Mishra BDehuri SCho S(2015)Swarm Intelligence in Multiple and Many Objectives Optimization: A Survey and Topical Study on EEG Signal AnalysisMulti-objective Swarm Intelligence10.1007/978-3-662-46309-3_2(27-73)Online publication date: 11-Mar-2015
https://doi.org/10.1007/978-3-662-46309-3_2
Peterson GMayer CCousin K(2011)WoLF Ant2011 IEEE Congress of Evolutionary Computation (CEC)10.1109/CEC.2011.5949726(995-1002)Online publication date: Jun-2011
https://doi.org/10.1109/CEC.2011.5949726
Kamei KIshikawa M(2010)Adaptive decision making in ant colony system by reinforcement learningProceedings of the 17th international conference on Neural information processing: theory and algorithms - Volume Part I10.5555/1939659.1939738(609-617)Online publication date: 22-Nov-2010
https://dl.acm.org/doi/10.5555/1939659.1939738
Kamei KIshikawa M(2010)Adaptive Decision Making in Ant Colony System by Reinforcement LearningNeural Information Processing. Theory and Algorithms10.1007/978-3-642-17537-4_74(609-617)Online publication date: 2010
https://doi.org/10.1007/978-3-642-17537-4_74

View Options

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Figures

Tables

Media

View Table of Conten