ABSTRACT
We introduce flexible algorithms that automatically learn mappings from images to actions by interacting with their environment. They place an image classifier in front of a Reinforcement Learning algorithm. The classifier partitions the visual space according to the presence or absence of highly informative local descriptors, and it is refined incrementally by selecting new local descriptors whenever perceptual aliasing is detected. This reduces the visual input domain to a size manageable by Reinforcement Learning, which permits learning direct percept-to-action mappings. Experimental results on a continuous visual navigation task illustrate the applicability of the framework.
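The partitioning idea can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each image has already been reduced to a set of descriptor labels, and the names `FeatureClassifier`, `classify`, and `refine` are hypothetical. How perceptual aliasing is detected and how the new descriptor is chosen are left out, since the abstract does not specify them.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass
class Node:
    """Binary tree node: internal nodes test one descriptor, leaves hold a class."""
    feature: Optional[str] = None      # descriptor tested here (None on a leaf)
    present: Optional["Node"] = None   # subtree for images containing the descriptor
    absent: Optional["Node"] = None    # subtree for images lacking it
    class_id: Optional[int] = None     # perceptual class (leaves only)


class FeatureClassifier:
    """Maps an image (a set of descriptor labels) to a perceptual class index."""

    def __init__(self) -> None:
        # Initially every image falls into a single class.
        self.root = Node(class_id=0)
        self.n_classes = 1

    def classify(self, features: FrozenSet[str]) -> int:
        node = self.root
        while node.feature is not None:
            node = node.present if node.feature in features else node.absent
        return node.class_id

    def _find(self, node: Node, class_id: int) -> Optional[Node]:
        # Depth-first search for the leaf that carries class_id.
        if node.feature is None:
            return node if node.class_id == class_id else None
        return self._find(node.present, class_id) or self._find(node.absent, class_id)

    def refine(self, class_id: int, feature: str) -> int:
        """Split an aliased class on the presence of `feature`.

        Images lacking the descriptor keep the old class index; images
        containing it receive a new index, which is returned.
        """
        leaf = self._find(self.root, class_id)
        if leaf is None:
            raise ValueError(f"unknown class {class_id}")
        new_id = self.n_classes
        leaf.feature = feature
        leaf.absent = Node(class_id=class_id)
        leaf.present = Node(class_id=new_id)
        leaf.class_id = None
        self.n_classes += 1
        return new_id


clf = FeatureClassifier()
clf.refine(0, "door_corner")  # hypothetical descriptor label
print(clf.classify(frozenset({"door_corner"})), clf.classify(frozenset()))
```

A Reinforcement Learning algorithm would then treat the class index returned by `classify` as its discrete state, so each call to `refine` adds exactly one state to the learner's table.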
- Interactive learning of mappings from visual percepts to actions