A Social Reinforcement Learning Agent

ABSTRACT
We report on our reinforcement learning work on Cobot, a software agent that resides in the well-known online chat community LambdaMOO. Our initial work on Cobot~\cite{cobotaaai} provided him with the ability to collect {\em social statistics\/} and report them to users in a reactive manner. Here we describe our application of reinforcement learning to allow Cobot to proactively take actions in this complex social environment, and to adapt his behavior from multiple sources of human reward. After 5 months of training, Cobot received 3171 reward and punishment events from 254 different LambdaMOO users, and learned nontrivial preferences for a number of users. Cobot modifies his behavior based on his current state in an attempt to maximize reward. Here we describe LambdaMOO and the state and action spaces of Cobot, and report the statistical results of the learning experiment.
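The abstract describes an agent that chooses among social actions and adapts from per-user reward and punishment events. The sketch below is a minimal, hypothetical illustration of that idea, not Cobot's actual implementation: it keeps a per-user preference value for each action, updates it from +1/-1 feedback, and acts epsilon-greedily with respect to the summed preferences of the users currently present. All names, the update rule, and the parameters are illustrative assumptions.

```python
import random


class SocialRLAgent:
    """Illustrative sketch of an agent adapting to multiple sources of
    human reward. Not the paper's algorithm; a simple preference-learning
    stand-in to make the setup concrete."""

    def __init__(self, actions, epsilon=0.1, alpha=0.5):
        self.actions = list(actions)
        self.epsilon = epsilon  # exploration rate
        self.alpha = alpha      # learning rate
        # value[user][action]: learned preference of each user for each action
        self.value = {}

    def _user_values(self, user):
        return self.value.setdefault(user, {a: 0.0 for a in self.actions})

    def choose(self, present_users):
        """Pick the action maximizing the summed preferences of the users
        present, with epsilon-greedy exploration."""
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        totals = {a: sum(self._user_values(u)[a] for u in present_users)
                  for a in self.actions}
        return max(totals, key=totals.get)

    def reward(self, user, action, r):
        """Incorporate a reward (+1) or punishment (-1) event from a user
        via an exponential moving average toward the observed reward."""
        v = self._user_values(user)
        v[action] += self.alpha * (r - v[action])
```

With `epsilon=0.0` the agent is deterministic: after a user rewards one action and punishes another, the agent selects the rewarded action whenever that user is present.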
REFERENCES

- Eisenberg, A. (2000). Find Me a File, Cache Me a Catch. New York Times, February 10, 2000. http://www.nytimes.com/library/tech/00/02/circuits/articles/10matc.html.
- Foner, L. (1997). Entertaining Agents: A Sociological Case Study. In Proceedings of the First International Conference on Autonomous Agents.
- Isbell, C. L., Kearns, M., Kormann, D., Singh, S., and Stone, P. (2000). Cobot in LambdaMOO: A Social Statistics Agent. To appear in Proceedings of AAAI-2000.
- Mauldin, M. (1994). Chatterbots, TinyMUDs, and the Turing Test: Entering the Loebner Prize Competition. In Proceedings of the Twelfth National Conference on Artificial Intelligence.
- Shelton, C. R. (2000). Balancing Multiple Sources of Reward in Reinforcement Learning. Submitted for publication in Neural Information Processing Systems-2000.
- Singh, S., Kearns, M., Litman, D., and Walker, M. (2000). Empirical Evaluation of a Reinforcement Learning Dialogue System. To appear in Proceedings of AAAI-2000.
- Stone, P. and Veloso, M. (1999). Team-partitioned, opaque-transition reinforcement learning. In Proceedings of the Third Annual Conference on Autonomous Agents, pages 206-212. ACM Press.
- Sutton, R. S. and Barto, A. G. (1998). Reinforcement Learning: An Introduction. MIT Press, Cambridge, MA.
- Sutton, R. S., McAllester, D., Singh, S., and Mansour, Y. (1999). Policy gradient methods for reinforcement learning with function approximation. In Neural Information Processing Systems-1999.