ABSTRACT
The "Collective Intelligence" (COIN) framework concerns the design of collectives of agents so that, as those agents strive to maximize their individual utility functions, their interaction causes a provided "world" utility function concerning the entire collective to be maximized as well. Here we show how to extend that framework to scenarios having Markovian dynamics when no re-evolution of the system from counterfactual initial conditions (an often expensive calculation) is permitted. Our approach transforms the (time-extended) argument of each agent's utility function before evaluating that function. This transformation has benefits in scenarios not involving Markovian dynamics, in particular scenarios where not all of the arguments of an agent's utility function are observable. We investigate this transformation in simulations involving both linear and quadratic (nonlinear) dynamics. In addition, we find that a certain subset of these transformations, which result in utilities that have low "opacity" (analogous to having high signal to noise) but are not "factored" (analogous to not being incentive compatible), reliably improve performance over that arising with factored utilities. We also present a Taylor series method for the fully general nonlinear case.