ACM Home Page
Please provide us with feedback. Feedback
Learning algorithms for online principal-agent problems (and selling goods online)
Full text PdfPdf (172 KB)
Source ACM International Conference Proceeding Series; Vol. 148 archive
Proceedings of the 23rd international conference on Machine learning table of contents
Pittsburgh, Pennsylvania
Pages: 209 - 216  
Year of Publication: 2006
ISBN:1-59593-383-2
Authors
Vincent Conitzer  Carnegie Mellon University, Pittsburgh, PA
Nikesh Garera  Johns Hopkins University, Baltimore, MD
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 1,   Downloads (12 Months): 23,   Citation Count: 1
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Review this Article  
Save this Article to a Binder    Display Formats: BibTex  EndNote ACM Ref   
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1143844.1143871
What is a DOI?

ABSTRACT

In a principal-agent problem, a principal seeks to motivate an agent to take a certain action beneficial to the principal, while spending as little as possible on the reward. This is complicated by the fact that the principal does not know the agent's utility function (or type). We study the online setting where at each round, the principal encounters a new agent, and the principal sets the rewards anew. At the end of each round, the principal only finds out the action that the agent took, but not his type. The principal must learn how to set the rewards optimally. We show that this setting generalizes the setting of selling a digital good online.We study and experimentally compare three main approaches to this problem. First, we show how to apply a standard bandit algorithm to this setting. Second, for the case where the distribution of agent types is fixed (but unknown to the principal), we introduce a new gradient ascent algorithm. Third, for the case where the distribution of agents' types is fixed, and the principal has a prior belief (distribution) over a limited class of type distributions, we study a Bayesian approach.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
 
2
Babaioff, M., Lavi, R., & Pavlov, E. (2005). Mechanism design for single-value domains. AAAI (pp. 241--247).
 
3
Bahar, G., & Tennenholtz, M. (2005). Sequential-simultaneous information elicitation in multi-agent systems. IJCAI (pp. 923--928).
 
4
5
 
6
 
7
Blumberg, A., & Shelat, A. (2004). Searching for stable mechanisms: Automated design for imperfect players. AAAI (pp. 8--13).
8
9
 
10
de Farias, D. P., & Megiddo, N. (2003). How to combine expert (or novice) advice when actions impact the environment? NIPS.
 
11
 
12
Mas-Colell, A., Whinston, M., & Green, J. R. (1995). Microeconomic theory. Oxford University Press.
 
13
Parkes, D., & Schoenebeck, G. (2004). GROWRANGE: Anytime VCG-based mechanisms. AAAI (pp. 34--41).
14
 
15
 
16
Zinkevich, M. (2003). Online convex programming and generalized infinitesimal gradient ascent. ICML (pp. 928--936).


Collaborative Colleagues:
Vincent Conitzer: colleagues
Nikesh Garera: colleagues