DOI: 10.1145/1143844.1143871
Article

Learning algorithms for online principal-agent problems (and selling goods online)

Published: 25 June 2006

ABSTRACT

In a principal-agent problem, a principal seeks to motivate an agent to take a certain action beneficial to the principal, while spending as little as possible on the reward. This is complicated by the fact that the principal does not know the agent's utility function (or type). We study the online setting where at each round, the principal encounters a new agent, and the principal sets the rewards anew. At the end of each round, the principal only finds out the action that the agent took, but not his type. The principal must learn how to set the rewards optimally. We show that this setting generalizes the setting of selling a digital good online. We study and experimentally compare three main approaches to this problem. First, we show how to apply a standard bandit algorithm to this setting. Second, for the case where the distribution of agent types is fixed (but unknown to the principal), we introduce a new gradient ascent algorithm. Third, for the case where the distribution of agents' types is fixed, and the principal has a prior belief (distribution) over a limited class of type distributions, we study a Bayesian approach.
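
As an illustration of the first approach mentioned above (and not the paper's own implementation), the sketch below applies Exp3, the standard adversarial multi-armed bandit algorithm of Auer et al., to a discretized grid of candidate rewards: each round the principal posts one reward from the grid, observes only whether the agent took the desired action, and performs the usual importance-weighted update. The class name, the reward grid, the payoff normalization, and the simulation at the end are all assumptions made for this example.

```python
import math
import random


class Exp3RewardSetter:
    """Exp3 over a discretized grid of candidate rewards (illustrative sketch).

    Each arm is one reward level the principal can offer. After each round the
    principal observes only whether the agent took the desired action, which is
    exactly the bandit feedback assumed here.
    """

    def __init__(self, reward_grid, action_value, gamma=0.1):
        self.grid = list(reward_grid)          # candidate rewards to offer
        self.value = action_value              # principal's value if the agent acts
        self.gamma = gamma                     # exploration rate
        self.weights = [1.0] * len(self.grid)

    def _probabilities(self):
        total = sum(self.weights)
        k = len(self.grid)
        return [(1 - self.gamma) * w / total + self.gamma / k for w in self.weights]

    def choose_reward(self):
        """Sample an arm (reward level) from the current Exp3 distribution."""
        probs = self._probabilities()
        arm = random.choices(range(len(self.grid)), weights=probs)[0]
        return arm, self.grid[arm]

    def update(self, arm, agent_acted):
        """Importance-weighted Exp3 update from the observed action."""
        probs = self._probabilities()
        # Principal's payoff this round, rescaled to [0, 1].
        payoff = (self.value - self.grid[arm]) / self.value if agent_acted else 0.0
        estimate = payoff / probs[arm]
        self.weights[arm] *= math.exp(self.gamma * estimate / len(self.grid))
        # Rescale weights to avoid overflow; this leaves the distribution unchanged.
        top = max(self.weights)
        self.weights = [w / top for w in self.weights]


# Hypothetical simulation: each agent acts iff the offered reward covers its
# privately known cost (the analogue of a buyer purchasing a digital good
# whenever the posted price is below its value).
setter = Exp3RewardSetter(reward_grid=[0.1 * j for j in range(1, 10)], action_value=1.0)
for _ in range(10000):
    arm, offered = setter.choose_reward()
    agent_cost = random.uniform(0.0, 1.0)     # agent's type, never revealed
    setter.update(arm, agent_acted=(agent_cost <= offered))
```

Under the assumed uniform cost distribution, the expected payoff of posting reward r is r(1 - r), so the sketch should concentrate mass near r = 0.5; the paper's gradient-ascent and Bayesian approaches are not reproduced here.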


Published in

ICML '06: Proceedings of the 23rd International Conference on Machine Learning
June 2006, 1154 pages
ISBN: 1595933832
DOI: 10.1145/1143844
Copyright © 2006 ACM

Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

Published: 25 June 2006

Acceptance Rates

ICML '06 paper acceptance rate: 140 of 548 submissions, 26%
Overall acceptance rate: 140 of 548 submissions, 26%
