ABSTRACT
In a principal-agent problem, a principal seeks to motivate an agent to take a certain action beneficial to the principal, while spending as little as possible on the reward. This is complicated by the fact that the principal does not know the agent's utility function (or type). We study the online setting where at each round, the principal encounters a new agent, and the principal sets the rewards anew. At the end of each round, the principal only finds out the action that the agent took, but not his type. The principal must learn how to set the rewards optimally. We show that this setting generalizes the setting of selling a digital good online.We study and experimentally compare three main approaches to this problem. First, we show how to apply a standard bandit algorithm to this setting. Second, for the case where the distribution of agent types is fixed (but unknown to the principal), we introduce a new gradient ascent algorithm. Third, for the case where the distribution of agents' types is fixed, and the principal has a prior belief (distribution) over a limited class of type distributions, we study a Bayesian approach.
- Auer, P., Cesa-Bianchi, N., Freund, Y., & Schapire, R. E. (1995). Gambling in a rigged casino: The adversarial multi-arm bandit problem. FOCS (pp. 322--331). Google ScholarDigital Library
- Babaioff, M., Lavi, R., & Pavlov, E. (2005). Mechanism design for single-value domains. AAAI (pp. 241--247). Google ScholarDigital Library
- Bahar, G., & Tennenholtz, M. (2005). Sequential-simultaneous information elicitation in multi-agent systems. IJCAI (pp. 923--928). Google ScholarDigital Library
- Bar-Yossef, Z., Hildrum, K., & Wu, F. (2002). Incentive-compatible online auctions for digital goods. SODA (pp. 964--970). Google ScholarDigital Library
- Bartal, Y., Gonen, R., & Mura, P. L. (2004). Negotiation-range mechanisms: Exploring the limits of truthful efficient markets. ACM-EC (pp. 1--8). Google ScholarDigital Library
- Blum, A., Kumar, V., Rudra, A., & Wu, F. (2003). Online learning in online auctions. SODA (pp. 202--204). Google ScholarDigital Library
- Blumberg, A., & Shelat, A. (2004). Searching for stable mechanisms: Automated design for imperfect players. AAAI (pp. 8--13). Google ScholarDigital Library
- Cesa-Bianchi, N., Freund, Y., Haussler, D., Helmbold, D. P., Schapire, R. E., & Warmuth, M. K. (1997). How to use expert advice. Journal of the ACM, 44, 427--485. Google ScholarDigital Library
- Conitzer, V., & Sandholm, T. (2004). Self-interested automated mechanism design and implications for optimal combinatorial auctions. ACM-EC (pp. 132--141). Google ScholarDigital Library
- de Farias, D. P., & Megiddo, N. (2003). How to combine expert (or novice) advice when actions impact the environment? NIPS.Google Scholar
- Kleinberg, R., & Leighton, T. (2003). The value of knowing a demand curve: Bounds on regret for on-line posted-price auctions. FOCS (pp. 594--605). Google ScholarDigital Library
- Mas-Colell, A., Whinston, M., & Green, J. R. (1995). Microeconomic theory. Oxford University Press.Google Scholar
- Parkes, D., & Schoenebeck, G. (2004). GROWRANGE: Anytime VCG-based mechanisms. AAAI (pp. 34--41). Google ScholarDigital Library
- Porter, R. (2004). Mechanism design for online real-time scheduling. ACM-EC (pp. 61--70). Google ScholarDigital Library
- Smorodinsky, R., & Tennenholtz, M. (2004). Sequential information elicitation in multi-agent systems. UAI (pp. 528--535). Google ScholarDigital Library
- Zinkevich, M. (2003). Online convex programming and generalized infinitesimal gradient ascent. ICML (pp. 928--936).Google Scholar
Index Terms
- Learning algorithms for online principal-agent problems (and selling goods online)
Recommendations
Online double auction mechanism for perishable goods
We investigate mechanism design for a spot market of perishable goods.We explain that failures of trading in the perishable goods damage social utility.We develop an online double auction that prioritizes time-critical bids.Multiagent simulations show ...
Bundling Decisions for Selling Multiple Items in Online Auctions
Fueled by the widespread use of the internet, more and more ordinary people have now become merchandise sellers who sell their own possessions, such as antique collections and limited souvenirs, to buyers who are interested in such goods via online ...
Bounding the optimal revenue of selling multiple goods
Using duality theory techniques we derive simple, closed-form formulas for bounding the optimal revenue of a monopolist selling many heterogeneous goods, in the case where the buyer's valuations for the items come i.i.d. from a uniform distribution and ...
Comments