skip to main content
research-article

Watch and learn: optimizing from revealed preferences feedback

Published: 12 November 2015 Publication History

Abstract

A Stackelberg game is played between a leader and a follower. The leader first chooses an action, and then the follower plays his best response, and the goal of the leader is to pick the action that will maximize his payoff given the follower's best response. Stackelberg games capture, for example, the following interaction between a retailer and a buyer. The retailer chooses the prices of the goods he produces, and then the buyer chooses to buy a utility-maximizing bundle of goods. The goal of the retailer here is to set prices to maximize his profit---his revenue minus the production cost of the purchased bundle. It is quite natural that the retailer in this example would not know the buyer's utility function. However, he does have access to revealed preference feedback---he can set prices, and then observe the purchased bundle and his own profit. We give algorithms for efficiently solving, in terms of both computational and query complexity, a broad class of Stackelberg games in which the follower's utility function is unknown, using only "revealed preference" access to it. This class includes the profit maximization problem, as well as the optimal tolling problem in nonatomic congestion games, when the latency functions are unknown. Surprisingly, we are able to solve these problems even though the corresponding maximization problems are not concave in the leader's actions.

References

[1]
Amin, K., Cummings, R., Dworkin, L., Kearns, M., and Roth, A. 2015. Online learning and profit maximization from revealed preferences. In Twenty-Ninth AAAI Conference on Artificial Intelligence (AAAI-15).
[2]
Balcan, M.-F., Daniely, A., Mehta, R., Urner, R., and Vazirani, V. V. 2014. Learning economic parameters from revealed preferences. In Web and Internet Economics. Springer, 338--353.
[3]
Beigman, E. and Vohra, R. 2006. Learning from revealed preference. In Proceedings of the 7th ACM Conference on Electronic Commerce. ACM, 36--42.
[4]
Belloni, A., Liang, T., Narayanan, H., and Rakhlin, A. 2015. Escaping the local minima via simulated annealing: Optimization of approximately convex functions. CoRR abs/1501.07242.
[5]
Bhaskar, U., Ligett, K., Schulman, L. J., and Swamy, C. 2014. Achieving target equilibria in network routing games without knowing the latency functions. In 55th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2014, Philadelphia, PA, USA, October 18--21, 2014. IEEE Computer Society, 31--40.
[6]
Mas-Colell, A., Whinston, M. D., and Green, J. R. 1995. Microeconomic Theory. Oxford University Press.
[7]
Roth, A., Ullman, J., and Wu, Z. S. 2015. Watch and learn: Optimizing from revealed preferences feedback. arXiv preprint arXiv:1504.01033.
[8]
Rubinstein, A. 2012. Lecture notes in microeconomic theory: the economic agent. Princeton University Press.
[9]
Samuelson, P. A. 1938. A note on the pure theory of consumers' behavior. Economica 5, 17, 61--71.
[10]
Varian, H. R. 2006. Revealed preference. In Samuelsonian economics and the twenty-first century, M. Szenberg, L. Ramrattan, and A. A. Gottesman, Eds. Oxford University Press, 99--115.
[11]
Zadimoghaddam, M. and Roth, A. 2012. Efficiently learning from revealed preference. In Internet and Network Economics - 8th International Workshop, WINE 2012, Liverpool, UK, December 10--12, 2012. Proceedings, P. W. Goldberg, Ed. Lecture Notes in Computer Science, vol. 7695. Springer, 114--127.

Cited By

View all
  • (2023)Metacognitive Radar: Masking Cognition From an Inverse Reinforcement LearnerIEEE Transactions on Aerospace and Electronic Systems10.1109/TAES.2023.331440659:6(8826-8844)Online publication date: Dec-2023

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM SIGecom Exchanges
ACM SIGecom Exchanges  Volume 14, Issue 1
June 2015
107 pages
EISSN:1551-9031
DOI:10.1145/2845926
Issue’s Table of Contents

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 12 November 2015
Published in SIGECOM Volume 14, Issue 1

Check for updates

Qualifiers

  • Research-article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)0
  • Downloads (Last 6 weeks)0
Reflects downloads up to 20 Feb 2025

Other Metrics

Citations

Cited By

View all
  • (2023)Metacognitive Radar: Masking Cognition From an Inverse Reinforcement LearnerIEEE Transactions on Aerospace and Electronic Systems10.1109/TAES.2023.331440659:6(8826-8844)Online publication date: Dec-2023

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media