research-article

Exploration in Interactive Personalized Music Recommendation: A Reinforcement Learning Approach

Published: 04 September 2014
Abstract

Current music recommender systems typically act in a greedy manner by recommending songs with the highest user ratings. Greedy recommendation, however, is suboptimal over the long term: it does not actively gather information on user preferences and fails to recommend novel songs that are potentially interesting. A successful recommender system must balance the needs to explore user preferences and to exploit this information for recommendation. This article presents a new approach to music recommendation by formulating this exploration-exploitation trade-off as a reinforcement learning task. To learn user preferences, it uses a Bayesian model that accounts for both audio content and the novelty of recommendations. A piecewise-linear approximation to the model and a variational inference algorithm help to speed up Bayesian inference. One additional benefit of our approach is a single unified model for both music recommendation and playlist generation. We demonstrate the strong potential of the proposed approach with simulation results and a user study.
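
The trade-off described above can be illustrated with a minimal sketch that contrasts a greedy picker with one that adds an uncertainty bonus. This is not the article's Bayesian model, which accounts for audio content and novelty. It assumes a simple Beta-Bernoulli like/skip model per song, and the names `posterior`, `greedy_pick`, `exploring_pick`, and the bonus weight `c` are hypothetical.

```python
import math


def posterior(likes, plays):
    """Mean and standard deviation of a Beta(1 + likes, 1 + skips) posterior.

    Assumption: each play is either liked or skipped, with a uniform prior.
    """
    a = 1 + likes
    b = 1 + (plays - likes)
    mean = a / (a + b)
    var = (a * b) / ((a + b) ** 2 * (a + b + 1))
    return mean, math.sqrt(var)


def greedy_pick(history):
    """Pure exploitation: choose the song with the highest estimated rating."""
    return max(history, key=lambda song: posterior(*history[song])[0])


def exploring_pick(history, c=2.0):
    """Estimated rating plus an uncertainty bonus (hypothetical weight c)."""
    def score(song):
        mean, std = posterior(*history[song])
        return mean + c * std
    return max(history, key=score)


# song -> (likes, plays); illustrative data only
history = {"familiar_song": (3, 4), "unheard_song": (0, 0)}
print(greedy_pick(history))
print(exploring_pick(history))
```

With this data the greedy rule keeps recommending the song with the best estimate so far, while the bonus term favours the unheard song, whose rating is still uncertain.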

      • Published in

        ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 11, Issue 1
        August 2014
        151 pages
        ISSN: 1551-6857
        EISSN: 1551-6865
        DOI: 10.1145/2665935

        Copyright © 2014 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 4 September 2014
        • Revised: 1 May 2014
        • Accepted: 1 May 2014
        • Received: 1 November 2013

        Qualifiers

        • research-article
        • Research
        • Refereed
