ABSTRACT
Classical collaborative filtering and content-based filtering methods learn a static recommendation model from training data. These approaches are far from ideal in highly dynamic recommendation domains, such as news recommendation and computational advertising, where the sets of items and users are very fluid. In this work, we investigate an adaptive clustering technique for content recommendation based on exploration-exploitation strategies in contextual multi-armed bandit settings. Our algorithm accounts for the collaborative effects that arise from users' interactions with items by dynamically grouping users based on the items under consideration and, at the same time, grouping items based on the similarity of the clusterings they induce over the users. The resulting algorithm thus exploits preference patterns in the data in a way akin to collaborative filtering methods. We provide an empirical analysis on medium-sized real-world datasets, showing scalability and improved prediction performance (as measured by click-through rate) over state-of-the-art methods for clustering bandits. We also provide a regret analysis within a standard linear stochastic noise setting.
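The core mechanism the abstract describes can be illustrated with a minimal sketch: each user keeps a ridge-regression estimate of a linear reward model, users whose estimates are close are pooled into a cluster, and arm selection uses an upper-confidence bound computed from the pooled cluster statistics. This is a hypothetical simplification for illustration only, not the paper's exact algorithm; the class name, the distance-based merge rule, and the `merge_threshold` parameter are assumptions introduced here.

```python
import numpy as np

class ClusteredLinUCB:
    """Illustrative clustering contextual bandit (simplified sketch,
    not the paper's exact algorithm). Each user holds per-user ridge
    statistics; users with nearby weight estimates share a cluster
    whose pooled statistics drive the UCB arm choice."""

    def __init__(self, n_users, dim, alpha=1.0, merge_threshold=0.5):
        self.dim = dim
        self.alpha = alpha          # exploration strength
        self.thr = merge_threshold  # assumed distance cutoff for merging users
        # per-user ridge-regression sufficient statistics (A = I + sum x x^T)
        self.A = [np.eye(dim) for _ in range(n_users)]
        self.b = [np.zeros(dim) for _ in range(n_users)]

    def _theta(self, u):
        # current weight estimate for user u
        return np.linalg.solve(self.A[u], self.b[u])

    def _cluster(self, u):
        # users whose estimated weight vectors lie close to user u's
        tu = self._theta(u)
        return [v for v in range(len(self.A))
                if np.linalg.norm(self._theta(v) - tu) < self.thr]

    def select(self, u, contexts):
        # pool statistics over user u's current cluster, then pick the
        # arm maximizing estimated reward plus a confidence bonus
        members = self._cluster(u)
        A = sum(self.A[v] for v in members) - (len(members) - 1) * np.eye(self.dim)
        b = sum(self.b[v] for v in members)
        A_inv = np.linalg.inv(A)
        theta = A_inv @ b
        scores = [x @ theta + self.alpha * np.sqrt(x @ A_inv @ x)
                  for x in contexts]
        return int(np.argmax(scores))

    def update(self, u, x, reward):
        # standard rank-one ridge update for the served user only
        self.A[u] += np.outer(x, x)
        self.b[u] += reward * x
```

A short usage example: two users, two arms, where arm 0 always pays off; after a few rounds the pooled model concentrates on arm 0.

```python
bandit = ClusteredLinUCB(n_users=2, dim=2, alpha=0.1)
contexts = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
for t in range(50):
    u = t % 2
    a = bandit.select(u, contexts)
    bandit.update(u, contexts[a], 1.0 if a == 0 else 0.0)
```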
Index Terms
- Collaborative Filtering Bandits
Recommendations
Trust-based collaborative filtering: tackling the cold start problem using regular equivalence
RecSys '18: Proceedings of the 12th ACM Conference on Recommender Systems
User-based Collaborative Filtering (CF) is one of the most popular approaches to create recommender systems. This approach is based on finding the most relevant k users from whose rating history we can extract items to recommend. CF, however, suffers ...
A hybrid knowledge-based approach to collaborative filtering for improved recommendations
Collaborative filtering (CF) is one of the most successful and effective recommendation techniques for personalized information access. This method makes recommendations based on past transactions and feedback from users sharing similar interests. ...
Collaborative filtering based on an iterative prediction method to alleviate the sparsity problem
iiWAS '09: Proceedings of the 11th International Conference on Information Integration and Web-based Applications & Services
Collaborative filtering (CF) is one of the most popular recommender system technologies. It tries to identify users that have relevant interests and preferences by calculating similarities among user profiles. The idea behind this method is that, it may ...