DOI: 10.1145/3077136.3080797
research-article

Attentive Collaborative Filtering: Multimedia Recommendation with Item- and Component-Level Attention

Published: 07 August 2017

ABSTRACT

Multimedia content is dominating today's Web information. The nature of multimedia user-item interactions is 1/0 binary implicit feedback (e.g., photo likes, video views, song downloads), which can be collected at a larger scale and at a much lower cost than explicit feedback (e.g., product ratings). However, the majority of existing collaborative filtering (CF) systems are not well designed for multimedia recommendation, since they ignore the implicitness in users' interactions with multimedia content. We argue that, in multimedia recommendation, there exists item- and component-level implicitness which blurs users' underlying preferences. Item-level implicitness means that users' preferences on items (e.g., photos, videos, songs) are unknown, while component-level implicitness means that, inside each item, users' preferences on different components (e.g., regions in an image, frames of a video) are unknown. For example, a 'view' on a video does not provide any specific information about how much the user likes the video (i.e., item-level) or which parts of the video the user is interested in (i.e., component-level). In this paper, we introduce a novel attention mechanism in CF to address the challenging item- and component-level implicit feedback in multimedia recommendation, dubbed Attentive Collaborative Filtering (ACF). Specifically, our attention model is a neural network that consists of two attention modules: the component-level attention module, starting from any content feature extraction network (e.g., a CNN for images/videos), which learns to select informative components of multimedia items, and the item-level attention module, which learns to score the item preferences. ACF can be seamlessly incorporated into classic CF models with implicit feedback, such as BPR and SVD++, and efficiently trained using SGD. Through extensive experiments on two real-world multimedia Web services, Vine and Pinterest, we show that ACF significantly outperforms state-of-the-art CF methods.
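The two-level attention described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the scoring functions, so a plain dot product stands in for the learned attention network, component features are assumed to be already projected to the latent dimension, and every name below (`component_attention`, `item_attention`, `user_representation`, `bpr_loss`) is hypothetical. The user representation follows the general SVD++ pattern of adding an aggregate of interacted items to the user vector, and the loss is the standard BPR pairwise objective.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    # Subtract the maximum for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_sum(weights, vectors):
    dim = len(vectors[0])
    return [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]


def component_attention(user_vec, components):
    # Assumption: a dot product replaces the learned component-level network.
    weights = softmax([dot(user_vec, c) for c in components])
    return weighted_sum(weights, components), weights


def item_attention(user_vec, item_vecs, content_vecs):
    # Assumption: each item is scored from its latent vector and its
    # attended content vector; the real scorer is a neural network.
    scores = [dot(user_vec, p) + dot(user_vec, x)
              for p, x in zip(item_vecs, content_vecs)]
    return softmax(scores)


def user_representation(user_vec, history):
    # history: list of (item_vec, components) pairs the user interacted with.
    item_vecs = [p for p, _ in history]
    content_vecs = [component_attention(user_vec, comps)[0] for _, comps in history]
    alphas = item_attention(user_vec, item_vecs, content_vecs)
    neighborhood = weighted_sum(alphas, item_vecs)
    # SVD++-style: user vector plus attention-weighted sum of item vectors.
    return [u + n for u, n in zip(user_vec, neighborhood)], alphas


def bpr_loss(user_rep, pos_item, neg_item):
    # -log(sigmoid(diff)) written as log(1 + exp(-diff)).
    diff = dot(user_rep, pos_item) - dot(user_rep, neg_item)
    return math.log1p(math.exp(-diff))


# Toy usage with made-up two-dimensional vectors.
user = [0.5, -0.2]
history = [
    ([0.1, 0.3], [[1.0, 0.0], [0.0, 1.0]]),  # item with two components
    ([0.4, -0.1], [[0.5, 0.5]]),             # item with one component
]
rep, alphas = user_representation(user, history)
loss = bpr_loss(rep, [0.3, 0.1], [-0.2, 0.4])
```

In a trained model the attention weights and all vectors would be learned jointly by SGD on the pairwise loss; here they are fixed toy values used only to show how the component-level weights feed the item-level weights.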


    • Published in

      SIGIR '17: Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval
      August 2017
      1476 pages
      ISBN: 9781450350228
      DOI: 10.1145/3077136

      Copyright © 2017 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Acceptance Rates

      SIGIR '17 Paper Acceptance Rate: 78 of 362 submissions, 22%
      Overall Acceptance Rate: 792 of 3,983 submissions, 20%
