DOI: 10.1145/2835776.2835777
research-article

Multi-view Machines

Published: 08 February 2016

ABSTRACT

With the rapidly growing amount of data available on the web, it is increasingly common to obtain data from different perspectives for multi-view learning. Successful examples of web applications include recommendation and targeted advertising. Specifically, to predict whether a user will click an ad in a query context, features can be extracted from the user profile, the ad information and the query description, and each of them captures only part of the task signal from a particular aspect/view. Different views provide complementary information for learning a practical model for these applications. Therefore, effective integration of the multi-view information is critical to learning performance.

In this paper, we propose a general predictor, named multi-view machines (MVMs), that can effectively explore the full-order interactions between features from multiple views. A joint factorization is applied to the interaction parameters, which makes parameter estimation more accurate under sparsity and gives the model the capacity to avoid overfitting. Moreover, MVMs can work in conjunction with different loss functions for a variety of machine learning tasks. The advantages of MVMs are illustrated through comparison with other methods for multi-view prediction, including support vector machines (SVMs), support tensor machines (STMs) and factorization machines (FMs).
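The abstract does not spell out the model equation, so the following is only a minimal sketch of one way a jointly factorized full-order predictor could be computed. It assumes that each view's feature vector is extended with a constant 1 (so that lower-order interactions are covered too) and that the interaction weights share a rank-k factorization with one factor matrix per view. The function name `mvm_predict` and the argument layout are hypothetical, chosen for illustration.

```python
import numpy as np

def mvm_predict(views, factors):
    """Sketch of a factorized full-order multi-view prediction.

    views:   list of 1-D arrays, one feature vector per view.
    factors: list of 2-D arrays; factors[v] has shape (len(views[v]) + 1, k).
             The extra row is the weight of the appended constant feature.
    (Both the layout and the bias-augmentation are assumptions.)
    """
    k = factors[0].shape[1]
    prod = np.ones(k)
    for x, A in zip(views, factors):
        z = np.append(x, 1.0)   # append constant 1 to the view
        prod *= z @ A           # project the view onto the k factors
    # Summing over factors yields the weighted sum of all cross-view
    # interaction terms without enumerating them explicitly.
    return float(prod.sum())
```

With two one-dimensional views and k = 1, the product of the two projections expands into the pairwise term, both single-view terms and a constant, which is the sense in which such a model covers interactions of every order.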

A stochastic gradient descent method and a distributed implementation on Spark are presented to learn the MVM model. Through empirical studies on two real-world web application datasets, we demonstrate the effectiveness of MVMs on modeling feature interactions in multi-view data. A 3.51% accuracy improvement is shown on MVMs over FMs for the problem of movie rating prediction, and 0.57% for ad click prediction.
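As a rough illustration of stochastic gradient descent on such a model, the sketch below performs one update on a single example. It assumes a squared loss, a plain fixed learning rate and the same hypothetical bias-augmented, rank-k parameter layout (one factor matrix per view); none of these choices are stated in the abstract, and the distributed Spark implementation is not covered.

```python
import numpy as np

def sgd_step(views, y, factors, lr=0.01, reg=0.0):
    """One SGD update for loss 0.5 * (y_hat - y)**2 (assumed loss).

    views:   list of 1-D float arrays, one per view.
    factors: list of float arrays, factors[v] of shape (len(views[v]) + 1, k);
             updated in place.
    Returns the prediction made before the update.
    """
    zs = [np.append(x, 1.0) for x in views]                 # bias-augmented views
    proj = np.stack([z @ A for z, A in zip(zs, factors)])   # shape (m, k)
    y_hat = proj.prod(axis=0).sum()
    err = y_hat - y
    for v, (z, A) in enumerate(zip(zs, factors)):
        # Product of the other views' projections, per factor.
        others = np.delete(proj, v, axis=0).prod(axis=0)    # shape (k,)
        grad = err * np.outer(z, others) + reg * A
        A -= lr * grad                                      # in-place update
    return float(y_hat)
```

All gradients are computed from the projections taken before any factor matrix is changed, so the views are updated simultaneously rather than one after another.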

Published in

WSDM '16: Proceedings of the Ninth ACM International Conference on Web Search and Data Mining
February 2016
746 pages
ISBN: 9781450337168
DOI: 10.1145/2835776

Copyright © 2016 ACM

Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

WSDM '16 Paper Acceptance Rate: 67 of 368 submissions, 18%
Overall Acceptance Rate: 498 of 2,863 submissions, 17%
