skip to main content
10.1145/2507157.2507164acmconferencesArticle/Chapter ViewAbstractPublication PagesrecsysConference Proceedingsconference-collections
research-article

A fast parallel SGD for matrix factorization in shared memory systems

Published:12 October 2013Publication History

ABSTRACT

Matrix factorization is known to be an effective method for recommender systems that are given only the ratings from users to items. Currently, stochastic gradient descent (SGD) is one of the most popular algorithms for matrix factorization. However, as a sequential approach, SGD is difficult to be parallelized for handling web-scale problems. In this paper, we develop a fast parallel SGD method, FPSGD, for shared memory systems. By dramatically reducing the cache-miss rate and carefully addressing the load balance of threads, FPSGD is more efficient than state-of-the-art parallel algorithms for matrix factorization.

References

  1. R. M. Bell and Y. Koren. Lessons from the Netflix prize challenge. ACM SIGKDD Explorations Newsletter, 9(2):75--79, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. K.-W. Chang, C.-J. Hsieh, and C.-J. Lin. Coordinate descent method for large-scale L2-loss linear SVM. Journal of Machine Learning Research, 9:1369--1398, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. G. Dror, N. Koenigstein, Y. Koren, and M. Weimer. The Yahoo! music dataset and KDD-Cup 11. In JMLR Workshop and Conference Proceedings: Proceedings of KDD Cup 2011, volume 18, pages 3--18, 2012.Google ScholarGoogle Scholar
  4. R. Gemulla, E. Nijkamp, P. J. Haas, and Y. Sismanis. Large-scale matrix factorization with distributed stochastic gradient descent. In Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining, KDD '11, pages 69--77, New York, NY, USA, 2011. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. K. B. Hall, S. Gilpin, and G. Mann. MapReduce/Bigtable for distributed optimization. In Neural Information Processing Systems Workshop on Leaning on Cores, Clusters, and Clouds, 2010.Google ScholarGoogle Scholar
  6. J. Kiefer and J. Wolfowitz. Stochastic estimation of the maximum of a regression function. The Annals of Mathematical Statistics, 23(3):462--466, 1952.Google ScholarGoogle ScholarCross RefCross Ref
  7. Y. Koren, R. M. Bell, and C. Volinsky. Matrix factorization techniques for recommender systems. Computer, 42(8):30--37, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. G. Mann, R. McDonald, M. Mohri, N. Silberman, and D. Walker. Efficient large-scale distributed training of conditional maximum entropy models. In Y. Bengio, D. Schuurmans, J. Lafferty, C. K. I. Williams, and A. Culotta, editors, Advances in Neural Information Processing Systems 22, pages 1231--1239. 2009.Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. R. McDonald, K. Hall, and G. Mann. Distributed training strategies for the structured perceptron. In Proceedings of the 48th Annual Meeting of the Association of Computational Linguistics (ACL), pages 456--464, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. F. Niu, B. Recht, C. Ré, and S. J. Wright. HOGWILD!: A lock-free approach to parallelizing stochastic gradient descent. In J. Shawe-Taylor, R. Zemel, P. Bartlett, F. Pereira, and K. Weinberger, editors, Advances in Neural Information Processing Systems 24, pages 693--701, 2011.Google ScholarGoogle Scholar
  11. I. Pilászy, D. Zibriczky, and D. Tikk. Fast ALS-based matrix factorization for explicit and implicit feedback datasets. In Proceedings of the Fourth ACM Conference on Recommender Systems, pages 71--78, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. H. Robbins and S. Monro. A stochastic approximation method. The Annals of Mathematical Statistics, 22(3):400--407, 1951.Google ScholarGoogle ScholarCross RefCross Ref
  13. H.-F. Yu, C.-J. Hsieh, S. Si, and I. S. Dhillon. Scalable coordinate descent approaches to parallel matrix factorization for recommender systems. In Proceedings of the IEEE International Conference on Data Mining, pages 765--774, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. Y. Zhou, D. Wilkinson, R. Schreiber, and R. Pan. Large-scale parallel collaborative filtering for the Netflix prize. In Proceedings of the Fourth International Conference on Algorithmic Aspects in Information and Management, pages 337--348, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. M. Zinkevich, M. Weimer, A. Smola, and L. Li. Parallelized stochastic gradient descent. In J. Lafferty, C. K. I. Williams, J. Shawe-Taylor, R. Zemel, and A. Culotta, editors, Advances in Neural Information Processing Systems 23, pages 2595--2603. 2010.Google ScholarGoogle Scholar

Index Terms

  1. A fast parallel SGD for matrix factorization in shared memory systems

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in
    • Published in

      cover image ACM Conferences
      RecSys '13: Proceedings of the 7th ACM conference on Recommender systems
      October 2013
      516 pages
      ISBN:9781450324090
      DOI:10.1145/2507157
      • General Chairs:
      • Qiang Yang,
      • Irwin King,
      • Qing Li,
      • Program Chairs:
      • Pearl Pu,
      • George Karypis

      Copyright © 2013 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 12 October 2013

      Permissions

      Request permissions about this article.

      Request Permissions

      Check for updates

      Qualifiers

      • research-article

      Acceptance Rates

      RecSys '13 Paper Acceptance Rate32of136submissions,24%Overall Acceptance Rate254of1,295submissions,20%

      Upcoming Conference

      RecSys '24
      18th ACM Conference on Recommender Systems
      October 14 - 18, 2024
      Bari , Italy

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader