A Fast Parallel SGD for Matrix Factorization in Shared Memory Systems

ABSTRACT
Matrix factorization is known to be an effective method for recommender systems that observe only users' ratings of items. Currently, stochastic gradient descent (SGD) is one of the most popular algorithms for matrix factorization. However, SGD is inherently sequential and therefore difficult to parallelize for web-scale problems. In this paper, we develop a fast parallel SGD method, FPSGD, for shared memory systems. By dramatically reducing the cache-miss rate and carefully addressing the load balance of threads, FPSGD is more efficient than state-of-the-art parallel algorithms for matrix factorization.
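To make the setting concrete, the sequential SGD baseline that the abstract refers to can be sketched as follows. This is a minimal illustration of the standard SGD update for matrix factorization on explicit ratings, not the paper's FPSGD algorithm; the function name and hyperparameter values are illustrative assumptions.

```python
import numpy as np

def sgd_mf(ratings, n_users, n_items, k=8, lr=0.02, reg=0.01, epochs=300, seed=0):
    """Sequential SGD for matrix factorization on explicit ratings.

    ratings: iterable of (user, item, rating) triples with 0-based indices.
    Returns P (n_users x k) and Q (n_items x k) such that P[u] @ Q[i]
    approximates the rating r_ui.
    """
    rng = np.random.default_rng(seed)
    P = 0.1 * rng.standard_normal((n_users, k))
    Q = 0.1 * rng.standard_normal((n_items, k))
    for _ in range(epochs):
        for u, i, r in ratings:
            e = r - P[u] @ Q[i]                   # prediction error for this rating
            pu = P[u].copy()                      # keep old user factors for Q's update
            P[u] += lr * (e * Q[i] - reg * P[u])  # step along negative gradient of
            Q[i] += lr * (e * pu - reg * Q[i])    # the regularized squared error
    return P, Q
```

Because each update reads and writes single rows of P and Q in an order dictated by the rating sequence, naive parallelization causes conflicting writes and poor cache locality, which is exactly the difficulty FPSGD addresses.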
Index Terms
- A fast parallel SGD for matrix factorization in shared memory systems