DOI: 10.1145/2983323.2983377

Short paper

Distributed Deep Learning for Question Answering

Published: 24 October 2016

ABSTRACT

This paper presents an empirical study of distributed deep learning for two question answering subtasks: answer selection and question classification. We compare the SGD, MSGD, ADADELTA, ADAGRAD, ADAM/ADAMAX, RMSPROP, DOWNPOUR, and EASGD/EAMSGD algorithms. Experimental results show that a distributed framework based on the message passing interface (MPI) accelerates convergence, with speedup growing sublinearly in the number of workers. The results demonstrate the importance of distributed training: with 48 workers, a 24x speedup is achievable on the answer selection task, reducing running time from 138.2 hours to 5.81 hours and significantly increasing productivity.
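As a quick sanity check on the numbers above, the reported speedup and parallel efficiency can be recomputed directly from the running times given in the abstract (a minimal sketch; the worker count and hours are the abstract's figures):

```python
# Recompute speedup and parallel efficiency from the reported running times.
serial_hours = 138.2    # single-worker running time (from the abstract)
parallel_hours = 5.81   # 48-worker running time (from the abstract)
workers = 48

speedup = serial_hours / parallel_hours   # the reported ~24x
efficiency = speedup / workers            # well below 1.0, i.e. sublinear scaling

print(f"speedup = {speedup:.1f}x, efficiency = {efficiency:.2f}")
```

The efficiency of roughly 0.5 is what "sublinear scale" means in practice: doubling the workers less than doubles the throughput, but the absolute wall-clock savings remain large.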
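To make the elastic averaging scheme concrete, the following is a hedged, single-process simulation of the EASGD update rule of Zhang et al. on a toy quadratic objective. The learning rate, elasticity coefficient, worker count, and initial values are illustrative choices for this sketch, not settings from the paper, and real deployments run the workers as separate MPI processes rather than in one loop:

```python
# Toy EASGD sketch: each worker i holds parameters x_i and is pulled
# elastically toward a shared center variable, which in turn drifts
# toward the average of the workers.
def grad(x):
    # Per-worker objective f(x) = 0.5 * (x - 3)^2, so the gradient is x - 3.
    return x - 3.0

eta = 0.1      # worker learning rate (illustrative)
alpha = 0.05   # elasticity coefficient eta * rho (illustrative)
workers = [0.0, 1.0, 2.0, 4.0]  # local parameters x_i (illustrative inits)
center = 0.0                    # center variable

for _ in range(500):
    # Both updates use the values from the start of the round.
    diffs = [x - center for x in workers]
    workers = [x - eta * grad(x) - alpha * d for x, d in zip(workers, diffs)]
    center = center + alpha * sum(diffs)

print(f"center variable after training: {center:.3f}")
```

All workers and the center variable settle near the optimum at 3.0; the elastic term keeps the workers loosely synchronized without requiring a lock-step gradient exchange every round, which is what makes the scheme attractive for distributed training.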

References

  1. L. Bottou. Online learning in neural networks, chapter Online Learning and Stochastic Approximations. Cambridge University Press, New York, NY, USA, 1998.
  2. T. Chilimbi, Y. Suzue, J. Apacible, and K. Kalyanaraman. Project Adam: Building an efficient and scalable deep learning training system. In Proceedings of the 11th USENIX Symposium on Operating Systems Design and Implementation (OSDI 14), pages 571--582, Broomfield, CO, 2014. USENIX Association.
  3. J. Dean, G. Corrado, R. Monga, K. Chen, M. Devin, Q. V. Le, M. Z. Mao, M. A. Ranzato, A. W. Senior, P. A. Tucker, K. Yang, and A. Y. Ng. Large scale distributed deep networks. In Advances in Neural Information Processing Systems 25 (NIPS 2012), pages 1232--1240, 2012.
  4. J. Duchi, E. Hazan, and Y. Singer. Adaptive subgradient methods for online learning and stochastic optimization. J. Mach. Learn. Res., 12:2121--2159, July 2011.
  5. M. Feng, B. Xiang, M. R. Glass, L. Wang, and B. Zhou. Applying deep learning to answer selection: A study and an open task. In Proceedings of the 2015 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU 2015), Scottsdale, Arizona, 2015.
  6. A. Graves. Generating sequences with recurrent neural networks. CoRR, abs/1308.0850, 2013.
  7. S. Gupta, W. Zhang, and J. Milthorpe. Model accuracy and runtime tradeoff in distributed deep learning. ArXiv e-prints, September 2015.
  8. D. P. Kingma and J. Ba. Adam: A method for stochastic optimization. CoRR, abs/1412.6980, 2014.
  9. Y. LeCun, Y. Bengio, and G. Hinton. Deep learning. Nature, 521:436--444, May 2015.
  10. M. Li, D. G. Andersen, J. W. Park, A. J. Smola, A. Ahmed, V. Josifovski, J. Long, E. J. Shekita, and B. Su. Scaling distributed machine learning with the parameter server. In 11th USENIX Symposium on Operating Systems Design and Implementation (OSDI 14), pages 583--598, Broomfield, CO, Oct. 2014. USENIX Association.
  11. I. Sutskever, J. Martens, G. E. Dahl, and G. E. Hinton. On the importance of initialization and momentum in deep learning. In Proceedings of the 30th International Conference on Machine Learning (ICML-13), volume 28, pages 1139--1147. JMLR Workshop and Conference Proceedings, May 2013.
  12. E. P. Xing, Q. Ho, W. Dai, J. Kim, J. Wei, S. Lee, X. Zheng, P. Xie, A. Kumar, and Y. Yu. Petuum: A new platform for distributed machine learning on big data. In Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '15, pages 1335--1344, 2015.
  13. M. D. Zeiler. ADADELTA: An adaptive learning rate method. CoRR, abs/1212.5701, 2012.
  14. S. Zhang, A. Choromanska, and Y. LeCun. Deep learning with elastic averaging SGD. In Proceedings of the 2015 Conference on Neural Information Processing Systems (NIPS 2015), 2015.

Published in

CIKM '16: Proceedings of the 25th ACM International Conference on Information and Knowledge Management
October 2016
2566 pages
ISBN: 9781450340731
DOI: 10.1145/2983323

Copyright © 2016 ACM
        Publisher

        Association for Computing Machinery

        New York, NY, United States


Acceptance Rates

CIKM '16 paper acceptance rate: 160 of 701 submissions, 23%. Overall acceptance rate: 1,861 of 8,427 submissions, 22%.
