ABSTRACT
This paper presents an empirical study of distributed deep learning for two question answering subtasks: answer selection and question classification. It compares the SGD, MSGD, ADADELTA, ADAGRAD, ADAM/ADAMAX, RMSPROP, DOWNPOUR, and EASGD/EAMSGD algorithms. Experimental results show that a distributed framework based on the Message Passing Interface (MPI) accelerates convergence, with a speedup that grows sublinearly in the number of workers. These results demonstrate the importance of distributed training: with 48 workers, a 24x speedup is achievable for the answer selection task, reducing the running time from 138.2 hours to 5.81 hours and significantly increasing productivity.
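The speedup figure quoted in the abstract can be checked with a few lines of arithmetic. The sketch below is only an illustration: the helper names `speedup` and `parallel_efficiency` are hypothetical and do not come from the paper, and "efficiency" is taken here to mean speedup divided by the number of workers, which is one common way to quantify sublinear scaling.

```python
def speedup(serial_hours: float, parallel_hours: float) -> float:
    """Ratio of single-worker running time to distributed running time."""
    return serial_hours / parallel_hours


def parallel_efficiency(serial_hours: float, parallel_hours: float, workers: int) -> float:
    """Speedup per worker; 1.0 would mean perfectly linear scaling."""
    return speedup(serial_hours, parallel_hours) / workers


# Figures quoted in the abstract for the answer selection task.
SERIAL_HOURS = 138.2
PARALLEL_HOURS = 5.81
WORKERS = 48

s = speedup(SERIAL_HOURS, PARALLEL_HOURS)
e = parallel_efficiency(SERIAL_HOURS, PARALLEL_HOURS, WORKERS)

print(f"speedup: {s:.1f}x")
print(f"efficiency: {e:.0%}")
```

Under these assumptions the speedup comes out just under 24x and the per-worker efficiency at roughly one half, which agrees with the abstract's description of sublinear scaling.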
Index Terms
Distributed Deep Learning for Question Answering