
Research for practice: troubling trends in machine-learning scholarship

Published: 21 May 2019

Abstract

Some ML papers suffer from flaws that could mislead the public and stymie future research.



Published in

Communications of the ACM, Volume 62, Issue 6 (June 2019), 85 pages
ISSN: 0001-0782
EISSN: 1557-7317
DOI: 10.1145/3336127
Copyright © 2019 ACM


Publisher: Association for Computing Machinery, New York, NY, United States
