Abstract
Some ML papers suffer from flaws that could mislead the public and stymie future research.
Index Terms
- Research for practice: troubling trends in machine-learning scholarship