DOI: 10.1145/3301275.3302310

Research article · Public Access · Honorable Mention

Explaining models: an empirical study of how explanations impact fairness judgment

Published: 17 March 2019

ABSTRACT

Ensuring fairness of machine learning systems is a human-in-the-loop process. It relies on developers, users, and the general public to identify fairness problems and make improvements. To facilitate this process, we need effective, unbiased, and user-friendly explanations that people can confidently rely on. Towards that end, we conducted an empirical study with four types of programmatically generated explanations to understand how they impact people's fairness judgments of ML systems. In an experiment involving more than 160 Mechanical Turk workers, we show that: 1) certain explanations are considered inherently less fair, while others can enhance people's confidence in the fairness of the algorithm; 2) different fairness problems, such as model-wide fairness issues versus case-specific fairness discrepancies, may be more effectively exposed through different styles of explanation; and 3) individual differences, including prior positions on and judgment criteria for algorithmic fairness, impact how people react to different styles of explanation. We conclude with a discussion on providing personalized and adaptive explanations to support fairness judgments of ML systems.
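The abstract does not spell out the four explanation styles, so the following is only a loose illustrative sketch, not the paper's implementation: it assumes one style resembles a per-feature influence explanation derived from a linear model's coefficients. The feature names, data, and the `influence_explanation` helper are all hypothetical.

```python
# A minimal sketch (assumed, not the paper's method) of programmatically
# generating a feature-influence explanation for a single prediction.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical recidivism-style features: [age, prior_counts, charge_degree]
X = np.array([[25, 3, 1], [40, 0, 0], [31, 5, 1], [52, 1, 0]], dtype=float)
y = np.array([1, 0, 1, 0])  # toy labels for illustration only

model = LogisticRegression().fit(X, y)

def influence_explanation(model, x, feature_names):
    """Per-feature contribution to the prediction's log-odds
    (coefficient * feature value), sorted by magnitude."""
    contributions = model.coef_[0] * x
    return sorted(zip(feature_names, contributions),
                  key=lambda kv: abs(kv[1]), reverse=True)

names = ["age", "prior_counts", "charge_degree"]
for name, c in influence_explanation(model, X[0], names):
    print(f"{name}: {c:+.3f}")
```

An explanation of this kind could accompany a single prediction so that a person can judge whether the weighting of features seems fair; the interfaces evaluated in the study were presumably richer than this sketch.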


Supplemental Material

p275-dodge.mp4 (MP4, 96.4 MB)


Published in

IUI '19: Proceedings of the 24th International Conference on Intelligent User Interfaces
March 2019, 713 pages
ISBN: 9781450362726
DOI: 10.1145/3301275
      Copyright © 2019 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

      Publisher

      Association for Computing Machinery

      New York, NY, United States



Acceptance Rates

IUI '19 paper acceptance rate: 71 of 282 submissions (25%). Overall acceptance rate: 746 of 2,811 submissions (27%).
