DOI: 10.1145/3301275.3302310

Research article · Public Access · Honorable Mention

Explaining models: an empirical study of how explanations impact fairness judgment

Published: 17 March 2019

ABSTRACT

Ensuring fairness of machine learning systems is a human-in-the-loop process. It relies on developers, users, and the general public to identify fairness problems and make improvements. To facilitate this process, we need effective, unbiased, and user-friendly explanations that people can confidently rely on. Towards that end, we conducted an empirical study with four types of programmatically generated explanations to understand how they impact people's fairness judgments of ML systems. In an experiment involving more than 160 Mechanical Turk workers, we show that: 1) certain explanations are considered inherently less fair, while others can enhance people's confidence in the fairness of the algorithm; 2) different fairness problems, such as model-wide fairness issues versus case-specific fairness discrepancies, may be more effectively exposed through different styles of explanation; and 3) individual differences, including prior positions on and judgment criteria for algorithmic fairness, impact how people react to different styles of explanation. We conclude with a discussion on providing personalized and adaptive explanations to support fairness judgments of ML systems.
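The abstract does not spell out the four explanation styles, so the following is only a loose illustrative sketch, not the paper's implementation: it assumes one style resembles a per-feature influence explanation derived from a linear model's coefficients. The feature names, data, and the `influence_explanation` helper are all hypothetical.

```python
# A minimal sketch (assumed, not the paper's method) of programmatically
# generating a feature-influence explanation for a single prediction.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical recidivism-style features: [age, prior_counts, charge_degree]
X = np.array([[25, 3, 1], [40, 0, 0], [31, 5, 1], [52, 1, 0]], dtype=float)
y = np.array([1, 0, 1, 0])  # toy labels for illustration only

model = LogisticRegression().fit(X, y)

def influence_explanation(model, x, feature_names):
    """Per-feature contribution to the prediction's log-odds
    (coefficient * feature value), sorted by magnitude."""
    contributions = model.coef_[0] * x
    return sorted(zip(feature_names, contributions),
                  key=lambda kv: abs(kv[1]), reverse=True)

names = ["age", "prior_counts", "charge_degree"]
for name, c in influence_explanation(model, X[0], names):
    print(f"{name}: {c:+.3f}")
```

An explanation of this kind could accompany a single prediction so that a person can judge whether the weighting of features seems fair; the interfaces evaluated in the study were presumably richer than this sketch.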


Supplemental Material

p275-dodge.mp4 (MP4, 96.4 MB)


Published in

IUI '19: Proceedings of the 24th International Conference on Intelligent User Interfaces
March 2019, 713 pages
ISBN: 9781450362726
DOI: 10.1145/3301275
      Copyright © 2019 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

      Publisher

      Association for Computing Machinery

      New York, NY, United States



Acceptance Rates

IUI '19 paper acceptance rate: 71 of 282 submissions (25%). Overall acceptance rate: 746 of 2,811 submissions (27%).
