ABSTRACT
The goal of a visualization system is to facilitate dataset-driven insight discovery. But what if the insights are spurious? Features or patterns in visualizations can be perceived as relevant insights, even though they may arise from noise. We often compare visualizations to a mental image of what we are interested in: a particular trend, distribution, or an unusual pattern. As more visualizations are examined and more comparisons are made, the probability of discovering spurious insights increases. This problem is well known in statistics as the multiple comparisons problem (MCP) but is overlooked in visual analysis. We present a way to evaluate MCP in visualization tools by measuring the accuracy of user-reported insights on synthetic datasets with known ground-truth labels. In our experiment, over 60% of user insights were false. We show how a confirmatory analysis approach that accounts for all visual comparisons, insights and non-insights alike, can achieve results similar to one that requires a validation dataset.
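To illustrate why the chance of a spurious insight grows with the number of comparisons, here is a minimal sketch. It assumes that each visual comparison behaves like an independent hypothesis test at significance level `alpha` and that every null hypothesis is true. The function names are hypothetical, and the Bonferroni correction shown is a standard textbook remedy, not necessarily the confirmatory procedure used in the paper.

```python
def familywise_error_rate(alpha: float, m: int) -> float:
    """Probability of at least one false positive among m independent
    tests of true null hypotheses, each run at level alpha."""
    return 1.0 - (1.0 - alpha) ** m


def bonferroni_threshold(alpha: float, m: int) -> float:
    """Per-comparison level that keeps the family-wise error rate
    at or below alpha across m comparisons (Bonferroni correction)."""
    return alpha / m


if __name__ == "__main__":
    alpha = 0.05
    for m in (1, 5, 20, 100):
        uncorrected = familywise_error_rate(alpha, m)
        corrected = familywise_error_rate(bonferroni_threshold(alpha, m), m)
        print(f"m={m:3d}  uncorrected={uncorrected:.3f}  corrected={corrected:.3f}")
```

Under these assumptions the uncorrected rate rises quickly with `m`, while the corrected rate stays at or below `alpha`. Real visual comparisons are rarely independent, so the numbers should be read as an intuition aid only.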
Index Terms
- Investigating the Effect of the Multiple Comparisons Problem in Visual Analysis