DataTone: Managing Ambiguity in Natural Language Interfaces for Data Visualization

Authors:
Tong Gao, University of Michigan, Ann Arbor, MI, USA
Mira Dontcheva, Adobe Research, San Francisco, CA, USA
Eytan Adar, University of Michigan, Ann Arbor, MI, USA
Zhicheng Liu, Adobe Research, San Francisco, CA, USA
Karrie G. Karahalios, University of Illinois, Urbana, IL, USA

UIST '15: Proceedings of the 28th Annual ACM Symposium on User Interface Software & Technology, November 2015, Pages 489–500
https://doi.org/10.1145/2807442.2807478

Published: 5 November 2015

ABSTRACT

Answering questions with data is a difficult and time-consuming process. Visual dashboards and templates make it easy to get started, but asking more sophisticated questions often requires learning a tool designed for expert analysts. Natural language interaction allows users to ask questions directly in complex programs without having to learn how to use an interface. However, natural language is often ambiguous. In this work we propose a mixed-initiative approach to managing ambiguity in natural language interfaces for data visualization. We model ambiguity throughout the process of turning a natural language query into a visualization and use algorithmic disambiguation coupled with interactive ambiguity widgets. These widgets allow the user to resolve ambiguities by surfacing system decisions at the point where the ambiguity matters. Corrections are stored as constraints and influence subsequent queries. We have implemented these ideas in a system, DataTone. In a comparative study, we find that DataTone is easy to learn and lets users ask questions without worrying about syntax and proper question form.
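
The mechanism the abstract outlines (ranked candidate interpretations, a user override at the point of ambiguity, and corrections kept as constraints for later queries) can be illustrated with a minimal sketch. This is not DataTone's implementation: the `Ambiguity` and `Session` classes, the scores, and the column names such as `total_medals` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Ambiguity:
    """One decision point: a query phrase with several candidate readings."""
    phrase: str
    candidates: dict[str, float]  # candidate reading -> heuristic score (made up)

    def ranked(self) -> list[str]:
        # Highest score first; ties broken alphabetically for determinism.
        return sorted(self.candidates, key=lambda c: (-self.candidates[c], c))


@dataclass
class Session:
    """Remembers user corrections as constraints for later queries."""
    constraints: dict[str, str] = field(default_factory=dict)

    def resolve(self, amb: Ambiguity) -> str:
        # A stored correction overrides the algorithmic ranking,
        # provided it is still a valid reading of this phrase.
        chosen = self.constraints.get(amb.phrase)
        if chosen in amb.candidates:
            return chosen
        return amb.ranked()[0]

    def correct(self, amb: Ambiguity, choice: str) -> None:
        # Stands in for the user picking another option in a widget.
        if choice not in amb.candidates:
            raise ValueError(f"{choice!r} is not a reading of {amb.phrase!r}")
        self.constraints[amb.phrase] = choice


# Hypothetical data: the phrase "medals" could map to either column.
session = Session()
first = Ambiguity("medals", {"total_medals": 0.6, "gold_medals": 0.4})
default_choice = session.resolve(first)   # system's best guess
session.correct(first, "gold_medals")     # user overrides the guess
later = Ambiguity("medals", {"total_medals": 0.7, "gold_medals": 0.3})
later_choice = session.resolve(later)     # earlier correction carries over
```

Keying constraints by phrase is the simplest possible choice here; a real system would likely need richer constraint scoping, which the abstract does not specify.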

Index Terms

1. Human-centered computing
  1. Human computer interaction (HCI)

Published in
UIST '15: Proceedings of the 28th Annual ACM Symposium on User Interface Software & Technology
November 2015
686 pages
ISBN: 9781450337793
DOI: 10.1145/2807442
General Chair: Celine Latulipe, UNC Charlotte, USA
Program Chairs: Bjoern Hartmann, UC Berkeley, USA; Tovi Grossman, Autodesk Research, Canada
Copyright © 2015 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
mixed-initiative interfaces
natural language interaction
visualization
Qualifiers
- research-article
Acceptance Rates
UIST '15 Paper Acceptance Rate: 70 of 297 submissions, 24%
Overall Acceptance Rate: 842 of 3,967 submissions, 21%