On Improving User Response Times in Tableau

Authors:
Pawel Terlecki, Tableau Software, Seattle, WA, USA
Fei Xu, Tableau Software, Seattle, WA, USA
Marianne Shaw, Tableau Software, Seattle, WA, USA
Valeri Kim, Tableau Software, Seattle, WA, USA
Richard Wesley, Tableau Software, Seattle, WA, USA

SIGMOD '15: Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, May 2015, Pages 1695–1706. https://doi.org/10.1145/2723372.2742799

Published: 27 May 2015

ABSTRACT

Rapidly growing data volumes and increasingly complex analytical tasks pose a major challenge for visualization solutions. The experience must stay highly interactive so that users remain engaged and can explore data insightfully. Query processing usually dominates the cost of generating a visualization. To achieve acceptable response times, one therefore needs to use backend capabilities to the fullest and apply techniques such as caching and prefetching. In this paper we discuss the key data processing components in Tableau: the query processor, the query caches, the Tableau Data Engine [1, 2], and Data Server. We also cover recent performance improvements related to the number and quality of remote queries, broader reuse of cached data, and the application of inter- and intra-query parallelism.
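As a rough illustration of two ideas the abstract names, reuse of cached query results and inter-query parallelism, the following minimal Python sketch may help. It is not Tableau's implementation: the names QueryCache and run_batch, the whitespace-only normalization, and the run_remote callable standing in for a backend are all hypothetical choices made for this example.

```python
from concurrent.futures import ThreadPoolExecutor
import threading


class QueryCache:
    """Hypothetical in-memory result cache keyed by normalized query text."""

    def __init__(self, run_remote):
        self._run_remote = run_remote  # assumed callable: query text -> result
        self._results = {}
        self._lock = threading.Lock()
        self.remote_calls = 0  # counts queries actually sent to the backend

    @staticmethod
    def normalize(query):
        # Naive normalization: collapse runs of whitespace. A real system
        # would compare query structure, not raw text.
        return " ".join(query.split())

    def execute(self, query):
        key = self.normalize(query)
        with self._lock:
            if key in self._results:
                return self._results[key]  # cache hit, no remote call
        result = self._run_remote(key)  # cache miss, ask the backend
        with self._lock:
            self.remote_calls += 1
            return self._results.setdefault(key, result)


def run_batch(cache, queries, max_workers=4):
    """Drop duplicate queries from a batch, then run the rest concurrently."""
    distinct = list(dict.fromkeys(cache.normalize(q) for q in queries))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = dict(zip(distinct, pool.map(cache.execute, distinct)))
    return [results[cache.normalize(q)] for q in queries]
```

In this sketch, a batch containing the same query twice sends it to the backend once, and a later batch that repeats an earlier query is answered from the cache without any remote call. Independent queries in a batch run on separate worker threads, which is one simple form of inter-query parallelism.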

References

[1] Richard Wesley, Matthew Eldridge, and Pawel T. Terlecki. 2011. An analytic data engine for visualization in Tableau. In Proceedings of the 2011 ACM SIGMOD International Conference on Management of Data (SIGMOD '11). ACM, New York, NY, USA, 1185--1194. DOI: http://doi.acm.org/10.1145/1989323.1989449
[2] Richard Michael Grantham Wesley and Pawel Terlecki. 2014. Leveraging compression in the Tableau data engine. In Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data (SIGMOD '14). ACM, New York, NY, USA, 563--573. DOI: http://doi.acm.org/10.1145/2588555.2595639
[3] Boncz, P., Zukowski, M., and Nes, N. MonetDB/X100: Hyper-Pipelining Query Execution. In International Conference on Innovative Data Systems Research (CIDR), Jan. 2005, 225--237.
[4] G. Graefe, "Volcano: An extensible and parallel query evaluation system," IEEE Transactions on Knowledge and Data Engineering, 120--135, 1994.
[5] J. Zhou, P. Larson, and R. Chaiken. Incorporating partitioning and parallel plans into the SCOPE optimizer. In ICDE, 2010.
[6] LibXL. http://www.libxl.com/
[7] Abadi, D. J., Madden, S. R., and Hachem, N. 2008. Column-stores vs. row-stores: how different are they really? In Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data (Vancouver, Canada, June 09 - 12, 2008). SIGMOD '08. ACM, New York, NY, 967--980.
[8] Boncz, P. Monet: A Next-Generation DBMS Kernel For Query-Intensive Applications. Doctoral Thesis, Universiteit van Amsterdam, Amsterdam, The Netherlands, May 2002.
[9] Zukowski, Marcin, and Peter A. Boncz. "Vectorwise: Beyond column stores." IEEE Data Engineering Bulletin 35.1 (2012): 21--27.
[10] Shivnath Babu and Herodotos Herodotou (2013), "Massively Parallel Databases and MapReduce Systems", Foundations and Trends® in Databases: Vol. 5: No. 1, pp 1--104. http://dx.doi.org/10.1561/1900000036
[11] Franz Färber, Norman May, Wolfgang Lehner, Philipp Große, Ingo Müller, Hannes Rauhe, and Jonathan Dees. The SAP HANA Database -- An Architecture Overview. IEEE Data Engineering Bulletin, 35(1):28--33, 2012.
[12] Anikiej K. Multi-core Parallelization of Vectorized Queries (dissertation). University of Warsaw and VU University of Amsterdam, 2010.
[13] P. G. Selinger, M. M. Astrahan, D. D. Chamberlin, R. A. Lorie, and T. G. Price. Access path selection in a relational database management system. In Proceedings of SIGMOD Conference, 1979.
[14] M. Majster-Cederbaum. Elimination of redundant operations in relational queries with general selection operators. Computing, 34(4):303--323, 1984.
[15] A. V. Aho, C. Beeri, and J. D. Ullman. The theory of joins in relational databases. ACM Trans. on Database Systems, 4(3):297--314, 1979.
[16] A. V. Aho, Y. Sagiv, and J. D. Ullman. Efficient optimization of a class of relational expressions. ACM Trans. on Database Systems, 4(4):435--454, 1979.
[17] Nikolaus Ott and Klaus Horländer. Removing redundant join operations in queries involving views. Information Systems, Volume 10, Issue 3, 1985, Pages 279--288.
[18] Y. Sagiv and M. Yannakakis. Equivalences among relational expressions with the union and difference operator. Journal of the ACM, 27(4):633--655, 1980.
[19] Halevy, Alon Y. "Answering queries using views: A survey." The VLDB Journal 10.4 (2001): 270--294.
[20] Sara Cohen, Werner Nutt, and Yehoshua Sagiv. 2003. Containment of Aggregate Queries. In Proceedings of the 9th International Conference on Database Theory (ICDT '03), Diego Calvanese, Maurizio Lenzerini, and Rajeev Motwani (Eds.). Springer-Verlag, London, UK, 111--125.
[21] Chandra A.K., Merlin P.M. Optimal implementation of conjunctive queries in relational databases. In: Proc. Ninth Annual ACM Symposium on Theory of Computing. pp 77--90, 1977.
[22] Zhang X., Ozsoyoglu M.Z. On efficient reasoning with implication constraints. In: Proc. of DOOD. pp 236--252, 1993.
[23] Chaudhuri S., Vardi M. Optimizing real conjunctive queries. In: Proc. of PODS. pp 59--70, Washington D.C., USA, 1993.
[24] Chaudhuri S., Vardi M. On the complexity of equivalence between recursive and nonrecursive datalog programs. In: Proc. of PODS. pp 55--66, Minneapolis, Minn., USA, 1994.
[25] Kolaitis P., Martin D., Thakur M. On the complexity of the containment problem for conjunctive queries with built-in predicates. In: Proc. of PODS. pp 197--204, Seattle, Wash., USA, 1998.
[26] Tsatalos O.G., Solomon M.H., Ioannidis Y.E. The GMAP: a versatile tool for physical data independence. In: Proc. of VLDB. pp 367--378, Santiago, Chile, 1994.
[27] Tsatalos O.G., Solomon M.H., Ioannidis Y.E. The GMAP: a versatile tool for physical data independence. VLDB J. (2):101--118, 1996.
[28] Chaudhuri, S., Krishnamurthy, R., Potamianos, S., and Shim, K. 1995. Optimizing queries with materialized views. In Proceedings of the 11th International Conference on Data Engineering (ICDE). IEEE Computer Society, 190--200.
[29] Goldstein J., Larson P.A. Optimizing queries using materialized views: a practical, scalable solution. In: Proc. of SIGMOD. pp 331--342, 2001.
[30] Jarke M. Common subexpression isolation in multiple query optimization. Query Processing in Database Systems, Kim W, Reiner DS, Batory DS (eds.). Springer: Berlin, 1985.
[31] Park J, Segev A. Using common subexpressions to optimize multiple queries. Proceedings of the 4th International Conference on Data Engineering. IEEE Computer Society: Washington, DC, 1988; 311--319.
[32] Sellis T. Multiple query optimization. ACM Transactions on Database Systems 1988; 13(1):23--52.
[33] Cosar A, Lim E, Srivastava J. Multiple query optimization with depth-first branch-and-bound and dynamic query ordering. CIKM 93, Proceedings of the Second International Conference on Information and Knowledge Management. ACM, 1993; 433--438.
[34] Chen F, Dunham M. Common subexpression processing in multiple-query processing. IEEE Transactions on Knowledge and Data Engineering 1998; 10(3):493--499.
[35] Roy P et al. Efficient and extensible algorithms for multi query optimization. Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data. ACM Press: New York, 2000; 249--260.
[36] Tan K, Lu H. Workload scheduling for multiple query processing. Information Processing Letters 1995; 55(5):251--257.
[37] Tan K, Lu H. Scheduling multiple queries in symmetric multiprocessors. Information Sciences 1996; 95(1/2):125--153.
[38] Dalvi N et al. Pipelining in multi-query optimization. J. Comput. Syst. Sci. 2003; 66(4):728--762.
[39] O'Gorman, Kevin, Amr El Abbadi, and Divyakant Agrawal. "Multiple query optimization in middleware using query teamwork." Software: Practice and Experience 35.4 (2005): 361--391.
[40] Stolte, C., Tang, D., and Hanrahan, P. 2008. Polaris: a system for query, analysis, and visualization of multidimensional databases. Commun. ACM 51, 11 (Nov. 2008), 75--84.
[41] http://redis.io/
[42] Lakshman, Avinash, and Prashant Malik. "Cassandra: a decentralized structured storage system." ACM SIGOPS Operating Systems Review 44.2 (2010): 35--40.
[43] https://www.faa.gov/data_research/
[44] Milena G. Ivanova, Martin L. Kersten, Niels J. Nes, and Romulo A.P. Gonçalves. 2009. An architecture for recycling intermediates in a column-store. In Proceedings of the 2009 ACM SIGMOD International Conference on Management of Data (SIGMOD '09), Carsten Binnig and Benoit Dageville (Eds.). ACM, New York, NY, USA, 309--320.
[45] Parag Agrawal, Daniel Kifer, and Christopher Olston. Scheduling shared scans of large data files. Proceedings of the VLDB Endowment, v.1 n.1, August 2008.
[46] Prasanth Jayachandran, Karthik Tunga, Niranjan Kamat, and Arnab Nandi. Combining User Interaction, Speculative Query Execution and Sampling in the DICE System. PVLDB 7(13): 1697--1700 (2014).
[47] Kristi Morton, Ross Bunker, Jock Mackinlay, Robert Morton, and Chris Stolte. 2012. Dynamic workload driven data integration in Tableau. In Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data (SIGMOD '12). ACM, New York, NY, USA, 807--816. http://doi.acm.org/10.1145/2213836.2213961
[48] Shaul Dar, Michael J. Franklin, Björn Þór Jónsson, Divesh Srivastava, and Michael Tan. Semantic Data Caching and Replacement. Proceedings of the 22nd International Conference on Very Large Data Bases, p. 330--341, September 03-06, 1996.

Index Terms

Information systems

Published in
SIGMOD '15: Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data
May 2015
2110 pages
ISBN: 9781450327589
DOI: 10.1145/2723372
General Chair: Timos Sellis, RMIT University, Australia
Program Chairs: Susan B. Davidson, University of Pennsylvania, USA; Zack Ives, University of Pennsylvania, USA
Copyright © 2015 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
column store
concurrency
data server
data visualization
query batching
tableau data engine
Qualifiers
- research-article
Acceptance Rates
SIGMOD '15 Paper Acceptance Rate: 106 of 415 submissions, 26%
Overall Acceptance Rate: 785 of 4,003 submissions, 20%