DOI: 10.1145/3209950.3209958
research-article

Snowtrail: Testing with Production Queries on a Cloud Database

Published: 15 June 2018

ABSTRACT

Database as a service provided on cloud computing platforms has been rapidly gaining popularity in recent years. The Snowflake Elastic Data Warehouse (henceforth referred to as Snowflake) is a cloud database service provided by Snowflake Computing. The cloud native capabilities of new database services such as Snowflake bring exciting new opportunities for database testing. First, Snowflake maintains extensive knowledge of historical customer queries, including both the query text and corresponding system configurations. Second, Snowflake is multi-tenant, which provides easy access to metadata and data that can be used to rerun customer queries from a privileged role. Furthermore, the elastic nature of Snowflake's data warehouse service allows testing with these queries using a separate set of resources without impacting the customer's production workload.

This paper presents Snowtrail, an infrastructure developed within Snowflake for testing using customer production queries with result obfuscation. Running tests with production queries provides us with direct insight into the impact of improvements and new features on customer workloads. It enables testing on queries of more shapes and complexity than can be manually constructed by developers. Snowtrail is also used to help ensure the stability of the online upgrade process of the system.
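The abstract does not say how Snowtrail obfuscates results. As a minimal sketch of one way such a comparison could work, the snippet below assumes that each result set is reduced to an order-insensitive hash, so a baseline run and a candidate run can be compared without retaining or displaying customer data. The names `result_fingerprint` and `same_results` and the sample rows are hypothetical and are not taken from the paper.

```python
import hashlib
from typing import Iterable, Sequence


def result_fingerprint(rows: Iterable[Sequence[object]]) -> str:
    """Return an order-insensitive SHA-256 digest of a result set.

    Hypothetical helper: each row is hashed on its own, the row digests
    are sorted, and the sorted digests are hashed again. Only the final
    digest needs to be kept, never the raw values.
    """
    row_digests = sorted(
        hashlib.sha256(repr(tuple(row)).encode("utf-8")).hexdigest()
        for row in rows
    )
    combined = hashlib.sha256()
    for digest in row_digests:
        combined.update(digest.encode("ascii"))
    return combined.hexdigest()


def same_results(baseline_rows, candidate_rows) -> bool:
    """Compare two runs of the same query by fingerprint only."""
    return result_fingerprint(baseline_rows) == result_fingerprint(candidate_rows)


# Made-up rows standing in for the output of one query on two builds.
baseline = [(1, "alice"), (2, "bob")]
candidate_reordered = [(2, "bob"), (1, "alice")]
candidate_changed = [(1, "alice"), (2, "bobby")]

print(same_results(baseline, candidate_reordered))
print(same_results(baseline, candidate_changed))
```

Sorting the per-row digests makes the comparison insensitive to row order, which suits queries without an `ORDER BY`; a query with a required ordering would need an order-preserving variant.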

        Published in

        DBTest'18: Proceedings of the Workshop on Testing Database Systems
        June 2018
        49 pages
        ISBN: 9781450358262
        DOI: 10.1145/3209950

        Copyright © 2018 ACM

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Qualifiers

        • research-article
        • Research
        • Refereed limited

        Acceptance Rates

        Overall Acceptance Rate: 31 of 56 submissions, 55%
