skip to main content
10.1145/1247480.1247566acmconferencesArticle/Chapter ViewAbstractPublication PagesmodConference Proceedingsconference-collections
Article

Lazy, adaptive rid-list intersection, and its application to index anding

Published:11 June 2007Publication History

ABSTRACT

RID-List (row id list) intersection is a common strategy in query processing, used in star joins, column stores, and even search engines. To apply a conjunction of predicates on a table, a query process ordoes index lookups to form sorted RID-lists (or bitmap) of the rows matching each predicate, then intersects the RID-lists via an AND-tree, and finally fetches the corresponding rows to apply any residual predicates and aggregates.

This process can be expensive when the RID-lists are large. Furthermore, the performance is sensitive to the order in which RID lists are intersected together, and to treating the right predicates as residuals. If the optimizer chooses a wrong order or a wrong residual, due to a poor cardinality estimate, the resulting plan can run orders of magnitude slower than expected.

We present a new algorithm for RID-list intersection that is both more efficient and more robust than this standard algorithm. First, we avoid forming the RID-lists up front, and instead form this lazily as part of the intersection. This reduces the associated IO and sort cost significantly, especially when the data distribution is skewed. It also ameliorates the problem of wrong residual table selection. Second, we do not intersect the RID-lists via an AND-tree, because this is vulnerable to cardinality mis-estimations. Instead, we use an adaptive set intersection algorithm that performs well even when the cardinality estimates are wrong.

We present detailed experiments of this algorithm on data with varying distributions to validate its efficiency and predictability.

References

  1. R. Avnur and J. Hellerstein. Eddies: Continuously adaptive query processing. In SIGMOD, 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. A. Balmin, T. Eliaz, J. Hornibrook, L. Lim, G. M. Lohman, D. Simmen, M. Wang, and C. Zhang. Cost-based optimization in db2 xml. IBM Systems Journal, 45(2), 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. J. Barbay and C. Kenyon. Adaptive intersection and t-threshold problems. In SODA, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. J. Barbay, A. López-Ortiz, and T. Lu. Faster adaptive set intersections for text searching. In Intl. Workshop on Experimental Algorithms, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. N. Bruno, N. Koudas, and D. Srivastava. Holistic twig joins: optimal xml pattern matching. In SIGMOD, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. E. Demaine, A. López-Ortiz, and J. Munro. Adaptive set intersections, unions, and differences. In SODA, 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. Y. Ioannidis and S. Christodoulakis. On the propagation of errors in the size of join results. In SIGMOD, 1991.Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. M. Kamath and K. Ramamritham. Bucket skip merge join: A scalable algorithm for join processing in very large databases using indexes. Technical Report UM-CS-1996-020, University of Massachusetts at Amherst, 1996. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. C. Mohan, Don Haderle, Yun Wang, and Josephine Cheng. Single table access using multiple indexes: Optimization, execution, and concurrency control techniques. In Proceedings of the International Conference on Extending Database Technology, 1990.Google ScholarGoogle ScholarCross RefCross Ref
  10. Shivnath Babu, Rajeev Motwani, Kamesh Mungala, Itaru Nishizawa, and Jennifer Widom. Adaptive ordering of pipelined streamfilters. In SIGMOD, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. P. O'Neil and D. Quass. Improved query performance with variant indexes. In SIGMOD, 1997.Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Oracle White Paper. Query Optimization in Oracle Database 10g Release 2, June 2005.Google ScholarGoogle Scholar
  13. SAP. SAP Business Information Warehouse Benchmark.Google ScholarGoogle Scholar
  14. M. Stillger, G. Lohman, V. Markl, and M. Kandil. LEO: DB2's LEarning Optimizer. In VLDB, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. Mike Stonebraker, Daniel Abadi, Adam Batkin, Xuedong Chen, Mitch Cherniack, Miguel Ferreira, Edmond Lau, Amerson Lin, Sam Madden, Elizabeth O'Neil, Pat O'Neil, Alex Rasin, Nga Tran, and Stan Zdonik. C-Store: A Column-oriented DBMS. In VLDB, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Lazy, adaptive rid-list intersection, and its application to index anding

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in
    • Published in

      cover image ACM Conferences
      SIGMOD '07: Proceedings of the 2007 ACM SIGMOD international conference on Management of data
      June 2007
      1210 pages
      ISBN:9781595936868
      DOI:10.1145/1247480
      • General Chairs:
      • Lizhu Zhou,
      • Tok Wang Ling,
      • Program Chair:
      • Beng Chin Ooi

      Copyright © 2007 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 11 June 2007

      Permissions

      Request permissions about this article.

      Request Permissions

      Check for updates

      Qualifiers

      • Article

      Acceptance Rates

      Overall Acceptance Rate785of4,003submissions,20%

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader