skip to main content
research-article

Dynamic Indexability and the Optimality of B-Trees

Published:01 August 2012Publication History
Skip Abstract Section

Abstract

One-dimensional range queries, as one of the most basic type of queries in databases, have been studied extensively in the literature. For large databases, the goal is to build an external index that is optimized for disk block accesses (or I/Os). The problem is well understood in the static case. Theoretically, there exists an index of linear size that can answer a range query in O(1 + KB) I/Os, where K is the output size and B is the disk block size, but it is highly impractical. In practice, the standard solution is the B-tree, which answers a query in O(logB NM + KB) I/Os on a data set of size N, where M is the main memory size. For typical values of N, M, and B, logB NM can be considered a constant.

However, the problem is still wide open in the dynamic setting, when insertions and deletions of records are to be supported. With smart buffering, it is possible to speed up updates significantly to o(1) I/Os amortized. Indeed, several dynamic B-trees have been proposed, but they all cause certain levels of degradation in the query performance, with the most interesting tradeoff point at O(1B log NM) I/Os for updates and O(log NM + KB) I/Os for queries. In this article, we prove that the query-update tradeoffs of all the known dynamic B-trees are optimal, when logB NM is a constant. This implies that one should not hope for substantially better solutions for all practical values of the parameters. Our lower bounds hold in a dynamic version of the indexability model, which is of independent interests. Dynamic indexability is a clean yet powerful model for studying dynamic indexing problems, and can potentially lead to more interesting lower bound results.

References

  1. Aggarwal, A. and Vitter, J. S. 1988. The input/output complexity of sorting and related problems. Commun. ACM 31, 9, 1116--1127. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. Alstrup, S., Brodal, G., and Rauhe, T. 2001. Optimal static range reporting in one dimension. In Proceedings of the ACM Symposium on Theory of Computing. 476--482. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. Andersson, A., and Thorup, M. 2007. Dynamic ordered sets with exponential search trees. Journal of the ACM 53, 3. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. Arge, L. 2003. The buffer tree: A technique for designing batched external data structures. Algorithmica 37, 1, 1--24.Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. Arge, L., Samoladas, V., and Vitter, J. S. 1999. On two-dimensional indexability and optimal range search indexing. In Proceedings of the ACM Symposium on Principles of Database Systems. 346--357. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. Arge, L., Samoladas, V., and Yi, K. 2009. Optimal external memory planar point enclosure. Algorithmica 54, 3, 337--352. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. Bayer, R. and McCreight, E. 1972. Organization and maintenance of large ordered indexes. Acta Informatica 1, 173--189.Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. Beame, P. and Fich, F. E. 2002. Optimal bounds for the predecessor problem and related problems. J. Comput. Sys. Scie. 65, 1, 38--72. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Bender, M. A., Farach-Colton, M., Fineman, J. T., Fogel, Y., Kuszmaul, B. C., and Nelson, J. 2007. Cache-oblivious streaming B-trees. In Proceedings of the ACM Symposium on Parallelism in Algorithms and Architectures. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. Bentley, J. L. and Saxe, J. B. 1980. Decomposable searching problems I: Static-to-dynamic transformation. J. Algorithms 1, 301--358.Google ScholarGoogle ScholarCross RefCross Ref
  11. Brodal, G. S. and Fagerberg, R. 2003. Lower bounds for external memory dictionaries. In Proceedings of the ACM-SIAM Symposium on Discrete Algorithms. 546--554. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Buchsbaum, A. L., Goldwasser, M., Venkatasubramanian, S., and Westbrook, J. R. 2000. On external memory graph traversal. In Proceedings of the ACM-SIAM Symposium on Discrete Algorithms. 859--860. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. Chazelle, B., and Guibas, L. J. 1986. Fractional cascading: I. A data structuring technique. Algorithmica 1, 133--162.Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. Hellerstein, J. M., Koutsoupias, E., and Papadimitriou, C. H. 1997. On the analysis of indexing schemes. In Proceedings of the ACM Symposium on Principles of Database Systems. 249--256. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. Hellerstein, J. M., Koutsoupias, E., Miranker, D., Papadimitriou, C. H., and Samoladas, V. 2002. On a model of indexability and its bounds for range queries. J. ACM 49, 1, 35--55. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. Iacono, J., and Pǎtraşcu, M. 2012. Using hashing to solve the dictionary problem (in external memory). In Proceedings of the ACM-SIAM Symposium on Discrete Algorithms. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. Jagadish, H. V., Narayan, P. P. S., Seshadri, S., Sudarshan, S., and Kanneganti, R. 1997. Incremental organization for data recording and warehousing. In Proceedings of the International Conference on Very Large Data Bases. 16--25. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Jermaine, C., Datta, A., and Omiecinski, E. 1999. A novel index supporting high volume data waresshouse insertion. In Proceedings of the International Conference on Very Large Data Bases. 235--246. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Koutsoupias, E., and Taylor, D. S. 1998. Tight bounds for 2-dimensional indexing schemes. In Proceedings of the ACM Symposium on Principles of Database Systems. 52--58. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. Mortensen, C. W., Pagh, R., and Pǎtraşcu, M. 2005. On dynamic range reporting in one dimension. In Proceedings of the ACM Symposium on Theory of Computing. 104--111. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. O’Neil, P., Cheng, E., Gawlick, D., and O’Neil, E. 1996. The log-structured merge-tree (LSM-tree). Acta Inf. 33, 4, 351--385. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. Samoladas, V. and Miranker, D. 1998. A lower bound theorem for indexing schemes and its application to multidimensional range queries. In Proceedings of the ACM Symposium on Principles of Database Systems. 44--51. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. Yao, A. C. 1981. Should tables be sorted? J. ACM 28, 3, 615--628. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Dynamic Indexability and the Optimality of B-Trees

        Recommendations

        Comments

        Login options

        Check if you have access through your login credentials or your institution to get full access on this article.

        Sign in

        Full Access

        • Published in

          cover image Journal of the ACM
          Journal of the ACM  Volume 59, Issue 4
          August 2012
          112 pages
          ISSN:0004-5411
          EISSN:1557-735X
          DOI:10.1145/2339123
          Issue’s Table of Contents

          Copyright © 2012 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 1 August 2012
          • Accepted: 1 December 2011
          • Revised: 1 May 2011
          • Received: 1 October 2009
          Published in jacm Volume 59, Issue 4

          Permissions

          Request permissions about this article.

          Request Permissions

          Check for updates

          Qualifiers

          • research-article
          • Research
          • Refereed

        PDF Format

        View or Download as a PDF file.

        PDF

        eReader

        View online with eReader.

        eReader