Abstract
One-dimensional range queries, as one of the most basic type of queries in databases, have been studied extensively in the literature. For large databases, the goal is to build an external index that is optimized for disk block accesses (or I/Os). The problem is well understood in the static case. Theoretically, there exists an index of linear size that can answer a range query in O(1 + KB) I/Os, where K is the output size and B is the disk block size, but it is highly impractical. In practice, the standard solution is the B-tree, which answers a query in O(logB NM + KB) I/Os on a data set of size N, where M is the main memory size. For typical values of N, M, and B, logB NM can be considered a constant.
However, the problem is still wide open in the dynamic setting, when insertions and deletions of records are to be supported. With smart buffering, it is possible to speed up updates significantly to o(1) I/Os amortized. Indeed, several dynamic B-trees have been proposed, but they all cause certain levels of degradation in the query performance, with the most interesting tradeoff point at O(1B log NM) I/Os for updates and O(log NM + KB) I/Os for queries. In this article, we prove that the query-update tradeoffs of all the known dynamic B-trees are optimal, when logB NM is a constant. This implies that one should not hope for substantially better solutions for all practical values of the parameters. Our lower bounds hold in a dynamic version of the indexability model, which is of independent interests. Dynamic indexability is a clean yet powerful model for studying dynamic indexing problems, and can potentially lead to more interesting lower bound results.
- Aggarwal, A. and Vitter, J. S. 1988. The input/output complexity of sorting and related problems. Commun. ACM 31, 9, 1116--1127. Google ScholarDigital Library
- Alstrup, S., Brodal, G., and Rauhe, T. 2001. Optimal static range reporting in one dimension. In Proceedings of the ACM Symposium on Theory of Computing. 476--482. Google ScholarDigital Library
- Andersson, A., and Thorup, M. 2007. Dynamic ordered sets with exponential search trees. Journal of the ACM 53, 3. Google ScholarDigital Library
- Arge, L. 2003. The buffer tree: A technique for designing batched external data structures. Algorithmica 37, 1, 1--24.Google ScholarDigital Library
- Arge, L., Samoladas, V., and Vitter, J. S. 1999. On two-dimensional indexability and optimal range search indexing. In Proceedings of the ACM Symposium on Principles of Database Systems. 346--357. Google ScholarDigital Library
- Arge, L., Samoladas, V., and Yi, K. 2009. Optimal external memory planar point enclosure. Algorithmica 54, 3, 337--352. Google ScholarDigital Library
- Bayer, R. and McCreight, E. 1972. Organization and maintenance of large ordered indexes. Acta Informatica 1, 173--189.Google ScholarDigital Library
- Beame, P. and Fich, F. E. 2002. Optimal bounds for the predecessor problem and related problems. J. Comput. Sys. Scie. 65, 1, 38--72. Google ScholarDigital Library
- Bender, M. A., Farach-Colton, M., Fineman, J. T., Fogel, Y., Kuszmaul, B. C., and Nelson, J. 2007. Cache-oblivious streaming B-trees. In Proceedings of the ACM Symposium on Parallelism in Algorithms and Architectures. Google ScholarDigital Library
- Bentley, J. L. and Saxe, J. B. 1980. Decomposable searching problems I: Static-to-dynamic transformation. J. Algorithms 1, 301--358.Google ScholarCross Ref
- Brodal, G. S. and Fagerberg, R. 2003. Lower bounds for external memory dictionaries. In Proceedings of the ACM-SIAM Symposium on Discrete Algorithms. 546--554. Google ScholarDigital Library
- Buchsbaum, A. L., Goldwasser, M., Venkatasubramanian, S., and Westbrook, J. R. 2000. On external memory graph traversal. In Proceedings of the ACM-SIAM Symposium on Discrete Algorithms. 859--860. Google ScholarDigital Library
- Chazelle, B., and Guibas, L. J. 1986. Fractional cascading: I. A data structuring technique. Algorithmica 1, 133--162.Google ScholarDigital Library
- Hellerstein, J. M., Koutsoupias, E., and Papadimitriou, C. H. 1997. On the analysis of indexing schemes. In Proceedings of the ACM Symposium on Principles of Database Systems. 249--256. Google ScholarDigital Library
- Hellerstein, J. M., Koutsoupias, E., Miranker, D., Papadimitriou, C. H., and Samoladas, V. 2002. On a model of indexability and its bounds for range queries. J. ACM 49, 1, 35--55. Google ScholarDigital Library
- Iacono, J., and Pǎtraşcu, M. 2012. Using hashing to solve the dictionary problem (in external memory). In Proceedings of the ACM-SIAM Symposium on Discrete Algorithms. Google ScholarDigital Library
- Jagadish, H. V., Narayan, P. P. S., Seshadri, S., Sudarshan, S., and Kanneganti, R. 1997. Incremental organization for data recording and warehousing. In Proceedings of the International Conference on Very Large Data Bases. 16--25. Google ScholarDigital Library
- Jermaine, C., Datta, A., and Omiecinski, E. 1999. A novel index supporting high volume data waresshouse insertion. In Proceedings of the International Conference on Very Large Data Bases. 235--246. Google ScholarDigital Library
- Koutsoupias, E., and Taylor, D. S. 1998. Tight bounds for 2-dimensional indexing schemes. In Proceedings of the ACM Symposium on Principles of Database Systems. 52--58. Google ScholarDigital Library
- Mortensen, C. W., Pagh, R., and Pǎtraşcu, M. 2005. On dynamic range reporting in one dimension. In Proceedings of the ACM Symposium on Theory of Computing. 104--111. Google ScholarDigital Library
- O’Neil, P., Cheng, E., Gawlick, D., and O’Neil, E. 1996. The log-structured merge-tree (LSM-tree). Acta Inf. 33, 4, 351--385. Google ScholarDigital Library
- Samoladas, V. and Miranker, D. 1998. A lower bound theorem for indexing schemes and its application to multidimensional range queries. In Proceedings of the ACM Symposium on Principles of Database Systems. 44--51. Google ScholarDigital Library
- Yao, A. C. 1981. Should tables be sorted? J. ACM 28, 3, 615--628. Google ScholarDigital Library
Index Terms
- Dynamic Indexability and the Optimality of B-Trees
Recommendations
Dynamic indexability and lower bounds for dynamic one-dimensional range query indexes
PODS '09: Proceedings of the twenty-eighth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systemsThe B-tree is a fundamental external index structure that is widely used for answering one-dimensional range reporting queries. Given a set of N keys, a range query can be answered in O(logB NoverM + KoverB) I/Os, where B is the disk block size, K the ...
Indexability of restless bandit problems and optimality of Whittle index for dynamic multichannel access
In this paper, we consider a class of restless multiarmed bandit processes (RMABs) that arises in dynamic multichannel access, user/server scheduling, and optimal activation in multiagent systems. For this class of RMABs, we establish the indexability ...
A new range query algorithm for universal B-trees
In multi-dimensional databases the essential tool for accessing data is the range query (or window query). In this paper we introduce a new algorithm of processing range query in universal B-tree (UB-tree), which is an index structure for searching in ...
Comments