skip to main content
research-article

Cache-oblivious databases: Limitations and opportunities

Published: 24 June 2008 Publication History

Abstract

Cache-oblivious techniques, proposed in the theory community, have optimal asymptotic bounds on the amount of data transferred between any two adjacent levels of an arbitrary memory hierarchy. Moreover, this optimal performance is achieved without any hardware platform specific tuning. These properties are highly attractive to autonomous databases, especially because the hardware architectures are becoming increasingly complex and diverse.
In this article, we present our design, implementation, and evaluation of the first cache-oblivious in-memory query processor, EaseDB. Moreover, we discuss the inherent limitations of the cache-oblivious approach as well as the opportunities given by the upcoming hardware architectures. Specifically, a cache-oblivious technique usually requires sophisticated algorithm design to achieve a comparable performance to its cache-conscious counterpart. Nevertheless, this development-time effort is compensated by the automaticity of performance achievement and the reduced ownership cost. Furthermore, this automaticity enables cache-oblivious techniques to outperform their cache-conscious counterparts in multi-threading processors.

References

[1]
Ailamaki, A. 2005. Database architectures for new hardware. In Proceedings of the 21st International Conference on Data Engineering (ICDE '05). IEEE Computer Society, 1148.
[2]
Ailamaki, A., DeWitt, D. J., Hill, M. D., and Skounakis, M. 2001. Weaving relations for cache performance. In Proceedings of the 27th International Conference on Very Large Data Bases (VLDB '01). Morgan Kaufmann Publishers Inc., San Francisco, CA, 169--180.
[3]
Ailamaki, A., DeWitt, D. J., Hill, M. D., and Wood, D. A. 1999. DBMSs on a modern processor: Where does time go? In Proceedings of the 25th International Conference on Very Large Data Bases (VLDB '99) Morgan Kaufmann Publishers Inc., San Francisco, CA, 266--277.
[4]
AMD Corp. 2005. Software Optimization Guide for AMD64 Processors.
[5]
Baulier, J., Bohannon, P., Gogate, S., Gupta, C., and Haldar, S. 1999. Datablitz storage manager: Main-memory database performance for critical applications. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD '99). ACM Press, New York, NY, 519--520.
[6]
Bender, M. A., Demaine, E. D., and Farach-Colton, M. 2000. Cache-oblivious B-trees. In Proceedings of the 41st Annual Symposium on Foundations of Computer Science (FOCS '00). IEEE Computer Society, 399.
[7]
Bender, M. A., Duan, Z., Iacono, J., and Wu, J. 2004. A locality-preserving cache-oblivious dynamic dictionary. J. Algorithms 53, 2, 115--136.
[8]
Bender, M. A., Farach-Colton, M., and Kuszmaul, B. C. 2006. Cache-oblivious string B-trees. In Proceedings of the 25th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems (PODS '06). ACM Press, New York, NY, 233--242.
[9]
Berrendorf, R., Ziegler, H., and Mohr, B. 2002. PCL: Performance Counter Library. http://www.fz-juelich.de/zam/PCL/.
[10]
Bohannon, P., Mcllroy, P., and Rastogi, R. 2001. Main-memory index structures with fixed-size partial keys. In Proceedings of the 2001 ACM SIGMOD international conference on Management of data (SIGMOD '01). ACM Press, New York, NY, 163--174.
[11]
Boncz, P. A., Manegold, S., and Kersten, M. L. 1999. Database architecture optimized for the new bottleneck: Memory access. In Proceedings of the 25th International Conference on Very Large Data Bases (VLDB '99). Morgan Kaufmann Publishers Inc., San Francisco, CA, 54--65.
[12]
Brodal, G. S. and Fagerberg, R. 2002. Cache oblivious distribution sweeping. In Proceedings of the 29th International Colloquium on Automata, Languages and Programming (ICALP '02). Springer-Verlag, Berlin, Germany, 426--438.
[13]
Brodal, G. S., Fagerberg, R., and Jacob, R. 2002. Cache oblivious search trees via binary trees of small height. In Proceedings of the 13th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA '02). Society for Industrial and Applied Mathematics, Philadelphia, PA, 39--48.
[14]
Brodal, G. S., Fagerberg, R., and Vinther, K. 2004. Engineering a cache-oblivious sorting algorithm. In ALENEX/ANALC. 4--17.
[15]
Chen, S., Ailamaki, A., Gibbons, P. B., and Mowry, T. C. 2004. Improving hash join performance through prefetching. In Proceedings of the 20th International Conference on Data Engineering (ICDE '04). IEEE Computer Society, 116.
[16]
Chilimbi, T. M., Hill, M. D., and Larus, J. R. 1999. Cache-conscious structure layout. In Proceedings of the ACM SIGPLAN Conference on Programming language Design and Implementation (PLDI '99). ACM Press, New York, NY, 1--12.
[17]
Chowdhury, R. A. and Ramachandran, V. 2007. The cache-oblivious gaussian elimination paradigm: Theoretical framework, parallelization and experimental evaluation. In Proceedings of the 19th Annual ACM Symposium on Parallel Algorithms and Architectures (SPAA '07). ACM Press, New York, NY, 71--80.
[18]
Comer, D. 1979. Ubiquitous B-tree. ACM Comput. Surv. 11, 2, 121--137.
[19]
Copeland, G. P. and Khoshafian, S. N. 1985. A decomposition storage model. In Proceedings of the 1985 ACM SIGMOD International Conference on Management of Data (SIGMOD '85). ACM Press, New York, NY, 268--279.
[20]
Demaine, E. D. 2002. Cache-oblivious algorithms and data structures. Lecture Notes from the EEF Summer School on Massive Data Sets, BRICS.
[21]
DeWitt, D. J., Katz, R. H., Olken, F., Shapiro, L. D., Stonebraker, M. R., and Wood, D. 1984. Implementation techniques for main memory database systems. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD'84). ACM Press, New York, NY, 1--8.
[22]
Ding, C. and Zhong, Y. 2003. Predicting whole-program locality through reuse distance analysis. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI '03). ACM Press, New York, NY, 245--257.
[23]
FastDB. 2002. http://www.ispras.ru/~knizhnik/fastdb.html.
[24]
Frigo, M., Leiserson, C. E., Prokop, H., and Ramachandran, S. 1999. Cache-oblivious algorithms. In Proceedings of the 40th Annual Symposium on Foundations of Computer Science (FOCS '99). IEEE Computer Society, 285.
[25]
Garcia, P. and Korth, H. F. 2006. Database hash-join algorithms on multithreaded computer architectures. In Proceedings of the 3rd Conference on Computing Frontiers (CF06). ACM, New York, NY, 241--252.
[26]
Garcia-Molina, H. and Salem, K. 1992. Main memory database systems: An overview. IEEE Trans. Knowl. Data Engin. 4, 6, 509--516.
[27]
Hankins, R. A. and Patel, J. M. 2003. Data morphing: An adaptive, cache-conscious storage technique. In VLDB. 417--428.
[28]
Hardavellas, N., Pandis, I., Johnson, R., Mancheril, N., Harizopoulos, S., Ailamaki, A., and Falsafi, B. 2007. Database servers on chip multiprocessors: Limitations and opportunities. In Proceedings of the 3rd International Conference on Innovative Data Systems Research (CIDR '07). Asilomar, CA.
[29]
Harizopoulos, S. and Ailamaki, A. 2006. Improving instruction cache performance in OLTP. ACM Trans. Datab. Syst. 31, 3, 887--920.
[30]
He, B., Li, Y., Luo, Q., and Yang, D. 2007. EaseDB: A cache-oblivious in-memory query processor. In Proceedings of the International SIGMOD Conference. 1064--1066.
[31]
He, B. and Luo, Q. 2006. Cache-oblivious nested-loop joins. In Proceedings of the ACM 15th Conference on Information and Knowledge Management (CIKM '06).
[32]
He, B. and Luo, Q. 2007. Cache-oblivious query processing. In Proceedings of the 3rd International Conference on Innovative Data Systems Research (CIDR '07). Asilomar, CA.
[33]
He, B., Luo, Q., and Choi, B. 2006. Cache-conscious automata for XML filtering. IEEE Trans. Knowl. Data Engin. 18, 12, 1629--1644.
[34]
He, B., Luo, Q., and Choi, B. 2007. Adaptive index utilization in memory-resident structural joins. IEEE Trans. Knowl. Data Engin. 19, 6, 772--788.
[35]
Hennessy, J. L. and Patterson, D. A. 2002. Computer Architecture: A Quantitative Approach. Morgan Kaufman Publishers.
[36]
Intel Corp. 2004. Intel(R) Itanium(R) 2 Processor Reference Manual for Software Development and Optimization.
[37]
Intel Corp. 2007. Intel vtune performance analyzer. http://www3.intel.com/cd/software/products/asmo-na/eng/239144.htm.
[38]
Kim, S., Chandra, D., and Solihin, Y. 2004. Fair cache sharing and partitioning in a chip multiprocessor architecture. In Proceedings of the 13th International Conference on Parallel Architectures and Compilation Techniques (PACT '04). IEEE Computer Society, 111--122.
[39]
Knuth, D. E. 1998. The Art of Computer Programming, 2nd ed. Addison-Wesley.
[40]
LaMarca, A. and Ladner, R. E. 1997. The influence of caches on the performance of sorting. In Proceedings of the 8th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA '97). Society for Industrial and Applied Mathematics, Philadelphia, PA, 370--379.
[41]
Lehman, T. J. and Carey, M. J. 1986a. Query processing in main memory database management systems. In Proceedings of the 1986 ACM SIGMOD International Conference on Management of Data (SIGMOD '86). ACM Press, New York. 239--250.
[42]
Lehman, T. J. and Carey, M. J. 1986b. A study of index structures for main memory database management systems. In Proceedings of the 12th International Conference on Very Large Data Bases (VLDB '86). Morgan Kaufmann Publishers Inc., San Francisco, CA, 294--303.
[43]
Manegold, S. 2004. The Calibrator (v0.9e), a cache-memory and TLB calibration tool. http://www. cwi.nl/~manegold/Calibrator/.
[44]
Manegold, S., Boncz, P., and Kersten, M. 2002. Generic database cost models for hierarchical memory systems. In Proceedings of the 28th International Conference on Very Large Data Bases (VLDB '02).
[45]
Marr, D. T., Binns, F., Hill, D. L., Hinton, G., Koufaty, D. A., Miller, J. A., and Upton, M. 2002. Hyper-Threading Technology Architecture and Microarchitecture. Intel Technol. J. 6, 1.
[46]
MonetDB. 2006. http://monetdb.cwi.nl/.
[47]
Ramakrishnan, R. and Gehrke, J. 2003. Database Management Systems, 3rd ed. McGraw-Hill.
[48]
Rao, J. and Ross, K. A. 1999. Cache conscious indexing for decision-support in main memory. In Proceedings of the 25th International Conference on Very Large Data Bases (VLDB '99). Morgan Kaufmann Publishers Inc., San Francisco, CA, 78--89.
[49]
Rao, J. and Ross, K. A. 2000. Making B+-trees cache conscious in main memory. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD '00). ACM Press, New York, 475--486.
[50]
Samuel, M., Pedersen, A. U., and Bonnet, P. 2005. Making CSB+-trees processor conscious. In Proceedings of the 1st International Workshop on Data Management on New Hardware (DAMON '05). ACM Press, New York, NY, 1.
[51]
Selinger, P. G., Astrahan, M. M., Chamberlin, D. D., Lorie, R. A., and Price, T. G. 1979. Access path selection in a relational database management system. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD '79). ACM Press, New York, NY, 23--34.
[52]
Shao, M., Ailamaki, A., and Falsafi, B. 2005. DBmbench: Fast and accurate database workload representation on modern microarchitecture. In Proceedings of the Conference of the Centre for Advanced Studies on Collaborative Research (CASCON '05). IBM Press, 254--267.
[53]
Shatdal, A., Kant, C., and Naughton, J. F. 1994. Cache conscious algorithms for relational query processing. In Proceedings of the 20th International Conference on Very Large Data Bases (VLDB '94). Morgan Kaufmann Publishers Inc., San Francisco, CA, 510--521.
[54]
Stonebraker, M., Abadi, D. J., Batkin, A., Chen, X., Cherniack, M., Ferreira, M., Lau, E., Lin, A., Madden, S., O'Neil, E., O'Neil, P., Rasin, A., Tran, N., and Zdonik, S. 2005. C-store: A column-oriented dbms. In Proceedings of the 31st International Conference on Very large Data Bases (VLDB '05). VLDB Endowment, 553--564.
[55]
Sun Corp. 1997. UltraSPARC (R) III Cu Users Manual.
[56]
TimesTen. 2006. http://www.oracle.com/timesten/index.html.
[57]
TPC. 2004. http://www.tpc.org/.
[58]
van Emde Boas, P., Kaas, R., and Zijlstra, E. 1977. Design and implementation of an efficient priority queue. Math. Syst. Theory 10, 99--127.
[59]
Yoon, S.-E. and Lindstrom, P. 2006. Mesh layouts for block-based caches. IEEE Trans. Visual. Comput. Graph. 12, 5, 1213--1220.
[60]
Zhou, J., Cieslewicz, J., Ross, K. A., and Shah, M. 2005. Improving database performance on simultaneous multithreading processors. In Proceedings of the 31st International Conference on Very large Data Bases (VLDB '05). VLDB Endowment, 49--60.
[61]
Zhou, J. and Ross, K. A. 2003. Buffering accesses to memory-resident index structures. In Proceedings of the 29th International Conference on Very Large Data Bases (VLDB '03). 405--416.

Cited By

View all
  • (2023)JG2Time: A Learned Time Estimator for Join Operators Based on Heterogeneous Join-GraphsDatabase Systems for Advanced Applications10.1007/978-3-031-30637-2_9(132-147)Online publication date: 17-Apr-2023
  • (2021)Cache-Efficient Fork-Processing Patterns on Large GraphsProceedings of the 2021 International Conference on Management of Data10.1145/3448016.3457253(1208-1221)Online publication date: 9-Jun-2021
  • (2020)Revisiting hash join on graphics processors: a decade laterDistributed and Parallel Databases10.1007/s10619-019-07280-z38:4(771-793)Online publication date: 8-Jan-2020
  • Show More Cited By

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Transactions on Database Systems
ACM Transactions on Database Systems  Volume 33, Issue 2
June 2008
309 pages
ISSN:0362-5915
EISSN:1557-4644
DOI:10.1145/1366102
Issue’s Table of Contents
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 24 June 2008
Accepted: 01 January 2008
Revised: 01 September 2007
Received: 01 February 2007
Published in TODS Volume 33, Issue 2

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. Cache-oblivious
  2. cache-conscious
  3. chip multiprocessors
  4. data caches
  5. simultaneous multithreading

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)10
  • Downloads (Last 6 weeks)1
Reflects downloads up to 18 Feb 2025

Other Metrics

Citations

Cited By

View all
  • (2023)JG2Time: A Learned Time Estimator for Join Operators Based on Heterogeneous Join-GraphsDatabase Systems for Advanced Applications10.1007/978-3-031-30637-2_9(132-147)Online publication date: 17-Apr-2023
  • (2021)Cache-Efficient Fork-Processing Patterns on Large GraphsProceedings of the 2021 International Conference on Management of Data10.1145/3448016.3457253(1208-1221)Online publication date: 9-Jun-2021
  • (2020)Revisiting hash join on graphics processors: a decade laterDistributed and Parallel Databases10.1007/s10619-019-07280-z38:4(771-793)Online publication date: 8-Jan-2020
  • (2016)Efficient Query Processing on Many-core ArchitecturesProceedings of the 2016 International Conference on Management of Data10.1145/2882903.2899407(2081-2084)Online publication date: 26-Jun-2016
  • (2016)Rank-Aware Dynamic Migrations and Adaptive Demotions for DRAM Power ManagementIEEE Transactions on Computers10.1109/TC.2015.240984765:1(187-202)Online publication date: 1-Jan-2016
  • (2016)An interval join optimized for modern hardware2016 IEEE 32nd International Conference on Data Engineering (ICDE)10.1109/ICDE.2016.7498316(1098-1109)Online publication date: May-2016
  • (2015)Main-Memory Hash Joins on Modern Processor ArchitecturesIEEE Transactions on Knowledge and Data Engineering10.1109/TKDE.2014.231387427:7(1754-1766)Online publication date: 1-Jul-2015
  • (2015)Synergy of Dynamic Frequency Scaling and Demotion on DRAM Power Management: Models and OptimizationsIEEE Transactions on Computers10.1109/TC.2014.236053464:8(2367-2381)Online publication date: 1-Aug-2015
  • (2015)Cache-Sensitive Memory Layout for Dynamic Binary TreesThe Computer Journal10.1093/comjnl/bxv09059:5(630-649)Online publication date: 4-Nov-2015
  • (2014)Database Techniques for Multi Cores and Big MemoryEncyclopedia of Business Analytics and Optimization10.4018/978-1-4666-5202-6.ch062(667-676)Online publication date: 2014
  • Show More Cited By

View Options

Login options

Full Access

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media