Abstract
LSM-tree has been widely used in data management production systems for write-intensive workloads. However, as read and write workloads co-exist under LSM-tree, data accesses can experience long latency and low throughput due to the interferences to buffer caching from the compaction, a major and frequent operation in LSM-tree. After a compaction, the existing data blocks are reorganized and written to other locations on disks. As a result, the related data blocks that have been loaded in the buffer cache are invalidated since their referencing addresses are changed, causing serious performance degradations.
To re-enable high-speed buffer caching during intensive writes, we propose Log-Structured buffered-Merge tree (simplified as LSbM-tree) by adding a compaction buffer on disks to minimize the cache invalidations on buffer cache caused by compactions. The compaction buffer efficiently and adaptively maintains the frequently visited datasets. In LSbM, strong locality objects can be effectively kept in the buffer cache with minimum or no harmful invalidations. With the help of a small on-disk compaction buffer, LSbM achieves a high query performance by enabling effective buffer caching, while retaining all the merits of LSM-tree for write-intensive data processing and providing high bandwidth of disks for range queries. We have implemented LSbM based on LevelDB. We show that with a standard buffer cache and a hard disk, LSbM can achieve 2x performance improvement over LevelDB. We have also compared LSbM with other existing solutions to show its strong cache effectiveness.
- Muhammad Yousuf Ahmad and Bettina Kemme. 2015. Compaction management in distributed key-value datastores. Proc. VLDB Endow. 8, 8 (2015), 850--861. Google ScholarDigital Library
- Apache. 2017. Cassandra. Retrieved April 6, 2017 from http://cassandra.apache.org/.Google Scholar
- Apache. 2017. HBASE. Retrieved April 6, 2017 from https://hbase.apache.org/.Google Scholar
- Berk Atikoglu, Yuehai Xu, Eitan Frachtenberg, Song Jiang, and Mike Paleczny. 2012. Workload analysis of a large-scale key-value store. In ACM SIGMETRICS Performance Evaluation Review, Vol. 40. ACM, 53--64. Google ScholarDigital Library
- Basho. 2017. Riak. Retrieved April 6, 2017from http://basho.com/riak.Google Scholar
- Andrei Broder and Michael Mitzenmacher. 2004. Network applications of bloom filters: A survey. Internet Math. 1, 4 (2004), 485--509.Google ScholarCross Ref
- Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, and Robert E. Gruber. 2008. Bigtable: A distributed storage system for structured data. ACM Trans. Comput. Syst. 26, 2 (2008), 4. Google ScholarDigital Library
- Feng Chen, Rubao Lee, and Xiaodong Zhang. 2011. Essential roles of exploiting internal parallelism of flash memory based solid state drives in high-speed data processing. In Proceedings of the 2011 IEEE 17th International Symposium on High Performance Computer Architecture (HPCA’11). IEEE, 266--277. Google ScholarDigital Library
- Brian F. Cooper, Adam Silberstein, Erwin Tam, Raghu Ramakrishnan, and Russell Sears. 2010. Benchmarking cloud serving systems with YCSB. In Proceedings of the 1st ACM Symposium on Cloud Computing. ACM, 143--154. Google ScholarDigital Library
- Facebook. 2017. RocksDB. Retrieved April 6, 2017 from http://rocksdb.org/.Google Scholar
- Guy Golan-Gueta, Edward Bortnikov, Eshcar Hillel, and Idit Keidar. 2015. Scaling concurrent log-structured data stores. In Proceedings of the 10th European Conference on Computer Systems. ACM, 32. Google ScholarDigital Library
- Google. 2017. LevelDB. Retrieved April 6, 2017 from http://code.google.com/p/leveldb.Google Scholar
- Lei Guo, Dejun Teng, Rubao Lee, Feng Chen, Siyuan Ma, and Xiaodong Zhang. 2016. Re-enabling high-speed caching for LSM-trees. arXiv preprint arXiv:1606.02015 (2016).Google Scholar
- H. V. Jagadish, P. P. S. Narayan, Sridhar Seshadri, S. Sudarshan, and Rama Kanneganti. 1997. Incremental organization for data recording and warehousing. In VLDB, Vol. 97. Citeseer, 16--25. Google ScholarDigital Library
- Yinan Li, Bingsheng He, Robin Jun Yang, Qiong Luo, and Ke Yi. 2010. Tree indexing on solid state drives. Proc. VLDB Endow. 3, 1-2 (2010), 1195--1206. Google ScholarDigital Library
- Hyeontaek Lim, David G. Andersen, and Michael Kaminsky. 2016. Towards accurate and fast evaluation of multi-stage log-structured designs. In Proceedings of the 14th USENIX Conference on File and Storage Technologies (FAST’16). USENIX Association, 149--166. Google ScholarDigital Library
- Lanyue Lu, Thanumalayan Sankaranarayana Pillai, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau. 2016. WiscKey: Separating keys from values in SSD-conscious storage. In Proceedings of the 14th USENIX Conference on File and Storage Technologies (FAST’16). 133--148. Google ScholarDigital Library
- Ryan Mcguire. 2014. Compaction Improvements in Cassandra 2.1. April 6, 2014 from http://www.datastax.com/dev/blog/compaction-improvements-in-cassandra-21.Google Scholar
- Patrick ONeil, Edward Cheng, Dieter Gawlick, and Elizabeth ONeil. 1996. The log-structured merge-tree (LSM-tree). Acta Inform. 33, 4 (1996), 351--385. Google ScholarDigital Library
- Russell Sears and Raghu Ramakrishnan. 2012. bLSM: A general purpose log structured merge tree. In Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data. ACM, 217--228. Google ScholarDigital Library
- Pradeep J. Shetty, Richard P. Spillane, Ravikant R. Malpani, Binesh Andrews, Justin Seyster, and Erez Zadok. 2013. Building workload-independent storage with VT-trees. In Presented as part of the Proceedings of the 11th USENIX Conference on File and Storage Technologies (FAST’13). 17--30. Google ScholarDigital Library
- Dejun Teng, Lei Guo, Rubao Lee, Feng Chen, Siyuan Ma, Yanfeng Zhang, and Xiaodong Zhang. 2017. LSbM-tree: Re-enabling buffer caching in data management for mixed reads and writes. In Proceedings of the 2017 IEEE 37th International Conference on Distributed Computing Systems (ICDCS’17). IEEE, 68--79.Google ScholarCross Ref
- Peng Wang, Guangyu Sun, Song Jiang, Jian Ouyang, Shiding Lin, Chen Zhang, and Jason Cong. 2014. An efficient design and implementation of LSM-tree based key-value store on open-channel SSD. In Proceedings of the 9th European Conference on Computer Systems. ACM, 16. Google ScholarDigital Library
- Xingbo Wu, Yuehai Xu, Zili Shao, and Song Jiang. 2015. LSM-trie: An LSM-tree-based ultra-large key-value store for small data. In Proceedings of the 2015 USENIX Conference on Usenix Annual Technical Conference. USENIX Association, 71--82. Google ScholarDigital Library
- Kai Zhang, Kaibo Wang, Yuan Yuan, Lei Guo, Rubao Lee, and Xiaodong Zhang. 2015. Mega-kv: A case for GPUS to maximize the throughput of in-memory key-value stores. Proc. VLDB Endow. 8, 11 (2015), 1226--1237. Google ScholarDigital Library
Index Terms
- A Low-cost Disk Solution Enabling LSM-tree to Achieve High Performance for Mixed Read/Write Workloads
Recommendations
FlatLSM: Write-Optimized LSM-Tree for PM-Based KV Stores
The Log-Structured Merge Tree (LSM-Tree) is widely used in key-value (KV) stores because of its excwrite performance. But LSM-Tree-based KV stores still have the overhead of write-ahead log and write stall caused by slow L0 flush and L0-L1 compaction. New ...
LSM-tree managed storage for large-scale key-value store
SoCC '17: Proceedings of the 2017 Symposium on Cloud ComputingKey-value stores are increasingly adopting LSM-trees as their enabling data structure in the backend storage, and persisting their clustered data through a file system. A file system is expected to not only provide file/directory abstraction to organize ...
Towards Read-Intensive Key-Value Stores with Tidal Structure Based on LSM-Tree
ASPDAC '20: Proceedings of the 25th Asia and South Pacific Design Automation ConferenceKey-value store has played a critical role in many large-scale data storage applications. The log-structured merge-tree (LSM-tree) based key-value store achieves excellent performance on write-intensive workloads which is mainly benefited from the ...
Comments