research-article

KiWi: A Key-Value Map for Scalable Real-Time Analytics

Authors:
Dmitry Basin

Yahoo Research, Haifa, Israel

Yahoo Research, Haifa, Israel
View Profile

,
Edward Bortnikov

Yahoo Research, Haifa, Israel

Yahoo Research, Haifa, Israel
View Profile

,
Anastasia Braginsky

Yahoo Research, Haifa, Israel

Yahoo Research, Haifa, Israel
View Profile

,
Guy Golan-Gueta

VMWare Research, Herzliya, Israel

VMWare Research, Herzliya, Israel
View Profile

,
Eshcar Hillel

Yahoo Research, Haifa, Israel

Yahoo Research, Haifa, Israel
View Profile

,
Idit Keidar

Technion and Yahoo Research, Haifa, Israel

Technion and Yahoo Research, Haifa, Israel
View Profile

,
Moshe Sulamy

Tel Aviv University, Tel Aviv, Israel

Tel Aviv University, Tel Aviv, Israel
View Profile

Authors Info & Claims

ACM SIGPLAN Notices Volume 52 Issue 8August 2017pp 357–369https://doi.org/10.1145/3155284.3018761

Published:26 January 2017Publication History

ACM SIGPLAN Notices

Abstract

Modern big data processing platforms employ huge in-memory key-value (KV) maps. Their applications simultaneously drive high-rate data ingestion and large-scale analytics. These two scenarios expect KV-map implementations that scale well with both real-time updates and large atomic scans triggered by range queries.

We present KiWi, the first atomic KV-map to efficiently support simultaneous large scans and real-time access. The key to achieving this is treating scans as first class citizens,and organizing the data structure around them. KiWi provides wait-free scans, whereas its put operations are lightweight and lock-free. It optimizes memory management jointly with data structure access.We implement KiWi and compare it to state-of-the-art solutions. Compared to other KV-maps providing atomic scans, KiWi performs either long scans or concurrent puts an order of magnitude faster. Its scans are twice as fast as non-atomic ones implemented via iterators in the Java skiplist.

References

Apache HBase -- a Distributed Hadoop Database. https://hbase.apache.org/.Google Scholar
Java Array Copy. https://docs.oracle.com/javase/7/docs/api/java/lang/System.html, .Google Scholar
Java Concurrent Skip List. https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ConcurrentSkipListMap.html, .Google Scholar
http://flurrymobile.tumblr.com/post/144245637325/appmatrix.Google Scholar
Flurry analytics. https://developer.yahoo.com/flurry/docs/analytics/.Google Scholar
http://flurrymobile.tumblr.com/post/117769261810/the-phablet-revolution.Google Scholar
A persistent key-value store for fast storage environments. http://rocksdb.org/, June 2014.Google Scholar
A fast and lightweight key/value database library by google. http://code.google.com/p/leveldb, Jan. 2014.Google Scholar
M. Arbel, G. Golan-Gueta, E. Hillel, and I. Keidar. Towards automatic lock removal for scalable synchronization. In DISC, pages 170--184, 2015. Google ScholarDigital Library
P. A. Bernstein, V. Hadzilacos, and N. Goodman. Concurrency Control and Recovery in Database Systems. AddisonWesley, 1987. ISBN 0--201--10715--5.Google ScholarDigital Library
A. Braginsky and E. Petrank. Locality-conscious lock-free linked lists. In ICDCN, pages 107--118, 2011. Google ScholarCross Ref
A. Braginsky and E. Petrank. A lock-free B+tree. In SPAA, pages 58--67, 2012.Google ScholarDigital Library
A. Braginsky, E. Petrank, and N. Cohen. CBPQ: High performance lock-free priority queue. In Euro-Par, 2016.Google ScholarDigital Library
N. G. Bronson, J. Casper, H. Chafi, and K. Olukotun. A practical concurrent binary search tree. In PPOPP, pages 257--268, 2010. Google ScholarDigital Library
T. Brown and H. Avni. Range queries in non-blocking k-ary search trees. In OPODIS, pages 31--45, 2012. Google ScholarCross Ref
T. Brown and J. Helga. Non-blocking k-ary search trees. In OPODIS, pages 207--221, 2011. Google ScholarDigital Library
F. Chang, J. Dean, S. Ghemawat, W. C. Hsieh, D. A. Wallach, M. Burrows, T. Chandra, A. Fikes, and R. E. Gruber. Bigtable: A distributed storage system for structured data. ACM Trans. Comput. Syst., 26(2):4:1--4:26, June 2008. ISSN 0734--2071.Google ScholarDigital Library
B. Chatterjee. Lock-free linearizable 1-dimensional range queries. In WTTM, 2016.Google Scholar
K. Fraser. Practical lock-freedom. In PhD dissertation, University of Cambridge, 2004.Google Scholar
G. Golan-Gueta, E. Bortnikov, E. Hillel, and I. Keidar. Scaling concurrent log-structured data stores. In EuroSys, pages 32:1-- 32:14, 2015. Google ScholarDigital Library
V. Gramoli. More than you ever wanted to know about synchronization: Synchrobench, measuring the impact of the synchronization on concurrent algorithms. In PPoPP, 2015.Google ScholarDigital Library
D. Hendler, N. Shavit, and L. Yerushalmi. A scalable lockfree stack algorithm. In SPAA, pages 206--215, 2004.Google Scholar
M. Herlihy and N. Shavit. The Art of Multiprocessor Programming. Morgan Kaufmann Publishers Inc., 2008.Google ScholarDigital Library
M. Herlihy and J. M. Wing. Linearizability: A correctness condition for concurrent objects. ACM Trans. Program. Lang. Syst., 12(3):463--492, 1990. Google ScholarDigital Library
M. Herlihy, Y. Lev, V. Luchangco, and N. Shavit. A simple optimistic skiplist algorithm. In SIROCCO, pages 124--138, 2007. Google ScholarCross Ref
I. Keidar and D. Perelman. Multi-versioning in transactional memory. In Transactional Memory; Foundations, Algorithms, Tools, and Applications, volume 8913, chapter 7, pages 150--165. 2015. Google ScholarCross Ref
A. Kogan and E. Petrank. A methodology for creating fast wait-free data structures. In PPoPP, pages 141--150, 2012. Google ScholarDigital Library
D. B. Lomet, S. Sengupta, and J. J. Levandoski. The bw-tree: A b-tree for new hardware platforms. In ICDE, pages 302-- 313, 2013.Google Scholar
Y. Mao, E. Kohler, and R. T. Morris. Cache craftiness for fast multicore key-value storage. In Proceedings of the 7th ACM European Conference on Computer Systems, EuroSys '12, pages 183--196, 2012. Google ScholarDigital Library
A. Natarajan and N. Mittal. Fast concurrent lock-free binary search trees. In PPoPP, pages 317--328, 2014. Google ScholarDigital Library
E. Petrank and S. Timnat. Lock-free data-structure iterators. In DISC, pages 224--238, 2013. Google ScholarDigital Library
A. Prokopec, N. G. Bronson, P. Bagwell, and M. Odersky. Concurrent tries with efficient non-blocking snapshots. In PPoPP, pages 151--160, 2012. Google ScholarDigital Library
J. Shute, R. Vingralek, B. Samwel, B. Handy, C. Whipkey, E. Rollins, M. Oancea, K. Littlefield, D. Menestrina, S. Ellner, J. Cieslewicz, I. Rae, T. Stancescu, and H. Apte. F1: A distributed sql database that scales. In VLDB, 2013. Google ScholarDigital Library
B. Sowell, W. Golab, and M. A. Shah. Minuet: A scalable distributed multiversion b-tree. Proc. VLDB Endow., 5(9): 884--895, May 2012. Google ScholarDigital Library
A. Spiegelman, G. Golan-Gueta, and I. Keidar. Transactional data structure libraries. In PLDI, pages 682--696, 2016. Google ScholarDigital Library
J. L. Welch and H. Attiya. Distributed Computing: Fundamentals, Simulations and Advanced Topics (2nd edition). John Wiley Interscience, 2004.Google Scholar

Index Terms

KiWi: A Key-Value Map for Scalable Real-Time Analytics
1. Theory of computation
  1. Design and analysis of algorithms
    1. Concurrent algorithms
  2. Theory and algorithms for application domains
    1. Database theory
      1. Data structures and algorithms for data management

Recommendations

KiWi: A Key-Value Map for Scalable Real-Time Analytics
PPoPP '17: Proceedings of the 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming

Modern big data processing platforms employ huge in-memory key-value (KV) maps. Their applications simultaneously drive high-rate data ingestion and large-scale analytics. These two scenarios expect KV-map implementations that scale well with both real-...
Read More
KiWi: A Key-value Map for Scalable Real-time Analytics
Special Issue on PPoPP 2017 (Part 2) and Regular Papers

We present KiWi, the first atomic KV-map to efficiently support simultaneous large scans and real-time access. The key to achieving this is treating scans as first class citizens and organizing the data structure around them. KiWi provides wait-free ...
Read More
Flying KIWI: Design of Approximate Query Processing Engine for Interactive Data Analytics at Scale
BigDAS '15: Proceedings of the 2015 International Conference on Big Data Applications and Services

This paper introduces the design of hybrid SQL-on-Hadoop system, which supports dual-mode (interactive and deep) analytics. We present an architecture of approximate query processing engine using horizontal and vertical sampling of the original database ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

Published in
ACM SIGPLAN Notices Volume 52, Issue 8
PPoPP '17
August 2017
442 pages
ISSN:0362-1340
EISSN:1558-1160
DOI:10.1145/3155284
Editor:
Matthew Fluet
Issue’s Table of Contents
PPoPP '17: Proceedings of the 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
January 2017
476 pages
ISBN:9781450344937
DOI:10.1145/3018743
General Chair:
Vivek Sarkar
Rice University, USA
,
Program Chair:
Lawrence Rauchwerger
Texas A&M University, USA
Copyright © 2017 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 26 January 2017
Check for updates
Author Tags
lock-free key-value map
wait-free atomic scans
Qualifiers
- research-article
Conference
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 32
  Total Citations
  View Citations
- 435
  Total Downloads
- Downloads (Last 12 months)22
- Downloads (Last 6 weeks)5
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

KiWi: A Key-Value Map for Scalable Real-Time Analytics

ACM SIGPLAN Notices

Abstract

References

Cited By

Index Terms

Recommendations

KiWi: A Key-Value Map for Scalable Real-Time Analytics

KiWi: A Key-value Map for Scalable Real-time Analytics

Flying KIWI: Design of Approximate Query Processing Engine for Interactive Data Analytics at Scale