ABSTRACT
The skip list is an elegant dictionary data structure that is commonly deployed in RAM. A skip list with N elements supports searches, inserts, and deletes in O(log N) operations with high probability (w.h.p.) and range queries returning K elements in O(log N + K) operations w.h.p.
A seemingly natural way to generalize the skip list to external memory with block size B is to "promote" with probability 1/B, rather than 1/2. However, there are practical and theoretical obstacles to getting the skip list to retain its efficient performance, space bounds, and high-probability guarantees.
We give an external-memory skip list that achieves write-optimized bounds. That is, for 0 < ε < 1, range queries take O(logBε N + K/B) I/Os w.h.p. and insertions and deletions take O((logBε N) / B1-ε) amortized I/Os w.h.p.
Our write-optimized skip list inherits the virtue of simplicity from RAM skip lists. Moreover, it matches or beats the asymptotic bounds of prior write-optimized data structures such as the Bε & tree or LSM trees, which are deployed in high-performance databases and file systems.
The main technical challenge in proving our bounds comes from the fact that there are so few levels in the skip list, an aspect of the data structure that is essential to getting strong external-memory bounds. We use extremal-graph coloring to show that it is possible to decompose paths in the skip list into uncorrelated groups, regardless of the insertion/deletion pattern. Thus, we achieve our bounds by averaging over these uncorrelated paths rather than by averaging over uncorrelated levels, as in the standard skip list.
- I. Abraham, J. Aspnes, and J. Yuan. Skip B-trees. In Proc. 9th Annual International Conference on Principles of Distributed Systems (OPODIS), page 366, 2006. Google ScholarDigital Library
- A. Aggarwal and J. S. Vitter. The input/output complexity of sorting and related problems. Communications of the ACM, 31(9):1116--1127, Sept. 1988. Google ScholarDigital Library
- A. Anagnostopoulos, M. Goodrich, and R. Tamassia. Persistent authenticated dictionaries and their applications. Information Security, pages 379--393, 2001. Google ScholarDigital Library
- L. Arge. The buffer tree: A technique for designing batched external data structures. Algorithmica, 37(1):1--24, 2003.Google ScholarDigital Library
- L. Arge, D. Eppstein, and M. T. Goodrich. Skip-webs: efficient distributed data structures for multi-dimensional data sets. In Proc. 24th Annual ACM Symposium on Principles of Distributed Computing (PODC), pages 69--76, 2005. Google ScholarDigital Library
- J. Aspnes and G. Shah. Skip graphs. ACM Transactions on Algorithms, 3(4):37, 2007. Google ScholarDigital Library
- R. Bayer and E. M. McCreight. Organization and maintenance of large ordered indexes. Acta Informatica, 1:173--189, 1972. Google ScholarDigital Library
- M. A. Bender, J. W. Berry, R. Johnson, T. M. Kroeger, S. McCauley, C. A. Phillips, B. Simon, S. Singh, and D. Zage. Anti-persistence on persistent storage: History-independent sparse tables and dictionaries. In Proc. 35th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems (PODS), pages 289--302, 2016. Google ScholarDigital Library
- M. A. Bender, M. Farach-Colton, J. T. Fineman, Y. R. Fogel, B. C. Kuszmaul, and J. Nelson. Cache-oblivious streaming B-trees. In Proc. 19th Annual ACM Symposium on Parallelism in Algorithms and Architectures (SPAA), pages 81--92, 2007. Google ScholarDigital Library
- M. A. Bender, M. Farach-Colton, W. Jannen, R. Johnson, B. C. Kuszmaul, D. E. Porter, J. Yuan, and Y. Zhan. An introduction to Bε-trees and write-optimization. :login; magazine, 40(5):22--28, October 2015.Google Scholar
- M. A. Bender, M. Farach-Colton, R. Johnson, R. Kraner, B. C. Kuszmaul, D. Medjedovic, P. Montes, P. Shetty, R. P. Spillane, and E. Zadok. Don't thrash: How to cache your hash on flash. Proc. of the VLDB Endowment (PVLDB), 5(11):1627--1637, 2012. Google ScholarDigital Library
- M. A. Bender, J. T. Fineman, S. Gilbert, and B. C. Kuszmaul. Concurrent cache-oblivious B-trees. In Proc. 17th Annual ACM Symposium on Parallelism in Algorithms and Architectures (SPAA), pages 228--237, 2005. Google ScholarDigital Library
- G. S. Brodal, E. D. Demaine, J. T. Fineman, J. Iacono, S. Langerman, and J. I. Munro. Cache-oblivious dynamic dictionaries with update/query tradeoffs. In Proc.\ 21st Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1448--1456, 2010. Google ScholarDigital Library
- G. S. Brodal and R. Fagerberg. Lower bounds for external memory dictionaries. In Proc. 14th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 546--554, 2003. Google ScholarDigital Library
- A. L. Buchsbaum, M. Goldwasser, S. Venkatasubramanian, and J. R. Westbrook. On external memory graph traversal. In Proc. 11th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 859--860, 2000. Google ScholarDigital Library
- P. Callahan, M. T. Goodrich, and K. Ramaiyer. Topology B-trees and their applications. In Proc. 4th International Workshop on Algorithms and Data Structures (WADS), pages 381--392, 1995. Google ScholarDigital Library
- V. Ciriani, P. Ferragina, F. Luccio, and S. Muthukrishnan. Static optimality theorem for external memory string access. In Proc. 43rd Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 219--227, 2002. Google ScholarDigital Library
- L. Devroye. A limit theory for random skip lists. The Annals of Applied Probability, pages 597--609, 1992.Google ScholarCross Ref
- M. Fomitchev and E. Ruppert. Lock-free linked lists and skip lists. In Proc. 23rd Annual ACM Symposium on Principles of Distributed Computing (PODC), pages 50--59, 2004. Google ScholarDigital Library
- D. Golovin. The B-skip-list: A simpler uniquely represented alternative to B-trees. arXiv preprint arXiv:1005.0662, 2010.Google Scholar
- M. T. Goodrich and R. Tamassia. Efficient authenticated dictionaries with skip lists and commutative hashing. US Patent App, 10(416,015), 2000.Google Scholar
- G. Graefe. Write-optimized B-trees. In Proc. 30th International Conference on Very Large Data Bases (VLDB), pages 672--683, 2004. Google ScholarDigital Library
- G. Graefe. B-tree indexes for high update rates. SIGMOD Rec., 35(1):39--44, Mar. 2006. Google ScholarDigital Library
- M. Herlihy, Y. Lev, V. Luchangco, and N. Shavit. A provably correct scalable concurrent skip list. In Proc. Conference On Principles of Distributed Systems (OPODIS), 2006.Google Scholar
- M. Herlihy, Y. Lev, V. Luchangco, and N. Shavit. A simple optimistic skiplist algorithm. Proc. 14th Annual Colloquium on Structural Information and Communication Complexity (SIROCCO), page 124, 2007. Google ScholarDigital Library
- R. Jacob, A. Richa, C. Scheideler, S. Schmid, and H. Täubig. A distributed polylogarithmic time algorithm for self-stabilizing skip graphs. In Proc. 28th ACM Symposium on Principles of Distributed Computing (PODC), pages 131--140, 2009. Google ScholarDigital Library
- C. Jermaine, A. Datta, and E. Omiecinski. A novel index supporting high volume data warehouse insertion. In M. P. Atkinson, M. E. Orlowska, P. Valduriez, S. B. Zdonik, and M. L. Brodie, editors, Proc. 25th International Conference on Very Large Data Bases (VLDB), pages 235--246, 1999. Google ScholarDigital Library
- P. Kirschenhofer and H. Prodinger. The path length of random skip lists. Acta Informatica, 31(8):775--792, 1994. Google ScholarDigital Library
- G. Korland, N. Shavit, and P. Felber. Noninvasive concurrency with Java S™. In Third Workshop on Programmability Issues for Multi-Core Computers (MULTIPROG), 2010.Google Scholar
- Y. Li, B. He, Q. Luo, and K. Yi. Tree indexing on flash disks. In Proc. 2009 IEEE International Conference on Data Engineering (ICDE), pages 1303--1306, 2009. Google ScholarDigital Library
- P. A. Martin. Method and apparatus for implementing a lock-free skip list that supports concurrent accesses, Dec. 11 2007. US Patent 7,308,448.Google Scholar
- M. M. Michael. High performance dynamic lock-free hash tables and list-based sets. In Proc. 14th Annual ACM Symposium on Parallel Algorithms and Architectures (SPAA), pages 73--82, 2002. Google ScholarDigital Library
- J. Nievergelt and E. M. Reingold. Binary search trees of bounded balance. SIAM Journal on Computing, 2(1):33--43, 1973.Google ScholarDigital Library
- P. O'Neil, E. Cheng, D. Gawlic, and E. O'Neil. The log-structured merge-tree (LSM-tree). Acta Informatica, 33(4):351--385, 1996. Google ScholarDigital Library
- R. Oshman and N. Shavit. The SkipTrie: low-depth concurrent search without rebalancing. In Proc. 32nd Annual ACM Symposium on Principles of Distributed Computing (PODC), pages 23--32, 2013. Google ScholarDigital Library
- T. Papadakis, J. I. Munro, and P. V. Poblete. Analysis of the expected search cost in skip lists. In Proc. 2nd Scandinavian Workshop on Algorithm Theory (SWAT), pages 160--172, 1990. Google ScholarDigital Library
- T. Peierls, B. Goetz, J. Bloch, J. Bowbeer, D. Lea, and D. Holmes. ConcurrentSkipList. In Java concurrency in practice. Pearson Education, 2006.Google Scholar
- W. Pugh. Concurrent maintenance of lists. Technical report, Dept. of Computer Science, University of Maryland, College Park, 1990. Google ScholarDigital Library
- W. Pugh. Skip lists: a probabilistic alternative to balanced trees. Communications of the ACM, 33(6):668--676, 1990. Google ScholarDigital Library
- R. Sears, M. Callaghan, and E. A. Brewer. Rose: compressed, log-structured replication. Proc. of the VLDB Endowment, 1(1):526--537, 2008. Google ScholarDigital Library
- R. Sears and R. Ramakrishnan. bLSM: a general purpose log structured merge tree. In Proc. ACM SIGMOD International Conference on Management of Data (SIGMOD), pages 217--228, 2012. Google ScholarDigital Library
- N. Shamgunov. The MemSQL in-memory database system. In Proc. 2nd International Workshop on In Memory Data Management and Analytics (IMDM), 2014.Google Scholar
- N. Shavit and I. Lotan. Skiplist-based concurrent priority queues. In Proc. 14th International Parallel and Distributed Processing Symposium (IPDPS), pages 263--268, 2000. Google ScholarDigital Library
- D. D. Sleator and R. E. Tarjan. Amortized efficiency of list update and paging rules. Communications of the ACM, 28(2):202--208, Feb. 1985. Google ScholarDigital Library
- G. L. Steele Jr, A. T. Garthwaite, P. A. Martin, N. N. Shavit, M. S. Moir, and D. L. Detlefs. Lock-free implementation of concurrent shared object with dynamic node allocation and distinguishing pointer value, Nov. 30 2004. US Patent 6,826,757.Google Scholar
- H. Sundell and P. Tsigas. Scalable and lock-free concurrent dictionaries. In Proc. 2004 ACM Symposium on Applied Computing (SAC), pages 1438--1445, 2004. Google ScholarDigital Library
- J. D. Valois. Lock-free linked lists using compare-and-swap. In Proc. 14th Annual ACM Symposium on Principles of Distributed Computing (PODC), pages 214--222, 1995. Google ScholarDigital Library
- K. Yi. Dynamic indexability and the optimality of b-trees. Journal of the ACM, 59(4):21:1--21:19, Aug. 2012. Google ScholarDigital Library
Index Terms
- Write-Optimized Skip Lists
Recommendations
Anti-Persistence on Persistent Storage: History-Independent Sparse Tables and Dictionaries
PODS '16: Proceedings of the 35th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database SystemsWe present history-independent alternatives to a B-tree, the primary indexing data structure used in databases. A data structure is history independent (HI) if it is impossible to deduce any information by examining the bit representation of the data ...
Lock-free linked lists and skip lists
PODC '04: Proceedings of the twenty-third annual ACM symposium on Principles of distributed computingLock-free shared data structures implement distributed objects without the use of mutual exclusion, thus providing robustness and reliability. We present a new lock-free implementation of singly-linked lists. We prove that the worst-case amortized cost ...
Skip lift: A probabilistic alternative to red-black trees
We present the Skip lift, a randomized dictionary data structure inspired by the skip list [Pugh@?90, Comm. of the ACM]. Similar to the skip list, the skip lift has the finger search property: given a pointer to an arbitrary element f, searching for an ...
Comments