skip to main content
10.1145/3034786.3056117acmconferencesArticle/Chapter ViewAbstractPublication PagespodsConference Proceedingsconference-collections
research-article
Public Access

Write-Optimized Skip Lists

Published:09 May 2017Publication History

ABSTRACT

The skip list is an elegant dictionary data structure that is commonly deployed in RAM. A skip list with N elements supports searches, inserts, and deletes in O(log N) operations with high probability (w.h.p.) and range queries returning K elements in O(log N + K) operations w.h.p.

A seemingly natural way to generalize the skip list to external memory with block size B is to "promote" with probability 1/B, rather than 1/2. However, there are practical and theoretical obstacles to getting the skip list to retain its efficient performance, space bounds, and high-probability guarantees.

We give an external-memory skip list that achieves write-optimized bounds. That is, for 0 < ε < 1, range queries take O(logBε N + K/B) I/Os w.h.p. and insertions and deletions take O((logBε N) / B1-ε) amortized I/Os w.h.p.

Our write-optimized skip list inherits the virtue of simplicity from RAM skip lists. Moreover, it matches or beats the asymptotic bounds of prior write-optimized data structures such as the Bε & tree or LSM trees, which are deployed in high-performance databases and file systems.

The main technical challenge in proving our bounds comes from the fact that there are so few levels in the skip list, an aspect of the data structure that is essential to getting strong external-memory bounds. We use extremal-graph coloring to show that it is possible to decompose paths in the skip list into uncorrelated groups, regardless of the insertion/deletion pattern. Thus, we achieve our bounds by averaging over these uncorrelated paths rather than by averaging over uncorrelated levels, as in the standard skip list.

References

  1. I. Abraham, J. Aspnes, and J. Yuan. Skip B-trees. In Proc. 9th Annual International Conference on Principles of Distributed Systems (OPODIS), page 366, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. A. Aggarwal and J. S. Vitter. The input/output complexity of sorting and related problems. Communications of the ACM, 31(9):1116--1127, Sept. 1988. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. A. Anagnostopoulos, M. Goodrich, and R. Tamassia. Persistent authenticated dictionaries and their applications. Information Security, pages 379--393, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. L. Arge. The buffer tree: A technique for designing batched external data structures. Algorithmica, 37(1):1--24, 2003.Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. L. Arge, D. Eppstein, and M. T. Goodrich. Skip-webs: efficient distributed data structures for multi-dimensional data sets. In Proc. 24th Annual ACM Symposium on Principles of Distributed Computing (PODC), pages 69--76, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. J. Aspnes and G. Shah. Skip graphs. ACM Transactions on Algorithms, 3(4):37, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. R. Bayer and E. M. McCreight. Organization and maintenance of large ordered indexes. Acta Informatica, 1:173--189, 1972. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. M. A. Bender, J. W. Berry, R. Johnson, T. M. Kroeger, S. McCauley, C. A. Phillips, B. Simon, S. Singh, and D. Zage. Anti-persistence on persistent storage: History-independent sparse tables and dictionaries. In Proc. 35th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems (PODS), pages 289--302, 2016. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. M. A. Bender, M. Farach-Colton, J. T. Fineman, Y. R. Fogel, B. C. Kuszmaul, and J. Nelson. Cache-oblivious streaming B-trees. In Proc. 19th Annual ACM Symposium on Parallelism in Algorithms and Architectures (SPAA), pages 81--92, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. M. A. Bender, M. Farach-Colton, W. Jannen, R. Johnson, B. C. Kuszmaul, D. E. Porter, J. Yuan, and Y. Zhan. An introduction to Bε-trees and write-optimization. :login; magazine, 40(5):22--28, October 2015.Google ScholarGoogle Scholar
  11. M. A. Bender, M. Farach-Colton, R. Johnson, R. Kraner, B. C. Kuszmaul, D. Medjedovic, P. Montes, P. Shetty, R. P. Spillane, and E. Zadok. Don't thrash: How to cache your hash on flash. Proc. of the VLDB Endowment (PVLDB), 5(11):1627--1637, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. M. A. Bender, J. T. Fineman, S. Gilbert, and B. C. Kuszmaul. Concurrent cache-oblivious B-trees. In Proc. 17th Annual ACM Symposium on Parallelism in Algorithms and Architectures (SPAA), pages 228--237, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. G. S. Brodal, E. D. Demaine, J. T. Fineman, J. Iacono, S. Langerman, and J. I. Munro. Cache-oblivious dynamic dictionaries with update/query tradeoffs. In Proc.\ 21st Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1448--1456, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. G. S. Brodal and R. Fagerberg. Lower bounds for external memory dictionaries. In Proc. 14th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 546--554, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. A. L. Buchsbaum, M. Goldwasser, S. Venkatasubramanian, and J. R. Westbrook. On external memory graph traversal. In Proc. 11th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 859--860, 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. P. Callahan, M. T. Goodrich, and K. Ramaiyer. Topology B-trees and their applications. In Proc. 4th International Workshop on Algorithms and Data Structures (WADS), pages 381--392, 1995. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. V. Ciriani, P. Ferragina, F. Luccio, and S. Muthukrishnan. Static optimality theorem for external memory string access. In Proc. 43rd Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 219--227, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. L. Devroye. A limit theory for random skip lists. The Annals of Applied Probability, pages 597--609, 1992.Google ScholarGoogle ScholarCross RefCross Ref
  19. M. Fomitchev and E. Ruppert. Lock-free linked lists and skip lists. In Proc. 23rd Annual ACM Symposium on Principles of Distributed Computing (PODC), pages 50--59, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. D. Golovin. The B-skip-list: A simpler uniquely represented alternative to B-trees. arXiv preprint arXiv:1005.0662, 2010.Google ScholarGoogle Scholar
  21. M. T. Goodrich and R. Tamassia. Efficient authenticated dictionaries with skip lists and commutative hashing. US Patent App, 10(416,015), 2000.Google ScholarGoogle Scholar
  22. G. Graefe. Write-optimized B-trees. In Proc. 30th International Conference on Very Large Data Bases (VLDB), pages 672--683, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. G. Graefe. B-tree indexes for high update rates. SIGMOD Rec., 35(1):39--44, Mar. 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. M. Herlihy, Y. Lev, V. Luchangco, and N. Shavit. A provably correct scalable concurrent skip list. In Proc. Conference On Principles of Distributed Systems (OPODIS), 2006.Google ScholarGoogle Scholar
  25. M. Herlihy, Y. Lev, V. Luchangco, and N. Shavit. A simple optimistic skiplist algorithm. Proc. 14th Annual Colloquium on Structural Information and Communication Complexity (SIROCCO), page 124, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. R. Jacob, A. Richa, C. Scheideler, S. Schmid, and H. Täubig. A distributed polylogarithmic time algorithm for self-stabilizing skip graphs. In Proc. 28th ACM Symposium on Principles of Distributed Computing (PODC), pages 131--140, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. C. Jermaine, A. Datta, and E. Omiecinski. A novel index supporting high volume data warehouse insertion. In M. P. Atkinson, M. E. Orlowska, P. Valduriez, S. B. Zdonik, and M. L. Brodie, editors, Proc. 25th International Conference on Very Large Data Bases (VLDB), pages 235--246, 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. P. Kirschenhofer and H. Prodinger. The path length of random skip lists. Acta Informatica, 31(8):775--792, 1994. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. G. Korland, N. Shavit, and P. Felber. Noninvasive concurrency with Java S™. In Third Workshop on Programmability Issues for Multi-Core Computers (MULTIPROG), 2010.Google ScholarGoogle Scholar
  30. Y. Li, B. He, Q. Luo, and K. Yi. Tree indexing on flash disks. In Proc. 2009 IEEE International Conference on Data Engineering (ICDE), pages 1303--1306, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. P. A. Martin. Method and apparatus for implementing a lock-free skip list that supports concurrent accesses, Dec. 11 2007. US Patent 7,308,448.Google ScholarGoogle Scholar
  32. M. M. Michael. High performance dynamic lock-free hash tables and list-based sets. In Proc. 14th Annual ACM Symposium on Parallel Algorithms and Architectures (SPAA), pages 73--82, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. J. Nievergelt and E. M. Reingold. Binary search trees of bounded balance. SIAM Journal on Computing, 2(1):33--43, 1973.Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. P. O'Neil, E. Cheng, D. Gawlic, and E. O'Neil. The log-structured merge-tree (LSM-tree). Acta Informatica, 33(4):351--385, 1996. Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. R. Oshman and N. Shavit. The SkipTrie: low-depth concurrent search without rebalancing. In Proc. 32nd Annual ACM Symposium on Principles of Distributed Computing (PODC), pages 23--32, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. T. Papadakis, J. I. Munro, and P. V. Poblete. Analysis of the expected search cost in skip lists. In Proc. 2nd Scandinavian Workshop on Algorithm Theory (SWAT), pages 160--172, 1990. Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. T. Peierls, B. Goetz, J. Bloch, J. Bowbeer, D. Lea, and D. Holmes. ConcurrentSkipList. In Java concurrency in practice. Pearson Education, 2006.Google ScholarGoogle Scholar
  38. W. Pugh. Concurrent maintenance of lists. Technical report, Dept. of Computer Science, University of Maryland, College Park, 1990. Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. W. Pugh. Skip lists: a probabilistic alternative to balanced trees. Communications of the ACM, 33(6):668--676, 1990. Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. R. Sears, M. Callaghan, and E. A. Brewer. Rose: compressed, log-structured replication. Proc. of the VLDB Endowment, 1(1):526--537, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  41. R. Sears and R. Ramakrishnan. bLSM: a general purpose log structured merge tree. In Proc. ACM SIGMOD International Conference on Management of Data (SIGMOD), pages 217--228, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  42. N. Shamgunov. The MemSQL in-memory database system. In Proc. 2nd International Workshop on In Memory Data Management and Analytics (IMDM), 2014.Google ScholarGoogle Scholar
  43. N. Shavit and I. Lotan. Skiplist-based concurrent priority queues. In Proc. 14th International Parallel and Distributed Processing Symposium (IPDPS), pages 263--268, 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  44. D. D. Sleator and R. E. Tarjan. Amortized efficiency of list update and paging rules. Communications of the ACM, 28(2):202--208, Feb. 1985. Google ScholarGoogle ScholarDigital LibraryDigital Library
  45. G. L. Steele Jr, A. T. Garthwaite, P. A. Martin, N. N. Shavit, M. S. Moir, and D. L. Detlefs. Lock-free implementation of concurrent shared object with dynamic node allocation and distinguishing pointer value, Nov. 30 2004. US Patent 6,826,757.Google ScholarGoogle Scholar
  46. H. Sundell and P. Tsigas. Scalable and lock-free concurrent dictionaries. In Proc. 2004 ACM Symposium on Applied Computing (SAC), pages 1438--1445, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  47. J. D. Valois. Lock-free linked lists using compare-and-swap. In Proc. 14th Annual ACM Symposium on Principles of Distributed Computing (PODC), pages 214--222, 1995. Google ScholarGoogle ScholarDigital LibraryDigital Library
  48. K. Yi. Dynamic indexability and the optimality of b-trees. Journal of the ACM, 59(4):21:1--21:19, Aug. 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Write-Optimized Skip Lists

          Recommendations

          Comments

          Login options

          Check if you have access through your login credentials or your institution to get full access on this article.

          Sign in
          • Published in

            cover image ACM Conferences
            PODS '17: Proceedings of the 36th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems
            May 2017
            458 pages
            ISBN:9781450341981
            DOI:10.1145/3034786

            Copyright © 2017 ACM

            © 2017 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the United States Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

            Publisher

            Association for Computing Machinery

            New York, NY, United States

            Publication History

            • Published: 9 May 2017

            Permissions

            Request permissions about this article.

            Request Permissions

            Check for updates

            Qualifiers

            • research-article

            Acceptance Rates

            PODS '17 Paper Acceptance Rate29of101submissions,29%Overall Acceptance Rate642of2,707submissions,24%

          PDF Format

          View or Download as a PDF file.

          PDF

          eReader

          View online with eReader.

          eReader