skip to main content
10.1145/2814270.2814294acmconferencesArticle/Chapter ViewAbstractPublication PagessplashConference Proceedingsconference-collections
research-article
Open access

Fast, multicore-scalable, low-fragmentation memory allocation through large virtual memory and global data structures

Published: 23 October 2015 Publication History

Abstract

We demonstrate that general-purpose memory allocation involving many threads on many cores can be done with high performance, multicore scalability, and low memory consumption. For this purpose, we have designed and implemented scalloc, a concurrent allocator that generally performs and scales in our experiments better than other allocators while using less memory, and is still competitive otherwise. The main ideas behind the design of scalloc are: uniform treatment of small and big objects through so-called virtual spans, efficiently and effectively reclaiming free memory through fast and scalable global data structures, and constant-time (modulo synchronization) allocation and deallocation operations that trade off memory reuse and spatial locality without being subject to false sharing.

Supplementary Material

Auxiliary Archive (p451-aigner-s.zip)
The artifact can be used to reproduce the results presented in the paper, i.e., it can be used to evaluate scalloc against other interesting allocators.

References

[1]
Y. Afek, G. Korland, and E. Yanovsky. Quasi-linearizability: Relaxed consistency for improved concurrency. In Proc. Conference on Principles of Distributed Systems (OPODIS), pages 395–410. Springer, 2010. 29.
[2]
M. Aigner and C. Kirsch. ACDC: Towards a universal mutator for benchmarking heap management systems. In Proc. International Symposium on Memory Management (ISMM), pages 75–84. ACM, 2013.
[3]
E. Berger, K. McKinley, R. Blumofe, and P. Wilson. Hoard: a scalable memory allocator for multithreaded applications. In Proc. International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), pages 117–128. ACM, 2000.
[4]
E. Berger, B. Zorn, and K. McKinley. Reconsidering custom memory allocation. In Proc. Conference on Object-oriented Programming, Systems, Languages, and Applications (OOPSLA), pages 1–12. ACM, 2002.
[5]
S. Blackburn, P. Cheng, and K. McKinley. Oil and water? High performance garbage collection in Java with MMTk. In Proc. International Conference on Software Engineering (ICSE). IEEE, 2004.
[6]
A. Clements, M. Kaashoek, and N. Zeldovich. RadixVM: Scalable address spaces for multithreaded applications. In Proc. ACM European Conference on Computer Systems (EuroSys), pages 211–224. ACM, 2013.
[7]
M. Dodds, A. Haas, and C. Kirsch. A scalable, correct time-stamped stack. In Proc. Symposium on Principles of Programming Languages (POPL), pages 233–246. ACM, 2015.
[9]
J. Evans. A scalable concurrent malloc(3) implementation for freebsd. In Proc. BSDCan, 2006.
[10]
W. Gloger. ptmalloc2 – a multi-thread malloc implementation. http://malloc.de/en/.
[11]
Google Inc. gperftools: Fast, multi-threaded malloc() and nifty performance analysis tools. http://code.google.com/p/ gperftools/.
[12]
A. Haas, T. Henzinger, C. Kirsch, M. Lippautz, H. Payer, A. Sezgin, and A. Sokolova. Distributed queues in shared memory—multicore performance and scalability through quantitative relaxation. In Proc. International Conference on Computing Frontiers (CF), pages 17:1–17:9. ACM, 2013.
[13]
A. Haas, T. Henzinger, A. Holzer, C. Kirsch, M. Lippautz, H. Payer, A. Sezgin, A. Sokolova, and H. Veith. Local linearizability. CoRR, abs/1502.07118, 2015.
[14]
D. Hendler, I. Incze, N. Shavit, and M. Tzafrir. Flat combining and the synchronization-parallelism tradeoff. In Proc. Symposium on Parallel Algorithms and Architectures (SPAA), pages 355–364. ACM, 2010.
[15]
J. L. Henning. Spec cpu2006 benchmark descriptions. SIGARCH Comput. Archit. News, 34(4):1–17, 2006.
[17]
T. Henzinger, C. Kirsch, H. Payer, A. Sezgin, and A. Sokolova. Quantitative relaxation of concurrent data structures. In Proc. Symposium on Principles of Programming Languages (POPL), pages 317–328. ACM, 2013.
[18]
M. Herlihy and N. Shavit. The Art of Multiprocessor Programming. Morgan Kaufmann Publishers Inc., 2008.
[19]
R. Hudson, B. Saha, A.-R. Adl-Tabatabai, and B. Hertzberg. Mcrt-malloc: a scalable transactional memory allocator. In Proc. International Symposium on Memory Management (ISMM), pages 74–83. ACM, 2006.
[20]
[21]
Intel Corporation. Thread building blocks (tbb). http: //threadingbuildingblocks.org.
[22]
A. Kogan and E. Petrank. A methodology for creating fast waitfree data structures. In Proc. Symposium on Principles and Practice of Parallel Programming (PPoPP), pages 141–150. ACM, 2012.
[23]
B. Kuszmaul. Supermalloc: A super fast multithreaded malloc for 64-bit machines. In Proc. International Symposium on Memory Management (ISMM), pages 41–55. ACM, 2015.
[24]
P.-A. Larson and M. Krishnan. Memory allocation for longrunning server applications. In Proc. International Symposium on Memory Management (ISMM), pages 176–185. ACM, 1998.
[26]
D. Lea. A memory allocator. http://g.oswego.edu/dl/ html/malloc.html.
[27]
Lockless Inc. llalloc: Lockless memory allocator. http: //locklessinc.com/.
[28]
M. Michael. Hazard pointers: Safe memory reclamation for lock-free objects. IEEE Trans. Parallel Distrib. Syst., 15(6): 491–504, 2004.
[29]
M. Michael. Scalable lock-free dynamic memory allocation. In Proc. Conference on Programming Language Design and Implementation (PLDI), pages 35–46. ACM, 2004. 1145/996893.996848.
[30]
MicroQuill Inc. shbench. http://www.microquill.com/.
[31]
S. Schneider, C. Antonopoulos, and D. Nikolopoulos. Scalable locality-conscious multithreaded memory allocation. In Proc. International Symposium on Memory Management (ISMM), pages 84–94. ACM, 2006.
[32]
K. Serebryany, D. Bruening, A. Potapenko, and D. Vyukov. Addresssanitizer: A fast address sanity checker. In Proc. USENIX Conference on Annual Technical Conference (USENIX ATC), pages 28–28. USENIX Association, 2012.
[33]
R. Treiber. Systems programming: Coping with parallelism. Technical Report RJ-5118, IBM Research Center, 1986.

Cited By

View all
  • (2023)MemPerf: Profiling Allocator-Induced Performance SlowdownsProceedings of the ACM on Programming Languages10.1145/36228487:OOPSLA2(1418-1441)Online publication date: 16-Oct-2023
  • (2023)NUMAlloc: A Faster NUMA Memory AllocatorProceedings of the 2023 ACM SIGPLAN International Symposium on Memory Management10.1145/3591195.3595276(97-110)Online publication date: 6-Jun-2023
  • (2019)Floorplan: spatial layout in memory management systemsProceedings of the 18th ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences10.1145/3357765.3359519(81-93)Online publication date: 21-Oct-2019
  • Show More Cited By

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Conferences
OOPSLA 2015: Proceedings of the 2015 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications
October 2015
953 pages
ISBN:9781450336895
DOI:10.1145/2814270
  • cover image ACM SIGPLAN Notices
    ACM SIGPLAN Notices  Volume 50, Issue 10
    OOPSLA '15
    October 2015
    953 pages
    ISSN:0362-1340
    EISSN:1558-1160
    DOI:10.1145/2858965
    • Editor:
    • Andy Gill
    Issue’s Table of Contents
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 23 October 2015

Check for updates

Author Tags

  1. Memory allocator
  2. concurrent data structures
  3. multicore scalability
  4. virtual memory

Qualifiers

  • Research-article

Funding Sources

Conference

SPLASH '15
Sponsor:

Acceptance Rates

Overall Acceptance Rate 268 of 1,244 submissions, 22%

Upcoming Conference

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)197
  • Downloads (Last 6 weeks)23
Reflects downloads up to 20 Feb 2025

Other Metrics

Citations

Cited By

View all
  • (2023)MemPerf: Profiling Allocator-Induced Performance SlowdownsProceedings of the ACM on Programming Languages10.1145/36228487:OOPSLA2(1418-1441)Online publication date: 16-Oct-2023
  • (2023)NUMAlloc: A Faster NUMA Memory AllocatorProceedings of the 2023 ACM SIGPLAN International Symposium on Memory Management10.1145/3591195.3595276(97-110)Online publication date: 6-Jun-2023
  • (2019)Floorplan: spatial layout in memory management systemsProceedings of the 18th ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences10.1145/3357765.3359519(81-93)Online publication date: 21-Oct-2019
  • (2019)GrasperProceedings of the ACM Symposium on Cloud Computing10.1145/3357223.3362715(87-100)Online publication date: 20-Nov-2019
  • (2019)Data Races and the Discrete Resource-time Tradeoff Problem with Resource Reuse over PathsThe 31st ACM Symposium on Parallelism in Algorithms and Architectures10.1145/3323165.3323209(359-368)Online publication date: 17-Jun-2019
  • (2019)Timescale functions for parallel memory allocationProceedings of the 2019 ACM SIGPLAN International Symposium on Memory Management10.1145/3315573.3329987(64-78)Online publication date: 23-Jun-2019
  • (2019)snmalloc: a message passing allocatorProceedings of the 2019 ACM SIGPLAN International Symposium on Memory Management10.1145/3315573.3329980(122-135)Online publication date: 23-Jun-2019
  • (2019)Lock–UnlockACM Transactions on Computer Systems10.1145/330150136:1(1-149)Online publication date: 14-Mar-2019
  • (2019)Survey of Memory Management Techniques for HPC and Cloud ComputingIEEE Access10.1109/ACCESS.2019.29541697(167351-167373)Online publication date: 2019
  • (2017)Constructing Neuronal Network Models in Massively Parallel EnvironmentsFrontiers in Neuroinformatics10.3389/fninf.2017.0003011Online publication date: 16-May-2017
  • Show More Cited By

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Login options

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media