research-article

Open access

Fast, multicore-scalable, low-fragmentation memory allocation through large virtual memory and global data structures

Authors:

Christoph M. Kirsch,

Michael Lippautz,

Ana SokolovaAuthors Info & Claims

OOPSLA 2015: Proceedings of the 2015 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications

Pages 451 - 469

https://doi.org/10.1145/2814270.2814294

Published: 23 October 2015 Publication History

Abstract

We demonstrate that general-purpose memory allocation involving many threads on many cores can be done with high performance, multicore scalability, and low memory consumption. For this purpose, we have designed and implemented scalloc, a concurrent allocator that generally performs and scales in our experiments better than other allocators while using less memory, and is still competitive otherwise. The main ideas behind the design of scalloc are: uniform treatment of small and big objects through so-called virtual spans, efficiently and effectively reclaiming free memory through fast and scalable global data structures, and constant-time (modulo synchronization) allocation and deallocation operations that trade off memory reuse and spatial locality without being subject to false sharing.

Supplementary Material

Auxiliary Archive (p451-aigner-s.zip)

The artifact can be used to reproduce the results presented in the paper, i.e., it can be used to evaluate scalloc against other interesting allocators.

Download
2330.75 MB

References

[1]

Y. Afek, G. Korland, and E. Yanovsky. Quasi-linearizability: Relaxed consistency for improved concurrency. In Proc. Conference on Principles of Distributed Systems (OPODIS), pages 395–410. Springer, 2010. 29.

Digital Library

[2]

M. Aigner and C. Kirsch. ACDC: Towards a universal mutator for benchmarking heap management systems. In Proc. International Symposium on Memory Management (ISMM), pages 75–84. ACM, 2013.

Digital Library

[3]

E. Berger, K. McKinley, R. Blumofe, and P. Wilson. Hoard: a scalable memory allocator for multithreaded applications. In Proc. International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), pages 117–128. ACM, 2000.

Digital Library

[4]

E. Berger, B. Zorn, and K. McKinley. Reconsidering custom memory allocation. In Proc. Conference on Object-oriented Programming, Systems, Languages, and Applications (OOPSLA), pages 1–12. ACM, 2002.

Digital Library

[5]

S. Blackburn, P. Cheng, and K. McKinley. Oil and water? High performance garbage collection in Java with MMTk. In Proc. International Conference on Software Engineering (ICSE). IEEE, 2004.

Digital Library

[6]

A. Clements, M. Kaashoek, and N. Zeldovich. RadixVM: Scalable address spaces for multithreaded applications. In Proc. ACM European Conference on Computer Systems (EuroSys), pages 211–224. ACM, 2013.

Digital Library

[7]

M. Dodds, A. Haas, and C. Kirsch. A scalable, correct time-stamped stack. In Proc. Symposium on Principles of Programming Languages (POPL), pages 233–246. ACM, 2015.

Digital Library

[8]

[9]

J. Evans. A scalable concurrent malloc(3) implementation for freebsd. In Proc. BSDCan, 2006.

[10]

W. Gloger. ptmalloc2 – a multi-thread malloc implementation. http://malloc.de/en/.

[11]

Google Inc. gperftools: Fast, multi-threaded malloc() and nifty performance analysis tools. http://code.google.com/p/ gperftools/.

[12]

A. Haas, T. Henzinger, C. Kirsch, M. Lippautz, H. Payer, A. Sezgin, and A. Sokolova. Distributed queues in shared memory—multicore performance and scalability through quantitative relaxation. In Proc. International Conference on Computing Frontiers (CF), pages 17:1–17:9. ACM, 2013.

Digital Library

[13]

A. Haas, T. Henzinger, A. Holzer, C. Kirsch, M. Lippautz, H. Payer, A. Sezgin, A. Sokolova, and H. Veith. Local linearizability. CoRR, abs/1502.07118, 2015.

[14]

D. Hendler, I. Incze, N. Shavit, and M. Tzafrir. Flat combining and the synchronization-parallelism tradeoff. In Proc. Symposium on Parallel Algorithms and Architectures (SPAA), pages 355–364. ACM, 2010.

Digital Library

[15]

J. L. Henning. Spec cpu2006 benchmark descriptions. SIGARCH Comput. Archit. News, 34(4):1–17, 2006.

Digital Library

[16]

[17]

T. Henzinger, C. Kirsch, H. Payer, A. Sezgin, and A. Sokolova. Quantitative relaxation of concurrent data structures. In Proc. Symposium on Principles of Programming Languages (POPL), pages 317–328. ACM, 2013.

Digital Library

[18]

M. Herlihy and N. Shavit. The Art of Multiprocessor Programming. Morgan Kaufmann Publishers Inc., 2008.

Digital Library

[19]

R. Hudson, B. Saha, A.-R. Adl-Tabatabai, and B. Hertzberg. Mcrt-malloc: a scalable transactional memory allocator. In Proc. International Symposium on Memory Management (ISMM), pages 74–83. ACM, 2006.

Digital Library

[20]

1133967.

[21]

Intel Corporation. Thread building blocks (tbb). http: //threadingbuildingblocks.org.

[22]

A. Kogan and E. Petrank. A methodology for creating fast waitfree data structures. In Proc. Symposium on Principles and Practice of Parallel Programming (PPoPP), pages 141–150. ACM, 2012.

Digital Library

[23]

B. Kuszmaul. Supermalloc: A super fast multithreaded malloc for 64-bit machines. In Proc. International Symposium on Memory Management (ISMM), pages 41–55. ACM, 2015.

Digital Library

[24]

P.-A. Larson and M. Krishnan. Memory allocation for longrunning server applications. In Proc. International Symposium on Memory Management (ISMM), pages 176–185. ACM, 1998.

Digital Library

[25]

[26]

D. Lea. A memory allocator. http://g.oswego.edu/dl/ html/malloc.html.

[27]

Lockless Inc. llalloc: Lockless memory allocator. http: //locklessinc.com/.

[28]

M. Michael. Hazard pointers: Safe memory reclamation for lock-free objects. IEEE Trans. Parallel Distrib. Syst., 15(6): 491–504, 2004.

Digital Library

[29]

M. Michael. Scalable lock-free dynamic memory allocation. In Proc. Conference on Programming Language Design and Implementation (PLDI), pages 35–46. ACM, 2004. 1145/996893.996848.

Digital Library

[30]

MicroQuill Inc. shbench. http://www.microquill.com/.

[31]

S. Schneider, C. Antonopoulos, and D. Nikolopoulos. Scalable locality-conscious multithreaded memory allocation. In Proc. International Symposium on Memory Management (ISMM), pages 84–94. ACM, 2006.

Digital Library

[32]

K. Serebryany, D. Bruening, A. Potapenko, and D. Vyukov. Addresssanitizer: A fast address sanity checker. In Proc. USENIX Conference on Annual Technical Conference (USENIX ATC), pages 28–28. USENIX Association, 2012.

Digital Library

[33]

R. Treiber. Systems programming: Coping with parallelism. Technical Report RJ-5118, IBM Research Center, 1986.

Cited By

Zhou JSilvestro STang SYang HLiu HZeng GWu BLiu CLiu T(2023)MemPerf: Profiling Allocator-Induced Performance SlowdownsProceedings of the ACM on Programming Languages10.1145/36228487:OOPSLA2(1418-1441)Online publication date: 16-Oct-2023
https://dl.acm.org/doi/10.1145/3622848
Yang HZhao XZhou JWang WKundu SWu BGuan HLiu TBlackburn SPetrank E(2023)NUMAlloc: A Faster NUMA Memory AllocatorProceedings of the 2023 ACM SIGPLAN International Symposium on Memory Management10.1145/3591195.3595276(97-110)Online publication date: 6-Jun-2023
https://dl.acm.org/doi/10.1145/3591195.3595276
Cronburg KGuyer SSchaefer IReichenbach CStorm T(2019)Floorplan: spatial layout in memory management systemsProceedings of the 18th ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences10.1145/3357765.3359519(81-93)Online publication date: 21-Oct-2019
https://dl.acm.org/doi/10.1145/3357765.3359519
Show More Cited By

Index Terms

Fast, multicore-scalable, low-fragmentation memory allocation through large virtual memory and global data structures
1. Information systems
  1. Information storage systems
    1. Storage management
2. Software and its engineering
  1. Software notations and tools
    1. Compilers
  2. Software organization and properties
    1. Contextual software domains
      1. Operating systems
        File systems management
        Memory management

Recommendations

Fast, multicore-scalable, low-fragmentation memory allocation through large virtual memory and global data structures
OOPSLA '15

We demonstrate that general-purpose memory allocation involving many threads on many cores can be done with high performance, multicore scalability, and low memory consumption. For this purpose, we have designed and implemented scalloc, a concurrent ...
Redesign the Memory Allocator for Non-Volatile Main Memory
Special Issue on Hardware and Algorithms for Learning On-a-chip and Special Issue on Alternative Computing Systems

The non-volatile memory (NVM) has the merits of byte-addressability, fast speed, persistency and low power consumption, which make it attractive to be used as main memory. Commonly, user process dynamically acquires memory through memory allocators. ...
Energy-aware flash memory management in virtual memory system

The traditional virtual memory system is designed for decades assuming a magnetic disk as the secondary storage. Recently, flash memory becomes a popular storage alternative for many portable devices with the continuing improvements on its capacity, ...

Comments

Information & Contributors

Information

Published In

cover image ACM Conferences

OOPSLA 2015: Proceedings of the 2015 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications

October 2015

953 pages

ISBN:9781450336895

DOI:10.1145/2814270

General Chair:
Jonathan Aldrich
Carnegie Mellon University, USA
,
Program Chair:
Patrick Eugster
Purdue University, USA

ACM SIGPLAN Notices Volume 50, Issue 10
OOPSLA '15
October 2015
953 pages
ISSN:0362-1340
EISSN:1558-1160
DOI:10.1145/2858965
Editor:
Andy Gill
University of Kansas, Lawrence, KS
Issue’s Table of Contents

Copyright © 2015 Owner/Author.

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Sponsors

SIGPLAN: ACM Special Interest Group on Programming Languages

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 23 October 2015

Check for updates

Author Tags

Qualifiers

Research-article

Funding Sources

Google PhD Fellowship
Austrian Science Fund

Conference

SPLASH '15

Sponsor:

SIGPLAN

SPLASH '15: Conference on Systems, Programming, Languages, and Applications: Software for Humanity

October 25 - 30, 2015

PA, Pittsburgh, USA

Acceptance Rates

Overall Acceptance Rate 268 of 1,244 submissions, 22%

Upcoming Conference

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

23
Total Citations
View Citations
1,318
Total Downloads

Downloads (Last 12 months)197
Downloads (Last 6 weeks)23

Reflects downloads up to 20 Feb 2025

Other Metrics

View Author Metrics

Citations

Cited By

Zhou JSilvestro STang SYang HLiu HZeng GWu BLiu CLiu T(2023)MemPerf: Profiling Allocator-Induced Performance SlowdownsProceedings of the ACM on Programming Languages10.1145/36228487:OOPSLA2(1418-1441)Online publication date: 16-Oct-2023
https://dl.acm.org/doi/10.1145/3622848
Yang HZhao XZhou JWang WKundu SWu BGuan HLiu TBlackburn SPetrank E(2023)NUMAlloc: A Faster NUMA Memory AllocatorProceedings of the 2023 ACM SIGPLAN International Symposium on Memory Management10.1145/3591195.3595276(97-110)Online publication date: 6-Jun-2023
https://dl.acm.org/doi/10.1145/3591195.3595276
Cronburg KGuyer SSchaefer IReichenbach CStorm T(2019)Floorplan: spatial layout in memory management systemsProceedings of the 18th ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences10.1145/3357765.3359519(81-93)Online publication date: 21-Oct-2019
https://dl.acm.org/doi/10.1145/3357765.3359519
Chen HLi CFang JHuang CCheng JZhang JHou YYan X(2019)GrasperProceedings of the ACM Symposium on Cloud Computing10.1145/3357223.3362715(87-100)Online publication date: 20-Nov-2019
https://dl.acm.org/doi/10.1145/3357223.3362715
Das RTsai SDuppala SLynch JArkin EChowdhury RMitchell JSkiena SScheideler CBerenbrink P(2019)Data Races and the Discrete Resource-time Tradeoff Problem with Resource Reuse over PathsThe 31st ACM Symposium on Parallelism in Algorithms and Architectures10.1145/3323165.3323209(359-368)Online publication date: 17-Jun-2019
https://dl.acm.org/doi/10.1145/3323165.3323209
Li PLuo HDing CSinger JXu H(2019)Timescale functions for parallel memory allocationProceedings of the 2019 ACM SIGPLAN International Symposium on Memory Management10.1145/3315573.3329987(64-78)Online publication date: 23-Jun-2019
https://dl.acm.org/doi/10.1145/3315573.3329987
Liétar PButler TClebsch SDrossopoulou SFranco JParkinson MShamis AWintersteiger CChisnall DSinger JXu H(2019)snmalloc: a message passing allocatorProceedings of the 2019 ACM SIGPLAN International Symposium on Memory Management10.1145/3315573.3329980(122-135)Online publication date: 23-Jun-2019
https://dl.acm.org/doi/10.1145/3315573.3329980
Guerraoui RGuiroux HLachaize RQuéma VTrigonakis V(2019)Lock–UnlockACM Transactions on Computer Systems10.1145/330150136:1(1-149)Online publication date: 14-Mar-2019
https://dl.acm.org/doi/10.1145/3301501
Pupykina AAgosta G(2019)Survey of Memory Management Techniques for HPC and Cloud ComputingIEEE Access10.1109/ACCESS.2019.29541697(167351-167373)Online publication date: 2019
https://doi.org/10.1109/ACCESS.2019.2954169
Ippen TEppler JPlesser HDiesmann M(2017)Constructing Neuronal Network Models in Massively Parallel EnvironmentsFrontiers in Neuroinformatics10.3389/fninf.2017.0003011Online publication date: 16-May-2017
https://doi.org/10.3389/fninf.2017.00030
Show More Cited By

View Options

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

Figures

Tables

Media

View Table of Conten