research-article

A study of connected object locality in NUMA heaps

Author:
Khaled Alnowaiser

University of Glasgow

University of Glasgow
View Profile

MSPC '14: Proceedings of the workshop on Memory Systems Performance and CorrectnessJune 2014Article No.: 1Pages 1–9https://doi.org/10.1145/2618128.2618132

Published:13 June 2014Publication History

MSPC '14: Proceedings of the workshop on Memory Systems Performance and Correctness

Pages 1–9

ABSTRACT

Reference locality is vital to the performance of parallel Garbage Collection (GC) running on Non-Uniform Memory Access (NUMA) machines. A GC thread may trace remotely placed objects that descend from the root set or, for load balance, a GC thread may steal non-local objects from other threads' work lists. Processing distant live objects could introduce contention in the interconnect links between memory nodes and it could increase memory access latency. Researchers have proposed various techniques to improve GC tracing locality. However, few studies attempt to optimize the locality of connected objects in NUMA object graph. In this paper, we study the locality of a rooted subgraph, a unit of object connectivity in the object graph. A rooted subgraph is a set of references containing one root reference, and every reference is reachable from the root. We empirically study the locality of rooted subgraphs of DaCapo and SPECjbb2005 benchmark suites. The results show that more than 80% of objects in a rooted subgraph are located in the same memory node as the root object. We then propose a GC locality optimization that uses the root memory node as a heuristic to guide GC threads processing local objects.

References

T. A. Anderson. Optimizations in a Private Nursery-based Garbage Collector. In Proceedings of the 2010 International Symposium on Memory Management, ISMM '10, pages 21--30, New York, NY, USA, 2010. ACM. Google ScholarDigital Library
S. Auhagen, L. Bergstrom, M. Fluet, and J. Reppy. Garbage Collection for Multicore NUMA Machines. In Proceedings of the 2011 ACM SIGPLAN Workshop on Memory Systems Performance and Correctness, pages 51--57, New York, NY, USA, 2011. ACM. Google ScholarDigital Library
M. Awasthi, D. W. Nellans, K. Sudan, R. Balasubramonian, and A. Davis. Handling the Problems and Opportunities Posed by Multiple On-chip Memory Controllers. In Proceedings of the 19th International Conference on Parallel Architectures and Compilation Techniques, PACT '10, pages 319--330, New York, NY, USA, 2010. ACM. Google ScholarDigital Library
K. Barabash and E. Petrank. Tracing Garbage Collection on Highly Parallel Platforms. In Proceedings of the 2010 International Symposium on Memory Management, ISMM '10, pages 1--10, New York, NY, USA, 2010. ACM. Google ScholarDigital Library
S. M. Blackburn, R. Garner, C. Hoffmann, A. M. Khang, K. S. McKinley, R. Bentzur, A. Diwan, D. Feinberg, D. Frampton, S. Z. Guyer, M. Hirzel, A. Hosking, M. Jump, H. Lee, J. E. B. Moss, A. Phansalkar, D. Stefanović, T. VanDrunen, D. von Dincklage, and B. Wiedermann. The Dacapo Benchmarks: Java Benchmarking Development and Analysis. In Proceedings of the 21st Annual ACM SIGPLAN Conference on Object-oriented Programming Systems, Languages, and Applications, OOPSLA '06, pages 169--190, New York, NY, USA, 2006. ACM. Google ScholarDigital Library
C. J. Cheney. A Nonrecursive List Compacting Algorithm. Communications of the ACM, 13(11):677--678, 1970. Google ScholarDigital Library
Y. Chicha and S. M. Watt. A Localized Tracing Scheme Applied to Garbage Collection. Programming Languages and Systems, 4279:323--339, 2006. Google ScholarDigital Library
T. Domani, G. Goldshtein, E. K. Kolodner, E. Lewis, E. Petrank, and D. Sheinwald. Thread-local Heaps for Java. In Proceedings of the 3rd International Symposium on Memory Management, ISMM '02, pages 76--87, New York, NY, USA, 2002. ACM. Google ScholarDigital Library
T. Endo, K. Taura, and A. Yonezawa. A Scalable Mark-Sweep Garbage Collector on Large-Scale Shared-Memory Machines. In Proceedings of the 1997 ACM/IEEE Conference on Supercomputing, pages 1--14, San Jose, CA, 1997. Google ScholarDigital Library
H. Eran and E. Petrank. A Study of Data Structures with a Deep Heap Shape. In Proceedings of the ACM SIGPLAN Workshop on Memory Systems Performance and Correctness, pages 21--28, Seattle, Washington, 2013. ACM. Google ScholarDigital Library
C. Flood, D. Detlefs, N. Shavit, and X. Zhang. Parallel Garbage Collection for Shared Memory Multiprocessors. In Proceedings of the 2001 Symposium on JavaTM Virtual Machine Research and Technology Symposium, page 21, Monterey, California, 2001. USENIX Association. Google ScholarDigital Library
L. Gidra, G. Thomas, J. Sopena, and M. Shapiro. A Study of the Scalability of Stop-the-world Garbage Collectors on Multicores. In Proceedings of the Eighteenth International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS '13, pages 229--240, New York, NY, USA, 2013. ACM. Google ScholarDigital Library
M. Hirzel, J. Henkel, A. Diwan, and M. Hind. Understanding the Connectivity of Heap Objects. In Proceedings of the 3rd International Symposium on Memory Management, ISMM '02, pages 36--49, New York, NY, USA, 2002. ACM. Google ScholarDigital Library
X. Huang, S. M. Blackburn, K. S. McKinley, J. E. B. Moss, Z. Wang, and P. Cheng. The Garbage Collection Advantage: Improving Program Locality. In Proceedings of the 19th Annual ACM SIGPLAN Conference on Object-oriented Programming, Systems, Languages, and Applications, OOPSLA '04, pages 69--80, New York, NY, USA, 2004. ACM. Google ScholarDigital Library
R. Jones, A. Hosking, and E. Moss. The Garbage Collection Handbook: The Art of Automatic Memory Management. Chapman & Hall/CRC, 1st edition, 2011. Google ScholarDigital Library
R. Jones and A. King. A Fast Analysis for Thread-local Garbage Collection with Dynamic Class Loading. In Fifth IEEE International Workshop on Source Code Analysis and Manipulation, pages 129--138, 2005. Google ScholarDigital Library
M. S. Lam, P. R. Wilson, and T. G. Moher. Object Type Directed Garbage Collection To Improve Locality. In Proceedings of the International Workshop on Memory Management, IWMM '92, pages 404--425, London, UK, UK, 1992. Springer-Verlag. Google ScholarDigital Library
Z. Majo and T. R. Gross. Memory Management in NUMA Multicore Systems: Trapped Between Cache Contention and Interconnect Overhead. In Proceedings of the international symposium on Memory management - ISMM '11, page 11, New York, NY, USA, 2011. ACM. Google ScholarDigital Library
S. Marlow and S. Peyton Jones. Multicore Garbage Collection with Local Heaps. In Proceedings of the International Symposium on Memory Management, ISMM '11, pages 21--32, New York, NY, USA, 2011. ACM. Google ScholarDigital Library
C. E. Oancea, A. Mycroft, and S. M. Watt. A New Approach to Parallelising Tracing Algorithms. In Proceedings of the 2009 International Symposium on Memory Management, ISMM '09, pages 10--19, New York, NY, USA, 2009. ACM. Google ScholarDigital Library
T. Ogasawara. Numa-aware Memory Manager with Dominant-thread-based Copying GC. In Proceedings of the 24th ACM SIGPLAN Conference on Object Oriented Programming Systems Languages and Applications, OOPSLA '09, pages 377--390, New York, NY, USA, 2009. ACM. Google ScholarDigital Library
Y. Shuf, M. Gupta, R. Bordawekar, and J. P. Singh. Exploiting Prolific Types for Memory Management and Optimizations. In Proceedings of the 29th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL '02, pages 295--306, New York, NY, USA, 2002. ACM. Google ScholarDigital Library
Y. Shuf, M. Gupta, H. Franke, A. Appel, and J. P. Singh. Creating and Preserving Locality of Java Applications at Allocation and Garbage Collection Times. In Proceedings of the 17th ACM SIGPLAN Conference on Object-oriented Programming, Systems, Languages, and Applications, OOPSLA '02, pages 13--25, New York, NY, USA, 2002. ACM. Google ScholarDigital Library
SPECjbb2005. Standard Performance Evaluation Corporation. Available at: http://www.spec.org/jbb2005.Google Scholar
B. Steensgaard. Thread-specific Heaps for Multi-threaded Programs. In Proceedings of the 2Nd International Symposium on Memory Management, ISMM '00, pages 18--24, New York, NY, USA, 2000. ACM. Google ScholarDigital Library
M. M. Tikir and J. K. Hollingsworth. Numa-aware Java Heaps for Server Applications. In Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium, IPDPS '05, pages 108.2--, Washington, DC, USA, 2005. IEEE Computer Society. Google ScholarDigital Library
P. R. Wilson, M. S. Lam, and T. G. Moher. Effective "Static-graph" Reorganization to Improve Locality in Garbage-collected Systems. In Proceedings of the ACM SIGPLAN 1991 Conference on Programming Language Design and Implementation, PLDI '91, pages 177--191, New York, NY, USA, 1991. ACM. Google ScholarDigital Library
J. Zhou and B. Demsky. Memory Management for Many-core Processors with Software Configurable Locality Policies. In Proceedings of the 2012 International Symposium on Memory Management, ISMM '12, pages 3--14, New York, NY, USA, 2012. ACM. Google ScholarDigital Library

Index Terms

A study of connected object locality in NUMA heaps

Recommendations

Creating and preserving locality of java applications at allocation and garbage collection times
OOPSLA '02: Proceedings of the 17th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications

The growing gap between processor and memory speeds is motivating the need for optimization strategies that improve data locality. A major challenge is to devise techniques suitable for pointer-intensive applications. This paper presents two techniques ...
Read More
Thread-local heaps for Java
ISMM '02: Proceedings of the 3rd international symposium on Memory management

We present a memory management scheme for Java based on thread-local heaps. Assuming most objects are created and used by a single thread, it is desirable to free the memory manager from redundant synchronization for thread-local objects. Therefore, in ...
Read More
Creating and preserving locality of java applications at allocation and garbage collection times

The growing gap between processor and memory speeds is motivating the need for optimization strategies that improve data locality. A major challenge is to devise techniques suitable for pointer-intensive applications. This paper presents two techniques ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

Published in
MSPC '14: Proceedings of the workshop on Memory Systems Performance and Correctness
June 2014
61 pages
ISBN:9781450329170
DOI:10.1145/2618128
General Chair:
Jeremy Singer
University of Glasgow
,
Program Chairs:
Milind Kulkarni
Purdue University
,
Tim Harris
Oracle Labs
Copyright © 2014 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 13 June 2014
Permissions
Request permissions about this article.
Request Permissions

Check for updates
Author Tags
locality
rooted subgraph
Qualifiers
- research-article
Conference

Acceptance Rates
MSPC '14 Paper Acceptance Rate6of20submissions,30%Overall Acceptance Rate6of20submissions,30%
More
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 1
  Total Citations
  View Citations
- 166
  Total Downloads
- Downloads (Last 12 months)3
- Downloads (Last 6 weeks)0
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.