ABSTRACT
The advent of many-core processors is forcing changes on the operating system. The resources under contention have shifted: previously, CPU cycles were the scarce resource and required fair, precise sharing. Now compute cycles are plentiful, but memory per core is shrinking. In the past, scientific applications used all the CPU cores to finish as fast as possible, with visualization and analysis of the data performed after the simulation completed. With less memory available per core, and with the rising price (in power and time) of storing data on disk or sending it over the network, it now makes sense to run visualization and analytics applications in situ, while the simulation is running. These applications then need to sample the simulation's memory with as little interference, and with as few changes to the simulation code, as possible.
We propose an asynchronous memory sharing facility that allows consistent states of memory to be shared between processes without any implicit or explicit synchronization. We distinguish two types of processes: a single producer and one or more observers. The producer modifies the state of the data and makes consistent versions of that state available to any observer. The observers, each working at its own sampling rate, can access the latest available consistent state.
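As a rough user-space analogy of the snapshot idea, assuming a POSIX system: fork() hands a child a copy-on-write snapshot of the parent's memory, so an "observer" child can read a consistent state while the producer parent keeps mutating its own copy. This is only an illustration, not the paper's kernel-level mechanism.

    /* Illustrative only: COW snapshots via fork(), not the paper's code. */
    #include <stdio.h>
    #include <sys/wait.h>
    #include <unistd.h>

    #define N 1024

    static double state[N];

    static void simulate_step(int step)
    {
        for (int i = 0; i < N; i++)
            state[i] = step + i;          /* producer mutates the state */
    }

    int main(void)
    {
        for (int step = 0; step < 3; step++) {
            simulate_step(step);

            pid_t pid = fork();           /* snapshot via copy-on-write */
            if (pid == 0) {
                /* Observer: sees the state as of the fork, even though
                   the parent continues to modify its own copy. */
                double sum = 0;
                for (int i = 0; i < N; i++)
                    sum += state[i];
                printf("observer saw step %d, sum=%g\n", step, sum);
                fflush(stdout);
                _exit(0);
            }
            /* Producer continues immediately; no locks, no barriers. */
        }
        while (wait(NULL) > 0)
            ;                             /* reap observer children */
        return 0;
    }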
Some applications that would benefit from this type of facility include checkpointing, process monitoring, unobtrusive process debugging, and the sharing of data for visualization or analytics. To evaluate our ideas, we developed two kernel-level implementations for sharing data asynchronously and compared them against a traditional user-space synchronized multi-buffer method (sketched below).
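A minimal sketch of such a synchronized multi-buffer baseline, assuming pthreads; the buffer count and names are illustrative, not the paper's benchmark code. Note the explicit copies and locking on both sides, which are exactly the costs the asynchronous facility avoids.

    /* Illustrative multi-buffer baseline: single producer, many observers. */
    #include <pthread.h>
    #include <string.h>

    #define NBUF 2
    #define N    1024

    static double buffers[NBUF][N];
    static int latest = -1;               /* index of newest complete buffer */
    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

    /* Producer: copy the state into a spare buffer, publish under the lock. */
    void publish(const double *state)
    {
        int next = (latest + 1) % NBUF;   /* only the producer writes latest */
        memcpy(buffers[next], state, sizeof(buffers[next]));
        pthread_mutex_lock(&lock);
        latest = next;                    /* observers now see this version  */
        pthread_mutex_unlock(&lock);
    }

    /* Observer: copy out the newest buffer under the lock. */
    int sample(double *out)
    {
        pthread_mutex_lock(&lock);
        int idx = latest;
        if (idx >= 0)
            memcpy(out, buffers[idx], sizeof(buffers[idx]));
        pthread_mutex_unlock(&lock);
        return idx;                       /* -1 if nothing published yet */
    }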
In our tests we have seen improvements of up to 3.5x over the traditional multi-buffer method when 20% of the data pages are touched.
Index Terms
- Transparently consistent asynchronous shared memory