A Cross-Enclave Composition Mechanism for Exascale System Software

Authors:

Noah Evans, Center for Computing Research, Sandia National Laboratories

Kevin Pedretti, Center for Computing Research, Sandia National Laboratories

Brian Kocoloski, Dept. of Computer Science, University of Pittsburgh

John Lange, Dept. of Computer Science, University of Pittsburgh

Michael Lang, Ultrascale Systems Research Center, Los Alamos National Laboratory

Patrick G. Bridges, Dept. of Computer Science, University of New Mexico

ROSS '16: Proceedings of the 6th International Workshop on Runtime and Operating Systems for Supercomputers, June 2016, Article No.: 3, Pages 1–8, https://doi.org/10.1145/2931088.2931094

Published: 01 June 2016

ABSTRACT

As supercomputers move to exascale, the number of cores per node continues to increase, but the I/O bandwidth between nodes is increasing more slowly, so computational power is outstripping I/O bandwidth. This imbalance, in turn, encourages moving as much of an HPC workflow as possible onto the node in order to minimize data movement. One particular method of application composition, enclaves, co-locates different operating systems and runtimes on the same node, where they communicate through in situ communication mechanisms.

In this work, we describe a mechanism for communicating between composed applications. We implement the mechanism using copy-on-write cooperating with XEMEM shared memory to provide consistent, implicitly unsynchronized communication across enclaves. We then evaluate it using a composed application and analytics running across the Kitten Lightweight Kernel and Linux on top of the Hobbes Operating System and Runtime. The results show a 3% overhead compared to an application running in isolation, demonstrating the viability of this approach.
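The copy-on-write idea mentioned above can be illustrated with a small, generic sketch. This is not the paper's implementation and does not use XEMEM, Kitten, or Hobbes; it assumes only Python's standard `mmap` module, with a temporary file standing in for the shared region. The function names (`read_shared`, `run_demo`) and the explicit "publish" step are hypothetical, chosen for illustration. The sketch shows that a writer holding a private copy-on-write mapping can update its data without any lock, while a reader of the shared copy keeps seeing the last consistent version until the writer publishes.

```python
import mmap
import tempfile


def read_shared(shared_file):
    """Return what a reader of the shared region (the file) currently sees."""
    shared_file.seek(0)
    return shared_file.read()


def run_demo():
    """Illustrative only: a temp file plays the role of the shared region."""
    with tempfile.TemporaryFile() as shared:
        # Version 1 of the data is visible to every reader.
        shared.write(b"step-0001")
        shared.flush()

        # ACCESS_COPY gives a private copy-on-write mapping: writes change
        # this process's memory but are never written back to the file.
        private = mmap.mmap(shared.fileno(), 0, access=mmap.ACCESS_COPY)
        try:
            # The writer updates its private pages with no lock held.
            private[:] = b"step-0002"

            # Readers still see the old, consistent version.
            before_publish = read_shared(shared)

            # Hypothetical publish step: copy the private pages back
            # into the shared region explicitly.
            shared.seek(0)
            shared.write(private[:])
            shared.flush()
            after_publish = read_shared(shared)
        finally:
            private.close()

    return before_publish, after_publish


if __name__ == "__main__":
    before, after = run_demo()
    print("reader before publish:", before)
    print("reader after publish:", after)
```

The design point the sketch is meant to convey is that the writer and reader never coordinate on individual writes; consistency comes from the page-level copy, not from explicit synchronization in the application.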

Published in

ROSS '16: Proceedings of the 6th International Workshop on Runtime and Operating Systems for Supercomputers
June 2016
54 pages
ISBN: 9781450343879
DOI: 10.1145/2931088

Copyright © 2016 ACM
© 2016 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the United States Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.
Publisher
Association for Computing Machinery
New York, NY, United States
Acceptance Rates
ROSS '16 paper acceptance rate: 6 of 10 submissions, 60%. Overall acceptance rate: 58 of 169 submissions, 34%.