ABSTRACT
As supercomputers move to exascale, the number of cores per node continues to increase, while the I/O bandwidth between nodes grows more slowly, so computational power is outstripping I/O bandwidth. This imbalance, in turn, encourages moving as much of an HPC workflow as possible onto the node to minimize data movement. One method of application composition, enclaves, co-locates different operating systems and runtimes on the same node, where they communicate through in situ communication mechanisms.
In this work, we describe a mechanism for communicating between composed applications. We implement this mechanism using copy-on-write in cooperation with XEMEM shared memory to provide consistent, implicitly unsynchronized communication across enclaves. We then evaluate it using a composed application and analytics spanning the Kitten Lightweight Kernel and Linux on top of the Hobbes Operating System and Runtime. The results show a 3% overhead compared to an application running in isolation, demonstrating the viability of this approach.
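To make the copy-on-write idea concrete, the following is a minimal user-space sketch in Python, not the authors' implementation. It assumes a page-granular region in which a snapshot initially shares every page with the live data, and the first write to a shared page gives the writer a private copy so the reader's view stays consistent without explicit locking. The class `CowRegion` and its methods are hypothetical names introduced only for illustration. The real mechanism operates on hardware page tables across enclaves via XEMEM, which this toy model does not attempt to reproduce.

```python
class CowRegion:
    """Toy page-granular copy-on-write region (illustrative only)."""

    def __init__(self, num_pages, page_size=4):
        self.page_size = page_size
        # Live pages that the producer (e.g. a simulation) writes into.
        self.pages = [bytearray(page_size) for _ in range(num_pages)]
        self.snapshot = None  # page list seen by the consumer (e.g. analytics)
        self.copies = 0       # number of pages actually duplicated so far

    def take_snapshot(self):
        # The snapshot shares every page object with the live region;
        # no page data is copied at this point.
        self.snapshot = list(self.pages)

    def write(self, page_no, data):
        if len(data) != self.page_size:
            raise ValueError("data must be exactly one page long")
        page = self.pages[page_no]
        if self.snapshot is not None and self.snapshot[page_no] is page:
            # First write to a page still shared with the snapshot:
            # the producer gets a private copy, the snapshot keeps the old page.
            page = bytearray(page)
            self.pages[page_no] = page
            self.copies += 1
        page[:] = data

    def read_snapshot(self, page_no):
        # The consumer reads the frozen view without any synchronization.
        return bytes(self.snapshot[page_no])


if __name__ == "__main__":
    region = CowRegion(num_pages=3)
    region.write(0, b"aaaa")
    region.take_snapshot()
    region.write(0, b"bbbb")  # triggers one page copy
    region.write(0, b"cccc")  # page is already private, no further copy
    print(region.read_snapshot(0), bytes(region.pages[0]), region.copies)
```

In this sketch only pages that the producer modifies after the snapshot are duplicated, so the copying cost scales with the number of pages written rather than with the size of the shared region.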