HeteroOS: OS Design for Heterogeneous Memory Management in Datacenter

Published: 24 June 2017

Abstract

Heterogeneous memory management combined with server virtualization in datacenters is expected to increase software and OS management complexity. State-of-the-art solutions rely exclusively on the hypervisor (VMM) for expensive page hotness tracking and migrations, limiting the benefits of heterogeneity. To address this, we design HeteroOS, a novel application-transparent OS-level solution for managing memory heterogeneity in virtualized systems. The HeteroOS design first makes the guest OSes heterogeneity-aware and then extracts rich OS-level information about applications' memory usage to place data in the 'right' memory, avoiding page migrations. When such proactive placement is not possible, HeteroOS combines the guest OSes' information about applications with the VMM's hardware control to track hotness and migrate only performance-critical pages. Finally, HeteroOS also provides efficient heterogeneous memory sharing across multiple guest VMs. Evaluation of HeteroOS with memory-, storage-, and network-intensive datacenter applications shows up to 2x performance improvement compared to the state-of-the-art VMM-exclusive approach.
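
The two-stage idea in the abstract (place data in fast memory up front where possible, otherwise track hotness and migrate only the pages that matter) can be illustrated with a toy model. The sketch below is not the HeteroOS implementation: the class `TwoTierMemory`, its methods, and the access-count notion of "hotness" are all hypothetical names and simplifications chosen for illustration, and it ignores virtualization, guest/VMM coordination, and memory sharing entirely.

```python
from collections import Counter


class TwoTierMemory:
    """Toy model (hypothetical): a small fast tier plus an unbounded slow tier."""

    def __init__(self, fast_capacity):
        self.fast_capacity = fast_capacity
        self.fast = set()        # pages currently in fast memory
        self.slow = set()        # pages currently in slow memory
        self.hits = Counter()    # per-page access counts since last rebalance
        self.migrations = 0      # number of page swaps performed

    def allocate(self, page):
        # Proactive placement: use fast memory while it has free capacity,
        # so the page never needs to be migrated later.
        if len(self.fast) < self.fast_capacity:
            self.fast.add(page)
        else:
            self.slow.add(page)

    def access(self, page):
        # Assumed hotness signal: a simple access counter.
        self.hits[page] += 1

    def rebalance(self):
        # Swap the hottest slow page with the coldest fast page, but only
        # while the swap strictly improves placement.
        while self.slow and self.fast:
            hot = max(self.slow, key=lambda p: self.hits[p])
            cold = min(self.fast, key=lambda p: self.hits[p])
            if self.hits[hot] <= self.hits[cold]:
                break
            self.slow.remove(hot)
            self.fast.remove(cold)
            self.fast.add(hot)
            self.slow.add(cold)
            self.migrations += 1
        self.hits.clear()


# Usage: four pages, room for two in fast memory.
mem = TwoTierMemory(fast_capacity=2)
for page in range(4):
    mem.allocate(page)       # pages 0 and 1 land in fast memory
for _ in range(5):
    mem.access(3)            # page 3 (in slow memory) becomes hot
mem.access(0)                # page 0 is lightly used; page 1 is idle
mem.rebalance()              # only page 3 is worth migrating
```

In this toy run a single swap is enough: the idle page 1 is evicted to make room for page 3, and the remaining slow page is left alone because moving it would not help. That is the spirit of migrating only performance-critical pages, under the simplifying assumptions stated above.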



Published in

ACM SIGARCH Computer Architecture News, Volume 45, Issue 2 (ISCA '17)
May 2017, 715 pages
ISSN: 0163-5964
DOI: 10.1145/3140659

ISCA '17: Proceedings of the 44th Annual International Symposium on Computer Architecture
June 2017, 736 pages
ISBN: 9781450348928
DOI: 10.1145/3079856

Copyright © 2017 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery, New York, NY, United States


Qualifiers

• tutorial
• Research
• Refereed limited
