DOI: 10.1145/3132402.3132403
research-article

A bandwidth accurate, flexible and rapid simulating multi-HMC modeling tool

Published: 02 October 2017

ABSTRACT

Driven by the demand for ever-increasing computing performance, a steadily widening performance gap between memory and processor architectures has emerged. In attempts to mitigate its effects on processing systems that already face the exascale barrier and beyond, energy-efficient computing was identified as the critical topic for further scaling. Memory architectures, persistently known as slow, energy-hungry and cost-intensive, require novel approaches to increase both energy efficiency and bandwidth. A quick fix for the performance aspect seems to be 3D stacking of such planar memories, which is available in the form of High Bandwidth Memory (HBM) and the Hybrid Memory Cube (HMC). Because the latter allows custom logic to be embedded, novel non-von Neumann architectures can be realized, overcoming the performance gap and opening a new path for scaling computing performance. Considering the broad spectrum of custom logic that could be integrated into a mesh of HMCs, comprehensive modeling tools are required that enable holistic design-space exploration of computing systems in breadth and depth. To meet this demand, an HMC modeling tool was implemented that provides rapid simulation of multiple interconnected HMCs and can run in either a functional or a bandwidth-accurate mode. Since flexibility is key for subsequent studies, the tool is parameterizable and its internal components can be adjusted.

Published in

MEMSYS '17: Proceedings of the International Symposium on Memory Systems
October 2017
409 pages
ISBN: 9781450353359
DOI: 10.1145/3132402

Copyright © 2017 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org

Publisher

Association for Computing Machinery
New York, NY, United States
