ABSTRACT
The performance of future manycore processors will only scale with the number of integrated cores if there is a corresponding increase in memory bandwidth. Projected scaling of electrical DRAM architectures appears unlikely to suffice, being constrained by processor and DRAM pin-bandwidth density and by total DRAM chip power, including off-chip signaling, cross-chip interconnect, and bank access energy. In this work, we redesign the DRAM main memory system using a proposed monolithically integrated silicon photonics technology and show that our photonically interconnected DRAM (PIDRAM) provides a promising solution to all of these issues. Photonics can provide high aggregate pin-bandwidth density through dense wavelength-division multiplexing. Photonic signaling provides energy-efficient communication, which we exploit to reduce not only chip-to-chip interconnect power but also cross-chip interconnect power, by extending the photonic links deep into the actual PIDRAM chips. To complement these large improvements in interconnect bandwidth and power, we decrease the number of bits activated per bank to improve the energy efficiency of the PIDRAM banks themselves. Our most promising design point yields approximately a 10x power reduction for a single-chip PIDRAM channel, with throughput and area similar to those of a projected future electrical-only DRAM. Finally, we propose optical power guiding as a new technique that allows a single PIDRAM chip design to be used efficiently in several multi-chip configurations that provide either increased aggregate capacity or bandwidth.
Re-architecting DRAM memory systems with monolithically integrated silicon photonics
ISCA '10