invited-talk

Rethinking Memory System Design (along with Interconnects)

Author:

Onur Mutlu

NoCArc '15: Proceedings of the 8th International Workshop on Network on Chip Architectures

Page 1

https://doi.org/10.1145/2835512.2835520

Published: 05 December 2015

Abstract

The memory system is a fundamental performance and energy bottleneck in almost all computing systems. Recent system design, application, and technology trends that require more capacity, bandwidth, efficiency, and predictability out of the memory system make it an even more important system bottleneck [27, 28]. At the same time, DRAM technology is experiencing difficult circuit and device scaling challenges that make the maintenance and enhancement of its capacity, energy efficiency, and reliability significantly more costly with conventional techniques (see, for example, [7, 8, 11, 12, 15, 17, 18, 22, 23, 32]).

In this talk, we examine some promising research and design directions to overcome challenges posed by memory scaling. Specifically, we discuss three key solution directions: 1) enabling new memory architectures, functions, interfaces, and better integration of the memory with the rest of the system, including interconnects (e.g., [1, 2, 19, 20, 34-36]); 2) designing a memory system that intelligently employs multiple memory technologies and coordinates memory and storage management using non-volatile memory technologies (e.g., [16-18, 24, 25, 32, 33, 40-42]); and 3) providing predictable performance and QoS to applications sharing the memory system (e.g., [3, 9, 10, 13, 14, 26, 29, 37-39]). As we discuss challenges and solution directions in memory, we will point out research opportunities in interconnects and memory-interconnect co-design (e.g., [2, 4-6, 19, 21, 30, 31]).

References

[1] J. Ahn et al. PIM-Enabled Instructions: A Low-Overhead, Locality-Aware Processing-in-Memory Architecture. In ISCA, 2015.

[2] J. Ahn et al. A Scalable Processing-in-Memory Accelerator for Parallel Graph Processing. In ISCA, 2015.

[3] R. Ausavarungnirun et al. Staged memory scheduling: Achieving high performance and scalability in heterogeneous systems. In ISCA, 2012.

[4] R. Das et al. Application-aware prioritization mechanisms for on-chip networks. In MICRO, 2009.

[5] R. Das et al. Aergia: Exploiting packet latency slack in on-chip networks. In ISCA, 2010.

[6] R. Das et al. Application-to-core mapping policies to reduce memory system interference in multi-core systems. In HPCA, 2013.

[7] H. David et al. Memory power management via dynamic voltage/frequency scaling. In ICAC, 2011.

[8] Q. Deng et al. Memscale: active low-power modes for main memory. In ASPLOS, 2011.

[9] E. Ebrahimi et al. Fairness via source throttling: a configurable and high-performance fairness substrate for multi-core memory systems. In ASPLOS, 2010.

[10] E. Ebrahimi et al. Prefetch-aware shared-resource management for multi-core systems. In ISCA, 2011.

[11] U. Kang et al. Co-architecting controllers and DRAM to enhance DRAM process scaling. In The Memory Forum, 2014.

[12] S. Khan et al. The efficacy of error mitigation techniques for DRAM retention failures: A comparative experimental study. In SIGMETRICS, 2014.

[13] H. Kim et al. Bounding memory interference delay in COTS-based multi-core systems. In RTAS, 2014.

[14] Y. Kim et al. Thread cluster memory scheduling: Exploiting differences in memory access behavior. In MICRO, 2010.

[15] Y. Kim et al. Flipping bits in memory without accessing them: An experimental study of DRAM disturbance errors. In ISCA, 2014.

[16] E. Kultursay et al. Evaluating STT-RAM as an energy-efficient main memory alternative. In ISPASS, 2013.

[17] B. C. Lee et al. Architecting phase change memory as a scalable DRAM alternative. In ISCA, 2009.

[18] B. C. Lee et al. Phase change memory architecture and the quest for scalability. CACM, 53(7), 2010.

[19] D. Lee et al. Tiered-latency DRAM: A low latency and low cost DRAM architecture. In HPCA, 2013.

[20] D. Lee et al. Adaptive-latency DRAM: Optimizing DRAM timing for the common-case. In HPCA, 2015.

[21] D. Lee et al. Decoupled Direct Memory Access: Isolating CPU and IO Traffic by Leveraging a Dual-Data-Port DRAM. In PACT, 2015.

[22] J. Liu et al. RAIDR: Retention-aware intelligent DRAM refresh. In ISCA, 2012.

[23] J. Liu et al. An experimental study of data retention behavior in modern DRAM devices: Implications for retention time profiling mechanisms. In ISCA, 2013.

[24] Y. Lu et al. Loose-ordering consistency for persistent memory. In ICCD, 2014.

[25] J. Meza et al. A case for efficient hardware-software cooperative management of storage and memory. In WEED, 2013.

[26] S. Muralidhara et al. Reducing memory interference in multi-core systems via application-aware memory channel partitioning. In MICRO, 2011.

[27] O. Mutlu. Memory scaling: A systems architecture perspective. In IMW, 2013.

[28] O. Mutlu and L. Subramanian. Research problems and opportunities in memory systems. SUPERFRI, 2014.

[29] O. Mutlu et al. Parallelism-aware batch scheduling: Enhancing both performance and fairness of shared DRAM systems. In ISCA, 2008.

[30] G. Nychis et al. Next generation on-chip networks: What kind of congestion control do we need? In HotNets, 2010.

[31] G. Pekhimenko et al. Toggle-Aware Compression for GPUs. IEEE Comp. Arch. Letters, 2015.

[32] M. K. Qureshi et al. Scalable high performance main memory system using phase-change memory technology. In ISCA, 2009.

[33] J. Ren et al. Dual-scheme checkpointing: A software-transparent mechanism for supporting crash consistency in persistent memory systems. In MICRO, 2015.

[34] V. Seshadri et al. RowClone: Fast and efficient In-DRAM copy and initialization of bulk data. In MICRO, 2013.

[35] V. Seshadri et al. Gather-Scatter DRAM: In-DRAM Address Translation to Improve the Spatial Locality of Non-unit Strided Accesses. In MICRO, 2015.

[36] V. Seshadri et al. Fast Bulk Bitwise AND and OR in DRAM. IEEE Comp. Arch. Letters, 2015.

[37] L. Subramanian, D. Lee, V. Seshadri, H. Rastogi, and O. Mutlu. The blacklisting memory scheduler: Achieving high performance and fairness at low cost. In ICCD, 2014.

[38] L. Subramanian et al. MISE: Providing performance predictability and improving fairness in shared main memory systems. In HPCA, 2013.

[39] L. Subramanian et al. The application slowdown model: Quantifying and controlling the impact of inter-application interference at shared caches and main memory. In MICRO, 2015.

[40] H. Yoon et al. Row buffer locality aware caching policies for hybrid memories. In ICCD, 2012.

[41] H. Yoon et al. Efficient data mapping and buffering techniques for multi-level cell phase-change memories. TACO, 2014.

[42] J. Zhao et al. FIRM: Fair and high-performance memory control for persistent memory systems. In MICRO, 2014.

Index Terms

1. Computer systems organization
2. Hardware
  1. Integrated circuits
    1. Semiconductor memory

Published In

NoCArc '15: Proceedings of the 8th International Workshop on Network on Chip Architectures

December 2015

47 pages

ISBN:9781450339636

DOI:10.1145/2835512

Program Chairs: Masoumeh Ebrahimi (University of Turku, Finland and KTH, Sweden), Riccardo Locatelli (STMicroelectronics)

Copyright © 2015 Owner/Author.

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Invited-talk
Research
Refereed limited

Conference

NoCArc '15

NoCArc '15: International Workshop on Network on Chip Architectures

December 5, 2015

Waikiki, HI, USA

Acceptance Rates

NoCArc '15 Paper Acceptance Rate 6 of 21 submissions, 29%;

Overall Acceptance Rate 46 of 122 submissions, 38%
