DOI: 10.1145/1289816.1289876
Article

Aggressive snoop reduction for synchronized producer-consumer communication in energy-efficient embedded multi-processors

Published: 30 September 2007

Abstract

Snoop-based cache coherence protocols are typically used when multiple processor cores share memory through a common bus. It is well known, however, that these coherence protocols introduce excessive power overhead. To help alleviate this problem, we propose an application-driven customization technique in which application knowledge about data sharing in producer-consumer relationships is used to aggressively eliminate unnecessary and predictable snoop-induced cache tag lookups, even for references to shared data, thus achieving significant power reduction at minimal hardware cost. Snoop-induced cache tag lookups for accesses to both shared and private data are eliminated when it is guaranteed that such lookups would yield no additional knowledge about the cache state with respect to the other caches and memories. The proposed methodology relies on combined support from the compiler, the operating system, and the hardware architecture. Our experiments show average power reductions of more than 80% compared to a general-purpose snoop protocol.
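
To make the idea of eliminating snoop-induced tag lookups concrete, the following is a minimal toy sketch, not the mechanism described in the paper. It assumes a hypothetical static table of address regions (the `Region` type, `may_hold`, and `snoop_lookups` are illustrative names, not taken from the paper) that records which cores may ever cache each region, as a compiler or operating system might supply. It only counts tag lookups on a shared bus with and without that filter; it does not model synchronization phases, coherence states, or power.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Address range [start, end) and the cores that may cache it (assumed known)."""
    start: int
    end: int
    sharers: frozenset


def may_hold(regions, core, addr):
    """Return True if `core` could hold `addr` according to the region table."""
    for r in regions:
        if r.start <= addr < r.end:
            return core in r.sharers
    return True  # unknown address: stay conservative and snoop


def snoop_lookups(trace, num_cores, regions=None):
    """Count snoop-induced tag lookups for a trace of (core, address) bus transactions.

    With regions=None, every other cache performs a tag lookup on every
    transaction (unfiltered baseline). With a region table, a cache skips
    the lookup when it is known never to hold the address.
    """
    lookups = 0
    for core, addr in trace:
        for other in range(num_cores):
            if other == core:
                continue  # the requesting core does not snoop itself
            if regions is not None and not may_hold(regions, other, addr):
                continue  # lookup would add no knowledge, so skip it
            lookups += 1
    return lookups


# Hypothetical example: core 0 has a private region, and cores 0 and 1
# share a producer-consumer buffer. Cores 2 and 3 never touch either.
REGIONS = [
    Region(0x0000, 0x1000, frozenset({0})),     # private to core 0
    Region(0x1000, 0x2000, frozenset({0, 1})),  # producer 0, consumer 1
]
TRACE = [(0, 0x0010), (0, 0x1000), (1, 0x1000), (2, 0x3000)]

baseline = snoop_lookups(TRACE, num_cores=4)
filtered = snoop_lookups(TRACE, num_cores=4, regions=REGIONS)
print(baseline, filtered)  # prints "12 5"
```

In this toy trace the filter removes all lookups for the private access and all but one for each access to the shared buffer, while the access to an unlisted address is still snooped by every other cache. The abstract indicates that the proposed technique goes further and also removes predictable lookups for shared data between producer and consumer, which this sketch does not attempt to capture.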

        Published In

        CODES+ISSS '07: Proceedings of the 5th IEEE/ACM international conference on Hardware/software codesign and system synthesis
        September 2007
        284 pages
        ISBN:9781595938244
        DOI:10.1145/1289816
        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Author Tags

        1. cache coherence
        2. embedded systems
        3. low-power multiprocessor system

        Qualifiers

        • Article

        Conference

        ESWEEK07
        ESWEEK07: Third Embedded Systems Week
        September 30 - October 3, 2007
        Salzburg, Austria

        Acceptance Rates

        Overall Acceptance Rate 280 of 864 submissions, 32%

        Cited By

        • (2022) A Survey of Natural Design for Interaction. Proceedings of Mensch und Computer 2022, pp. 240-254. DOI: 10.1145/3543758.3543773. Online publication date: 4-Sep-2022
        • (2020) Compiler Optimizing for Power Efficiency of On-Chip Memory. Advanced Computer Architecture, pp. 290-303. DOI: 10.1007/978-981-15-8135-9_21. Online publication date: 5-Sep-2020
        • (2012) Design of Cache Controller for Multi-core Systems using Multilevel Scheduling Method. Proceedings of the 2012 Fifth International Conference on Emerging Trends in Engineering and Technology, pp. 167-173. DOI: 10.1109/ICETET.2012.47. Online publication date: 5-Nov-2012
        • (2010) Energy- and Performance-Efficient Communication Framework for Embedded MPSoCs through Application-Driven Release Consistency. ACM Transactions on Design Automation of Electronic Systems 16:1, pp. 1-39. DOI: 10.1145/1870109.1870117. Online publication date: 1-Nov-2010
        • (2009) Low-power inter-core communication through cache partitioning in embedded multiprocessors. Proceedings of the 22nd Annual Symposium on Integrated Circuits and System Design: Chip on the Dunes, pp. 1-6. DOI: 10.1145/1601896.1601903. Online publication date: 31-Aug-2009
        • (2009) Broadcast filtering. Journal of Systems Architecture: the EUROMICRO Journal 55:3, pp. 196-208. DOI: 10.1016/j.sysarc.2009.01.001. Online publication date: 1-Mar-2009
        • (2008) Latency and bandwidth efficient communication through system customization for embedded multiprocessors. Proceedings of the 45th annual Design Automation Conference, pp. 766-771. DOI: 10.1145/1391469.1391665. Online publication date: 8-Jun-2008
