DOI: 10.1109/ISCA.2006.21
Article

Flexible Snooping: Adaptive Forwarding and Filtering of Snoops in Embedded-Ring Multiprocessors

Published: 01 May 2006

ABSTRACT

A simple and low-cost approach to supporting snoopy cache coherence is to logically embed a unidirectional ring in the network of a multiprocessor, and use it to transfer snoop messages. Other messages can use any link in the network. While this scheme works for any network topology, a naive implementation may result in long response times or in many snoop messages and snoop operations. To address this problem, this paper proposes Flexible Snooping algorithms, a family of adaptive forwarding and filtering snooping algorithms. In these algorithms, a node receiving a snoop request may either forward it to another node and then perform the snoop, or snoop and then forward it, or simply forward it without snooping. The resulting design space offers trade-offs in number of snoop operations and messages, response time, and energy consumption. Our analysis using SPLASH-2, SPECjbb, and SPECweb workloads finds several snooping algorithms that are more cost-effective than current ones. Specifically, our choice for a high-performance snooping algorithm is faster than the currently fastest algorithm while consuming 9-17% less energy; our choice for an energy-efficient algorithm is only 3-6% slower than the previous one while consuming 36-42% less energy.
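The three per-node choices named in the abstract can be illustrated with a toy model. The sketch below is not the paper's algorithm or simulator: the names `Action` and `walk_ring`, the cost parameters `hop_cost` and `snoop_cost`, and the policies in the usage lines are all assumptions made for illustration. It walks one snoop request around a unidirectional ring and reports how many snoop operations were performed and when the supplier's snoop completed.

```python
from enum import Enum


class Action(Enum):
    FORWARD_THEN_SNOOP = 1  # pass the request on at once, then snoop locally
    SNOOP_THEN_FORWARD = 2  # snoop first; the request waits for the result
    FORWARD_ONLY = 3        # pass the request on without snooping


def walk_ring(n_nodes, requester, supplier, choose, hop_cost=1, snoop_cost=3):
    """Toy walk of one snoop request around a unidirectional ring.

    `choose(node)` is a hypothetical policy returning an Action.
    Returns (snoop_ops, found_at), where found_at is the time at which the
    supplier's snoop finishes, or None if the supplier never snooped.
    All costs are made-up units, not measurements.
    """
    time = 0            # time at which the request reaches the current node
    snoops = 0
    found_at = None
    satisfied = False   # True once a positive result travels with the request
    for step in range(1, n_nodes):
        node = (requester + step) % n_nodes
        time += hop_cost
        # Once the request carries a positive result, later nodes just forward.
        action = Action.FORWARD_ONLY if satisfied else choose(node)
        if action is Action.FORWARD_ONLY:
            continue
        snoops += 1
        if action is Action.SNOOP_THEN_FORWARD:
            time += snoop_cost          # the request is held during the snoop
            if node == supplier:
                found_at = time
                satisfied = True
        elif node == supplier:          # FORWARD_THEN_SNOOP at the supplier
            found_at = time + snoop_cost  # result trails the request
    return snoops, found_at


# Usage: 8 nodes, requester 0, supplier 3 (all values are illustrative).
always_forward_first = walk_ring(8, 0, 3, lambda n: Action.FORWARD_THEN_SNOOP)
always_snoop_first = walk_ring(8, 0, 3, lambda n: Action.SNOOP_THEN_FORWARD)
# Idealized oracle policy that already knows the supplier (an assumption).
oracle = walk_ring(
    8, 0, 3,
    lambda n: Action.SNOOP_THEN_FORWARD if n == 3 else Action.FORWARD_ONLY,
)
# always_forward_first == (7, 6)   every node snoops, supplier found early
# always_snoop_first   == (3, 12)  fewer snoops, but delays accumulate
# oracle               == (1, 6)   one snoop, early result
```

In this toy model, forwarding first minimizes the time to find the supplier at the cost of a snoop in every node, snooping first saves snoops after the supplier at the cost of accumulated delay, and forwarding without snooping is only safe at nodes known not to be the supplier. This is the kind of trade-off space the abstract describes, under the simplifying assumptions stated above.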

          Published in

          ISCA '06: Proceedings of the 33rd annual international symposium on Computer Architecture
          June 2006
          383 pages
          ISBN: 076952608X

          ACM SIGARCH Computer Architecture News, Volume 34, Issue 2
          May 2006
          383 pages
          ISSN: 0163-5964
          DOI: 10.1145/1150019

          Publisher

          IEEE Computer Society

          United States

          Publication History

          • Published: 1 May 2006

          Qualifiers

          • Article

          Acceptance Rates

          ISCA '06 Paper Acceptance Rate: 31 of 234 submissions, 13%
          Overall Acceptance Rate: 543 of 3,203 submissions, 17%
