DOI: 10.1145/1228784.1228818
Article

Reducing snoop-energy in shared bus-based MPSoCs by filtering useless broadcasts

Published: 11 March 2007

Abstract

In shared bus-based multiprocessor systems-on-chip (MPSoCs), snoop-based schemes are widely used to maintain cache coherency. However, many broadcasts are useless: remote caches seldom hold the matching block, so their tag lookups supply no data. From the energy perspective, such tag lookups consume energy unnecessarily. In this paper, we propose a broadcast filtering technique that reduces snoop energy in both the caches and the bus. Broadcast filtering is achieved with the help of a snooping cache and a split bus. The snooping cache checks whether matching blocks exist in remote caches before broadcasting a coherency request. If no remote cache has the matching block, it eliminates the broadcast. If broadcasting is necessary, only part of the split bus is used, so the request is broadcast only to the remote caches that hold matching blocks. Simulation results show that our technique reduces cache lookups, bus usage, and snoop energy by 90%, 50%, and 30%, respectively, with only a 2% degradation in performance. Our technique saves more energy than other state-of-the-art techniques.
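
The filtering decision described in the abstract can be illustrated with a small Python sketch. It is not the paper's implementation: the abstract does not say how the snooping cache is organized or how the split bus is segmented, so the `BroadcastFilter` class, its per-block sharer table, and the `bus_span` helper (which assumes caches sit in index order along a linear bus) are all hypothetical names and simplifications introduced here for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class BroadcastFilter:
    """Toy model of broadcast filtering (assumed structure, not the paper's design)."""
    num_caches: int
    # Assumption: an exact table mapping block address -> ids of caches holding it.
    sharers: dict = field(default_factory=dict)
    lookups_unfiltered: int = 0  # tag lookups a plain broadcast would trigger
    lookups_filtered: int = 0    # tag lookups triggered when filtering is applied

    def record_fill(self, cache_id: int, block: int) -> None:
        """Note that cache_id now holds a copy of block."""
        self.sharers.setdefault(block, set()).add(cache_id)

    def record_evict(self, cache_id: int, block: int) -> None:
        """Note that cache_id no longer holds block."""
        holders = self.sharers.get(block)
        if holders is not None:
            holders.discard(cache_id)
            if not holders:
                del self.sharers[block]

    def targets(self, requester: int, block: int) -> set:
        """Return the remote caches that must see the coherency request.

        An empty set means the broadcast can be eliminated entirely.
        """
        remote = self.sharers.get(block, set()) - {requester}
        self.lookups_unfiltered += self.num_caches - 1
        self.lookups_filtered += len(remote)
        return remote


def bus_span(requester: int, remote: set) -> range:
    """Cache positions the request must reach on the split bus.

    Assumption: caches are attached in index order along a linear bus, so
    only the stretch between the outermost participants has to be driven.
    """
    if not remote:
        return range(0)
    members = remote | {requester}
    return range(min(members), max(members) + 1)


if __name__ == "__main__":
    bf = BroadcastFilter(num_caches=4)
    bf.record_fill(cache_id=1, block=0x40)
    bf.record_fill(cache_id=2, block=0x40)
    print(bf.targets(requester=0, block=0x80))  # no remote copy: nothing to broadcast
    print(bf.targets(requester=1, block=0x40))  # only cache 2 needs to be snooped
    print(list(bus_span(1, {2})))               # only positions 1 and 2 are driven
```

In this toy run, two requests would cost six remote tag lookups under a plain broadcast but only one with filtering, which is the kind of saving the abstract attributes to eliminating useless broadcasts.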

Cited By

  • (2007) Broadcast filtering-aware task assignment techniques for low-power MPSoCs. Proceedings of the 2007 workshop on MEmory performance: DEaling with Applications, systems and architecture, pp. 89-96. DOI: 10.1145/1327171.1327182. Online publication date: 16-Sep-2007
  • (2007) An Energy Efficient Parallel Architecture Using Near Threshold Operation. 16th International Conference on Parallel Architecture and Compilation Techniques (PACT 2007), pp. 175-188. DOI: 10.1109/PACT.2007.4336210. Online publication date: Sep-2007

Published In

GLSVLSI '07: Proceedings of the 17th ACM Great Lakes symposium on VLSI
March 2007
626 pages
ISBN:9781595936059
DOI:10.1145/1228784

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. MPSoC
  2. broadcast filtering
  3. low-energy cache coherency

Qualifiers

  • Article

Conference

GLSVLSI07
GLSVLSI07: Great Lakes Symposium on VLSI 2007
March 11 - 13, 2007
Stresa-Lago Maggiore, Italy

Acceptance Rates

Overall Acceptance Rate 312 of 1,156 submissions, 27%
