A case for low-complexity MP architectures
Source
Conference on High Performance Networking and Computing archive
Proceedings of the 2007 ACM/IEEE conference on Supercomputing table of contents
Reno, Nevada
SESSION: System architecture table of contents
Article No. 19  
Year of Publication: 2007
ISBN:978-1-59593-764-3
Authors
Håkan Zeffer  Uppsala University, Uppsala, Sweden
Erik Hagersten  Uppsala University, Uppsala, Sweden
Sponsors
IEEE-CS\DATC : IEEE Computer Society
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1362622.1362648

ABSTRACT

Advances in semiconductor technology have driven shared-memory servers toward processors with multiple cores per die and multiple threads per core. This paper presents simple hardware primitives enabling flexible and low-complexity multi-chip designs supporting an efficient inter-node coherence protocol implemented in software.

We argue that our primitives and the example design presented in this paper incur lower hardware overhead, are easier to verify (and can be verified later), and allow more flexible coherence protocols and simpler correction of protocol bugs than traditional designs.

Our evaluation is based on detailed full-system simulations of modern chip-multiprocessors and both commercial and HPC workloads. We compare a low-complexity system based on the proposed primitives with aggressive hardware multi-chip shared-memory systems and show that the performance is competitive across a large design space.

