Architectural support for scalable speculative parallelization in shared-memory multiprocessors
Source: International Symposium on Computer Architecture
Proceedings of the 27th annual international symposium on Computer architecture
Vancouver, British Columbia, Canada
Pages: 13 - 24  
Year of Publication: 2000
ISBN:1-58113-232-8
Authors
Marcelo Cintra  Department of Computer Science, University of Illinois at Urbana-Champaign
José F. Martínez  Department of Computer Science, University of Illinois at Urbana-Champaign
Josep Torrellas  Department of Computer Science, University of Illinois at Urbana-Champaign
Sponsor
SIGARCH: ACM Special Interest Group on Computer Architecture
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 7,   Downloads (12 Months): 74,   Citation Count: 40
DOI: http://doi.acm.org/10.1145/339647.363382

ABSTRACT

Speculative parallelization aggressively executes in parallel codes that cannot be fully parallelized by the compiler. Past proposals of hardware schemes have mostly focused on single-chip multiprocessors (CMPs), whose effectiveness is necessarily limited by their small size. Very few schemes have attempted this technique in the context of scalable shared-memory systems. In this paper, we present and evaluate a new hardware scheme for scalable speculative parallelization. This design needs relatively simple hardware and is efficiently integrated into a cache-coherent NUMA system. We have designed the scheme in a hierarchical manner that largely abstracts away the internals of the node. We effectively utilize a speculative CMP as the building block for our scheme. Simulations show that the architecture proposed delivers good speedups at a modest hardware cost. For a set of important non-analyzable scientific loops, we report average speedups of 4.2 for 16 processors. We show that support for per-word speculative state is required by our applications, or else the performance suffers greatly.
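The abstract's last claim, that per-word speculative state matters, can be illustrated with a toy model. The sketch below is not the paper's protocol: the `Access` record, the `count_violations` function, the four-word line size, and the example trace are all assumptions made up for illustration. It only shows why tracking speculative reads per cache line, rather than per word, can flag dependence violations that are not real, which in a speculative system would trigger needless squashes.

```python
from dataclasses import dataclass

WORDS_PER_LINE = 4  # assumed line size in words; illustrative, not from the paper


@dataclass(frozen=True)
class Access:
    iteration: int  # loop iteration (speculative task) issuing the access
    kind: str       # "R" for read, "W" for write
    word: int       # word address touched


def count_violations(trace, granularity):
    """Count writes flagged as cross-iteration dependence violations.

    `trace` lists accesses in the order they occur in time. A write by
    iteration i is flagged when a later iteration (j > i) has already read
    the same tracked unit, because that read happened too early.
    `granularity` is the number of words per tracked unit (1 = per-word).
    This is a hypothetical simplification: no squash or re-execution is modeled.
    """
    latest_reader = {}  # unit -> highest iteration that has read it so far
    violations = 0
    for access in trace:
        unit = access.word // granularity
        if access.kind == "R":
            latest_reader[unit] = max(latest_reader.get(unit, -1), access.iteration)
        elif latest_reader.get(unit, -1) > access.iteration:
            violations += 1
    return violations


# Hypothetical trace: words 0 and 1 share a line but are unrelated data,
# while word 8 carries a genuine read-before-write ordering problem.
trace = [
    Access(iteration=1, kind="R", word=1),
    Access(iteration=0, kind="W", word=0),  # same line as word 1, different word
    Access(iteration=2, kind="R", word=8),
    Access(iteration=1, kind="W", word=8),  # true dependence: iteration 2 read too early
]

per_word = count_violations(trace, granularity=1)
per_line = count_violations(trace, granularity=WORDS_PER_LINE)
print(per_word, per_line)
```

With per-word tracking only the genuine dependence on word 8 is flagged; with per-line tracking the unrelated accesses to words 0 and 1 are flagged as well, since they fall in the same line.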



