Operation tables for scheduling in the presence of incomplete bypassing
Source
International Conference on Hardware Software Codesign
Proceedings of the 2nd IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
Stockholm, Sweden
SESSION: Software and hardware techniques for performance optimisation of embedded applications
Pages: 194 - 199  
Year of Publication: 2004
ISBN: 1-58113-937-3
Authors
Aviral Shrivastava  University of California, Irvine, CA
Eugene Earlie  Intel Labs, Hudson, MA
Nikil Dutt  University of California, Irvine, CA
Alex Nicolau  University of California, Irvine, CA
Sponsors
SIGDA: ACM Special Interest Group on Design Automation
SIGBED: ACM Special Interest Group on Embedded Systems
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1016720.1016768

ABSTRACT

Register bypassing is a powerful and widely used feature in modern processors that eliminates certain data hazards. Although complete bypassing is ideal for performance, bypassing has a significant impact on the cycle time, area, and power consumption of the processor. Because of the strict constraints on performance, cost, and power consumption in embedded processors, architects need to evaluate and implement incomplete register bypassing mechanisms. However, traditional data hazard detection and/or avoidance techniques used in retargetable schedulers break down in the presence of incomplete bypassing. In this paper, we present the concept of Operation Tables, which can be used to detect data hazards even in the presence of incomplete bypassing. Furthermore, our technique integrates the detection of both data and resource hazards, and can be easily employed in a compiler to generate better schedules. Our experimental results on the popular Intel XScale embedded processor platform show that, even with a simple intra-basic-block scheduling technique, we achieve up to 20% performance improvement over fully optimized GCC-generated code on embedded applications from the MiBench suite.
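
To make the problem concrete, the following Python sketch shows why a single fixed producer-to-consumer latency stops being valid once some bypasses are missing. It is an illustration only and is not the Operation Table formulation of the paper, which the abstract does not describe. The five-stage pipeline, the stage names, the function min_issue_distance, and the rule that a consumer reads its operand in stage D are all assumptions made for this example; stalls and resource conflicts are ignored.

```python
# Hypothetical 5-stage in-order pipeline (an assumption, not the XScale pipeline).
STAGES = ["F", "D", "X1", "X2", "WB"]


def min_issue_distance(produce_stage, read_stage, bypass_from):
    """Smallest number of cycles by which a dependent consumer must trail its producer.

    produce_stage: stage in which the producer computes its result.
    read_stage:    stage in which the consumer reads its operand.
    bypass_from:   set of stages that have a bypass path to read_stage.

    Simplified model: the producer issues in cycle 0 and advances one stage
    per cycle. The consumer can obtain the value in a given cycle if the
    producer has already finished write-back, or if the producer sits in a
    stage after produce_stage that has a bypass to the consumer.
    """
    produce_idx = STAGES.index(produce_stage)
    read_idx = STAGES.index(read_stage)
    for distance in range(1, len(STAGES) + 1):
        cycle = distance + read_idx  # cycle in which the consumer reads
        if cycle >= len(STAGES):
            # Producer has left the pipeline; the value is in the register file.
            return distance
        if cycle > produce_idx and STAGES[cycle] in bypass_from:
            # Producer is past its compute stage and this stage can forward.
            return distance
    raise AssertionError("unreachable: the register-file case always applies")


# Same producer/consumer pair under three bypass configurations.
complete = min_issue_distance("X1", "D", {"X2", "WB"})
incomplete = min_issue_distance("X1", "D", {"WB"})
no_bypass = min_issue_distance("X1", "D", set())
print(complete, incomplete, no_bypass)
```

In this toy model the required distance depends on which bypass paths exist, so a scheduler that hard-codes the latency of the fully bypassed case would emit a schedule that stalls (or is incorrect on hardware without interlocks) when a bypass is removed.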


