Frequent loop detection using efficient non-intrusive on-chip hardware
International Conference on Compilers, Architecture and Synthesis for Embedded Systems
Proceedings of the 2003 international conference on Compilers, architecture and synthesis for embedded systems
San Jose, California, USA
SESSION: Microprocessor architecture
Pages: 117 - 124
Year of Publication: 2003
ISBN: 1-58113-676-5
ABSTRACT
Dynamic software optimization methods are becoming increasingly popular for improving software performance and power. The first step in dynamic optimization is detecting frequently executed code, or "critical regions." Previous critical region detectors have targeted desktop processors. We introduce a critical region detector targeted to embedded processors, with the unique features of being highly size- and power-efficient and completely non-intrusive to the software's execution, features needed in timing-sensitive embedded systems. Our detector not only finds the critical regions but also determines their relative frequencies, a potentially important feature for selecting among alternative dynamic optimization methods. Our detector uses a tiny cache coupled with a small amount of logic. We provide results of extensive explorations across seventeen embedded system benchmarks. We show that highly accurate results can be achieved with only a 0.02% power overhead and acceptable size overhead. Our detector is currently being used as part of a dynamic hardware/software partitioning approach, but is applicable to a wide variety of situations.
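To make the idea of "a tiny cache coupled with a small amount of logic" concrete, the following is a minimal software sketch of one way such a detector could behave. It is not the paper's design: the class name `LoopDetector`, the direct-mapped organization, the parameters `entries`, `counter_bits`, and `max_offset`, and the halve-on-saturation policy are all assumptions chosen for illustration. The sketch treats each taken short backward branch as one loop iteration, counts it in a small cache, and halves every counter when one saturates so that relative frequencies are roughly preserved.

```python
from dataclasses import dataclass, field


@dataclass
class LoopDetector:
    """Hypothetical software model of a tiny direct-mapped cache of loop counters."""
    entries: int = 16        # number of cache lines (assumed value)
    counter_bits: int = 8    # width of each saturating counter (assumed value)
    max_offset: int = 256    # largest backward jump, in bytes, treated as a loop (assumed)
    tags: list = field(init=False)
    counts: list = field(init=False)

    def __post_init__(self):
        self.tags = [None] * self.entries
        self.counts = [0] * self.entries

    def observe(self, branch_addr: int, target_addr: int, taken: bool) -> None:
        """Snoop one branch; only taken short backward branches are counted."""
        distance = branch_addr - target_addr
        if not taken or distance <= 0 or distance > self.max_offset:
            return
        index = (branch_addr >> 2) % self.entries  # assumes 4-byte instructions
        if self.tags[index] != branch_addr:
            # Miss: replace the line and start counting the new loop.
            self.tags[index] = branch_addr
            self.counts[index] = 1
            return
        self.counts[index] += 1
        if self.counts[index] >= (1 << self.counter_bits) - 1:
            # Saturation: halve every counter so the ratios are roughly kept.
            self.counts = [c >> 1 for c in self.counts]

    def hottest(self):
        """Return (branch_addr, count) pairs, most frequent first."""
        pairs = [(t, c) for t, c in zip(self.tags, self.counts) if t is not None]
        return sorted(pairs, key=lambda p: p[1], reverse=True)
```

In this model the detector only watches branch addresses and never alters the running program, which mirrors the non-intrusive property the abstract emphasizes. Calling `observe` for each branch seen on the instruction bus and then reading `hottest()` gives a ranked list of candidate loops, with the counter values serving as approximate relative frequencies.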