ABSTRACT
In this paper, we propose an efficient code overlay technique that automatically generates an overlay structure for a given memory size on multicores with explicitly-managed memory hierarchies. We observe that finding an efficient overlay structure with minimum memory-copying overhead is similar to the problem of finding a code placement with minimum conflict misses in an instruction cache. Our algorithm exploits temporal-ordering information between functions gathered during program execution. Experimental results on the Cell BE processor indicate that our approach is effective and promising.
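To make the analogy with cache-conscious code placement concrete, the following is a minimal sketch and not the algorithm of this paper. It assumes a recorded call trace, a fixed number of overlay regions, and a simple greedy heuristic; the names `temporal_weights`, `assign_regions`, `footprint`, and the `window` parameter are all hypothetical. Functions that execute close together in time receive a high pairwise weight, and the heuristic tries to keep such pairs out of the same region so that they do not repeatedly evict each other.

```python
from collections import defaultdict


def temporal_weights(trace, window=1):
    """Count how often two distinct functions run within `window` calls
    of each other in the trace (an assumed, simplified temporal metric)."""
    weights = defaultdict(int)
    for i, f in enumerate(trace):
        for g in set(trace[i + 1 : i + 1 + window]):
            if g != f:
                weights[frozenset((f, g))] += 1
    return weights


def assign_regions(functions, weights, num_regions):
    """Greedily place each function into the overlay region where it
    conflicts least with the functions already placed there."""
    regions = [[] for _ in range(num_regions)]

    def total(f):
        # Sum of all pairwise weights that involve f.
        return sum(w for pair, w in weights.items() if f in pair)

    # Place the most heavily interleaved functions first.
    for f in sorted(functions, key=lambda f: (-total(f), f)):
        def cost(region):
            return sum(weights.get(frozenset((f, g)), 0) for g in region)

        best = min(
            range(num_regions),
            key=lambda r: (cost(regions[r]), len(regions[r]), r),
        )
        regions[best].append(f)
    return regions


def footprint(regions, sizes):
    """Memory needed by the overlay structure: each region must be as
    large as the largest function assigned to it."""
    return sum(max((sizes[f] for f in region), default=0) for region in regions)
```

In this sketch the number of regions is an input; a fuller treatment would instead search for the structure whose footprint fits the given memory size, which is the setting the abstract describes.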