ABSTRACT
In this paper, we describe a new method of instruction prefetching that reduces the cache miss penalty by anticipating cache behavior based on previous execution. Our observations indicate that instruction cache misses often repeat in clusters under conditions prevalent in real-time embedded networking systems. By identifying the start of a cluster miss sequence and preparing an instruction buffer for the upcoming cache misses, the method reduces the penalty of each miss that does occur. An industrial networking example illustrates the effectiveness of this technique compared with other prefetch methods.
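The idea of recognizing the start of a repeating miss cluster and staging the following lines in a buffer can be illustrated with a small software model. This is only a sketch under stated assumptions, not the hardware described in the paper: the abstract does not say how a cluster is delimited, so the cycle-gap threshold `max_gap`, the per-cluster limit `max_len`, and all class and method names here are hypothetical.

```python
class ClusterMissPredictor:
    """Toy model of cluster miss prediction (illustrative assumptions only)."""

    def __init__(self, max_gap=20, max_len=8):
        self.max_gap = max_gap    # assumed: max cycles between misses of one cluster
        self.max_len = max_len    # assumed: max follower lines remembered per cluster
        self.history = {}         # cluster start line -> follower miss lines seen before
        self.buffer = set()       # lines currently staged in the prefetch buffer
        self.start = None         # start line of the cluster in progress
        self.last_cycle = None    # cycle of the most recent miss

    def on_miss(self, line, cycle):
        """Handle one instruction cache miss.

        Returns True if the miss is covered by the prefetch buffer
        (reduced penalty), False if it pays the full miss penalty.
        """
        in_cluster = (
            self.last_cycle is not None
            and cycle - self.last_cycle <= self.max_gap
        )
        self.last_cycle = cycle

        if in_cluster:
            covered = line in self.buffer
            followers = self.history[self.start]
            # Learn this line as a follower of the current cluster start.
            if line not in followers and len(followers) < self.max_len:
                followers.append(line)
            return covered

        # A miss after a long quiet period is treated as a cluster start:
        # stage the followers recorded the last time this start was seen.
        self.start = line
        self.buffer = set(self.history.get(line, []))
        self.history.setdefault(line, [])
        return False


# Hypothetical trace of (cycle, cache line) misses: the same cluster occurs twice.
trace = [(0, 100), (5, 101), (9, 102), (500, 100), (505, 101), (509, 102)]
predictor = ClusterMissPredictor()
covered = [predictor.on_miss(line, cycle) for cycle, line in trace]
```

In this trace the first occurrence of the cluster is only recorded, while on the second occurrence the misses that follow the cluster start are found in the buffer. The miss at the cluster start itself is never covered in this model, which matches the abstract's framing that the buffer is prepared once the start of a cluster has been identified.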
Index Terms
Cluster miss prediction for instruction caches in embedded networking applications