A case for shared instruction cache on chip multiprocessors running OLTP
Source ACM SIGARCH Computer Architecture News archive
Volume 32, Issue 3 (June 2004)
Special issue: MEDEA-2003 workshop
Pages: 11-18
Year of Publication: 2004
ISSN: 0163-5964
Authors
Partha Kundu  Intel Corporation
Murali Annavaram  Intel Corporation
Trung Diep  Intel Corporation
John Shen  Intel Corporation
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1024295.1024297

ABSTRACT

Due to their large code footprint, OLTP workloads suffer from significant I-cache miss rates on contemporary microprocessors. This paper analyzes the I-stream behavior of an OLTP workload, the Oracle Database Benchmark (ODB), on Chip-Multiprocessors (CMP). Our results show that, although the overall code footprint of ODB is large, multiple ODB threads running concurrently on multiple processors tend to access common code segments frequently, thus exhibiting significant constructive sharing. In fact, in a CMP system, an I-cache shared among multiple processors incurs a miss rate similar to that of a dedicated I-cache per processor in which each per-processor I-cache has the same capacity as the shared I-cache. Based on these observations, this paper makes the case for a shared I-cache organization in a CMP, instead of the traditional approach of using a dedicated I-cache per processor.

Furthermore, this paper shows that the OLTP code stream exhibits good spatial locality. Adding a simple dedicated Line Buffer per processor exploits this spatial locality effectively, reducing the latency and bandwidth requirements on the shared cache. The proposed shared I-cache organization results in an improvement of at least 5X in miss rate over a dedicated cache organization, for the same total capacity.
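
The comparison described in the abstract can be illustrated with a toy simulation. The sketch below is not the authors' simulator or methodology, and it does not use ODB traces. It assumes a fully associative LRU cache, a 64-byte line holding sixteen 4-byte instructions, and a synthetic trace in which every thread loops over the same code region at a different phase offset. All names (`LRUCache`, `make_trace`, `simulate`) and all parameter values are hypothetical, chosen only to show how constructive sharing and a per-processor line buffer could be measured.

```python
from collections import OrderedDict

LINE_BYTES = 64        # assumed cache line size
INSTR_BYTES = 4        # assumed fixed instruction size
INSTRS_PER_LINE = LINE_BYTES // INSTR_BYTES


class LRUCache:
    """Fully associative LRU cache of line addresses (a simplification)."""

    def __init__(self, num_lines):
        self.num_lines = num_lines
        self.lines = OrderedDict()

    def access(self, line):
        """Return True on a hit, False on a miss (the line is then filled)."""
        if line in self.lines:
            self.lines.move_to_end(line)      # mark as most recently used
            return True
        self.lines[line] = True
        if len(self.lines) > self.num_lines:
            self.lines.popitem(last=False)    # evict least recently used
        return False


def make_trace(thread_id, footprint_lines, length):
    """Synthetic byte-address trace: sequential fetch through a shared code
    region, with each thread starting at a different (arbitrary) offset."""
    start = thread_id * 37
    return [((start + i // INSTRS_PER_LINE) % footprint_lines) * LINE_BYTES
            + (i % INSTRS_PER_LINE) * INSTR_BYTES
            for i in range(length)]


def simulate(shared, num_procs=4, total_lines=256, footprint_lines=200,
             laps=3, use_line_buffer=False):
    """Compare one shared cache against private caches of equal total size."""
    length = footprint_lines * INSTRS_PER_LINE * laps
    if shared:
        caches = [LRUCache(total_lines)] * num_procs   # same object for all
    else:
        caches = [LRUCache(total_lines // num_procs) for _ in range(num_procs)]
    traces = [make_trace(t, footprint_lines, length) for t in range(num_procs)]
    line_buffers = [None] * num_procs   # one single-line buffer per processor
    fetches = cache_accesses = misses = 0
    for step in range(length):          # round-robin interleaving of threads
        for p in range(num_procs):
            line = traces[p][step] // LINE_BYTES
            fetches += 1
            if use_line_buffer and line_buffers[p] == line:
                continue                # served by the line buffer
            line_buffers[p] = line
            cache_accesses += 1
            if not caches[p].access(line):
                misses += 1
    return {"fetches": fetches, "cache_accesses": cache_accesses,
            "misses": misses, "miss_rate": misses / fetches}


if __name__ == "__main__":
    print("private:", simulate(shared=False))
    print("shared: ", simulate(shared=True))
    print("shared + line buffer:", simulate(shared=True, use_line_buffer=True))
```

With these deliberately favorable parameters the common code region fits in the shared cache but not in any single private cache, so the shared organization sees only compulsory misses while each private cache thrashes. The line buffer leaves the miss count unchanged and only reduces the number of accesses that reach the cache. Real OLTP traces would show a less extreme gap than this synthetic loop.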


