Virtual private caches

Published: 09 June 2007

Abstract

Virtual Private Machines (VPMs) provide a framework for Quality of Service (QoS) in CMP-based computer systems. VPMs incorporate microarchitecture mechanisms that allow shares of hardware resources to be allocated to executing threads, thus providing applications with an upper bound on execution time regardless of other thread activity. Virtual Private Caches (VPCs) are an important element of VPMs. VPC hardware consists of two major components: the VPC Arbiter, which manages shared cache bandwidth, and the VPC Capacity Manager, which manages the cache storage. Both the VPC Arbiter and the VPC Capacity Manager provide minimum service guarantees that, when combined, achieve QoS for the cache subsystem. Simulation-based evaluation shows that conventional cache bandwidth management policies allow concurrently executing threads to affect each other significantly and in an uncontrollable manner. The evaluation targets cache bandwidth because the effects of cache capacity sharing have been studied elsewhere. In contrast with the conventional policies, the VPC Arbiter meets its QoS performance objectives on all workloads studied and over a range of allocated bandwidth levels. The VPC Arbiter's fairness policy, which distributes leftover bandwidth, mitigates the effects of cache preemption latencies, thus ensuring threads a high degree of performance isolation. Furthermore, the VPC Arbiter eliminates negative bandwidth interference, which can improve aggregate throughput and resource utilization.
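
To make the idea of allocating bandwidth shares to threads concrete, the following is a minimal sketch of a share-based arbiter built on a generic fair-queuing rule (serve the pending request with the earliest virtual finish time). It is an illustration under stated assumptions, not the VPC Arbiter hardware described in the paper: the class `ShareArbiter`, the helper `simulate`, the thread names, and the share values are all hypothetical.

```python
from collections import deque


class ShareArbiter:
    """Toy fair-queuing arbiter for one shared resource (hypothetical sketch)."""

    def __init__(self, shares):
        # shares: thread id -> allocated fraction of bandwidth (assumed <= 1 in total)
        if sum(shares.values()) > 1.0 + 1e-9:
            raise ValueError("allocated shares exceed the available bandwidth")
        self.shares = dict(shares)
        self.queues = {t: deque() for t in shares}
        self.last_finish = {t: 0.0 for t in shares}

    def enqueue(self, thread, arrival, service_cycles):
        # A request starts, in virtual time, when it arrives or when the
        # thread's previous request finishes, whichever is later.
        start = max(arrival, self.last_finish[thread])
        # A smaller share stretches the virtual service time.
        finish = start + service_cycles / self.shares[thread]
        self.last_finish[thread] = finish
        self.queues[thread].append(finish)

    def grant(self):
        # Pick the head request with the earliest virtual finish time.
        # Threads with no pending requests are skipped, so unused
        # bandwidth goes to whoever is waiting.
        ready = [(q[0], t) for t, q in self.queues.items() if q]
        if not ready:
            return None
        _, thread = min(ready)
        self.queues[thread].popleft()
        return thread


def simulate(shares, requests_per_thread, service_cycles=1):
    """Enqueue all requests at time 0 and return the resulting grant order."""
    arbiter = ShareArbiter(shares)
    for thread, count in requests_per_thread.items():
        for _ in range(count):
            arbiter.enqueue(thread, arrival=0, service_cycles=service_cycles)
    order = []
    while (thread := arbiter.grant()) is not None:
        order.append(thread)
    return order


if __name__ == "__main__":
    print(simulate({"A": 0.5, "B": 0.25}, {"A": 4, "B": 4}))
```

In this sketch, while both threads are backlogged, thread A (share 0.5) is granted two slots for every slot granted to thread B (share 0.25); once A's queue drains, B receives every remaining slot, which loosely mirrors the idea of redistributing leftover bandwidth.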


Published In

ACM SIGARCH Computer Architecture News  Volume 35, Issue 2
May 2007
527 pages
ISSN:0163-5964
DOI:10.1145/1273440
  • ISCA '07: Proceedings of the 34th annual international symposium on Computer architecture
    June 2007
    542 pages
    ISBN:9781595937063
    DOI:10.1145/1250662
    General Chair: Dean Tullsen
    Program Chair: Brad Calder
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 09 June 2007
Published in SIGARCH Volume 35, Issue 2

Author Tags

  1. chip multiprocessor
  2. performance isolation
  3. quality of service
  4. shared caches
  5. soft real-time

Qualifiers

  • Article
