research-article

Inter-core cooperative TLB for chip multiprocessors

Published: 13 March 2010

Abstract

Translation Lookaside Buffers (TLBs) are commonly employed in modern processor designs and have considerable impact on overall system performance. A number of past works have studied TLB designs to lower access times and miss rates, specifically for uniprocessors. With the growing dominance of chip multiprocessors (CMPs), it is necessary to examine TLB performance in the context of parallel workloads.

This work is the first to present TLB prefetchers that exploit commonality in TLB miss patterns across cores in CMPs. We propose and evaluate two Inter-Core Cooperative (ICC) TLB prefetching mechanisms, assessing their effectiveness at eliminating TLB misses both individually and together. Our results show these approaches require at most modest hardware and can collectively eliminate 19% to 90% of data TLB (D-TLB) misses across the surveyed parallel workloads.
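As a rough illustration of the general idea of exploiting cross-core commonality in TLB misses, the toy model below lets a core that takes a D-TLB miss push the missing page into a small prefetch buffer on every other core. This is a minimal sketch under stated assumptions, not the mechanisms evaluated in the paper: the push-on-miss policy, the `Core` class, the `access` and `run` helpers, and the TLB and buffer sizes are all hypothetical.

```python
from collections import OrderedDict, deque


class Core:
    """One core with a small LRU TLB and a FIFO prefetch buffer (hypothetical sizes)."""

    def __init__(self, tlb_entries=4, buffer_entries=4):
        self.tlb = OrderedDict()                     # virtual page -> True, in LRU order
        self.tlb_entries = tlb_entries
        self.buffer = deque(maxlen=buffer_entries)   # pages pushed by other cores
        self.misses = 0                              # misses that still need a page walk
        self.eliminated = 0                          # misses served from the prefetch buffer

    def fill(self, page):
        """Install a translation, evicting the least recently used entry if full."""
        self.tlb[page] = True
        self.tlb.move_to_end(page)
        if len(self.tlb) > self.tlb_entries:
            self.tlb.popitem(last=False)


def access(cores, core_id, page):
    """Simulate one data access; returns 'hit', 'prefetch_hit', or 'miss'."""
    core = cores[core_id]
    if page in core.tlb:
        core.tlb.move_to_end(page)
        return "hit"
    if page in core.buffer:
        # Another core missed on this page earlier, so the miss is eliminated.
        core.buffer.remove(page)
        core.eliminated += 1
        core.fill(page)
        return "prefetch_hit"
    # True miss: fill locally, then push the page to every other core (assumed policy).
    core.misses += 1
    core.fill(page)
    for other_id, other in enumerate(cores):
        if other_id != core_id and page not in other.tlb and page not in other.buffer:
            other.buffer.append(page)
    return "miss"


def run(trace, n_cores=2):
    """Replay a list of (core_id, virtual_page) accesses and return cores and outcomes."""
    cores = [Core() for _ in range(n_cores)]
    outcomes = [access(cores, cid, page) for cid, page in trace]
    return cores, outcomes
```

Replaying a trace in which core 1 touches the same pages shortly after core 0, for example `run([(0, 10), (1, 10), (0, 11), (1, 11)])`, shows core 1's accesses being served from its prefetch buffer rather than counted as misses. Real workloads share pages far less regularly, which is consistent with the wide 19% to 90% range reported above.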

We also compare performance improvements across a range of hardware and software implementation possibilities. We find that a fully hardware implementation yields average performance improvements of 8-46% across a range of TLB sizes, while a hardware/software approach yields improvements of 4-32%. Overall, our work shows that TLB prefetchers exploiting inter-core correlations can effectively eliminate TLB misses.

Published in

ACM SIGARCH Computer Architecture News, Volume 38, Issue 1 (ASPLOS '10), March 2010, 399 pages. ISSN: 0163-5964. DOI: 10.1145/1735970

ASPLOS XV: Proceedings of the Fifteenth International Conference on Architectural Support for Programming Languages and Operating Systems, March 2010, 422 pages. ISBN: 9781605588391. DOI: 10.1145/1736020

General Chair: James C. Hoe. Program Chair: Vikram S. Adve.

              Copyright © 2010 ACM


Publisher: Association for Computing Machinery, New York, NY, United States
