Abstract
Thread-level speculation is a technique that enables parallel execution of sequential applications on a multiprocessor. This paper describes the complete implementation of the support for thread-level speculation on the Hydra chip multiprocessor (CMP). The support consists of a number of software speculation control handlers and modifications to the shared secondary cache memory system of the CMP. This support is evaluated using five representative integer applications. Our results show that the speculative support is only able to improve performance when there is a substantial amount of medium-grained loop-level parallelism in the application. When the granularity of parallelism is too small, or there is little inherent parallelism in the application, the overhead of the software handlers overwhelms any potential performance benefits from speculative-thread parallelism. Overall, thread-level speculation still appears to be a promising approach for expanding the class of applications that can be automatically parallelized, but more hardware-intensive implementations for managing speculation control are required to achieve performance improvements on a wide class of integer applications.
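The mechanism the abstract summarizes — running later loop iterations optimistically and squashing any iteration that read a value a logically earlier iteration had not yet written — can be illustrated with a small sequential simulation. This is a conceptual sketch only, not the Hydra hardware or handler design; `speculative_loop`, `make_iter`, and the dictionary memory model are invented here for illustration.

```python
def speculative_loop(iterations, memory):
    """Run loop bodies optimistically against the pre-loop memory image,
    then commit them in original program order, squashing and rerunning
    any iteration that read a location an earlier iteration wrote."""
    start = dict(memory)
    # Optimistic pass: every iteration sees the pre-loop memory, as if
    # all iterations were issued at once on separate processors.
    # Each body returns (read_set, write_dict).
    logs = [(body,) + body(dict(start)) for body in iterations]
    state = dict(memory)
    earlier_writes = set()
    restarts = 0
    for body, reads, writes in logs:
        if reads & earlier_writes:             # true dependence violation
            restarts += 1
            reads, writes = body(dict(state))  # squash and rerun in order
        state.update(writes)                   # commit this iteration
        earlier_writes |= set(writes)
    return state, restarts

def make_iter(i):
    """One loop body with a loop-carried dependence through 'sum'."""
    def body(mem):
        return {'sum'}, {'sum': mem['sum'] + i}
    return body
```

With the accumulator example above, iterations 2 and 3 each read `sum` before an earlier iteration's write had committed, so both are squashed and rerun, yet the final state still matches sequential execution — the speculation preserves sequential semantics while exposing the iterations that are genuinely independent.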
Published in ASPLOS VIII: Proceedings of the Eighth International Conference on Architectural Support for Programming Languages and Operating Systems.