DOI: 10.1145/1134760.1134769
Article

Vector LLVA: a virtual vector instruction set for media processing

Published: 14 June 2006

ABSTRACT

We present Vector LLVA, a virtual instruction set architecture (V-ISA) that exposes extensive static information about vector parallelism while avoiding the use of hardware-specific parameters. We provide both arbitrary-length vectors (for targets that allow vectors of arbitrary length, or where the target length is not known) and fixed-length vectors (for targets that have a fixed vector length, such as subword SIMD extensions), together with a rich set of operations on both vector types. We have implemented translators that compile (1) Vector LLVA written with arbitrary-length vectors to the Motorola RSVP architecture and (2) Vector LLVA written with fixed-length vectors to both AltiVec and Intel SSE2. Our translator-generated code achieves speedups competitive with handwritten native code versions of several benchmarks on all three architectures. These experiments show that our V-ISA design captures vector parallelism for two quite different classes of architectures and provides virtual object code portability within the class of subword SIMD architectures.
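
The distinction between the two vector kinds can be illustrated with a minimal sketch in plain C. This is not Vector LLVA syntax and not the translators' method; the names `vec4i`, `vec4i_load`, `vec4i_store`, `vec4i_add`, `vec_add`, and the width `FIXED_LEN` are hypothetical, chosen only for illustration. The fixed-length form bakes the lane count into the type, while the arbitrary-length form takes the length as a run-time value. Strip-mining the arbitrary-length operation into fixed-width blocks plus a scalar tail is shown as one possible lowering and is an assumption, not something the abstract describes.

```c
#include <stddef.h>

/* Hypothetical fixed vector width: four 32-bit lanes, as in a 128-bit
   subword SIMD register. Not a parameter taken from the paper. */
#define FIXED_LEN 4

/* Fixed-length style: the lane count is part of the type. */
typedef struct { int lane[FIXED_LEN]; } vec4i;

static vec4i vec4i_load(const int *p) {
    vec4i v;
    for (size_t i = 0; i < FIXED_LEN; ++i) v.lane[i] = p[i];
    return v;
}

static void vec4i_store(int *p, vec4i v) {
    for (size_t i = 0; i < FIXED_LEN; ++i) p[i] = v.lane[i];
}

static vec4i vec4i_add(vec4i a, vec4i b) {
    vec4i r;
    for (size_t i = 0; i < FIXED_LEN; ++i) r.lane[i] = a.lane[i] + b.lane[i];
    return r;
}

/* Arbitrary-length style: n is known only at run time. Assumed lowering
   for a fixed-width target: process whole blocks of FIXED_LEN elements,
   then finish the remainder one element at a time. */
static void vec_add(const int *a, const int *b, int *out, size_t n) {
    size_t i = 0;
    for (; i + FIXED_LEN <= n; i += FIXED_LEN)
        vec4i_store(out + i, vec4i_add(vec4i_load(a + i), vec4i_load(b + i)));
    for (; i < n; ++i)
        out[i] = a[i] + b[i];
}
```

In this sketch the caller of `vec_add` never mentions a hardware width, which is the property the arbitrary-length form is meant to convey; only the lowering step refers to `FIXED_LEN`.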

      Published in

      VEE '06: Proceedings of the 2nd international conference on Virtual execution environments
      June 2006
      194 pages
      ISBN: 1595933328
      DOI: 10.1145/1134760

      Copyright © 2006 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Acceptance Rates

      Overall Acceptance Rate: 80 of 235 submissions, 34%
