DOI: 10.1145/1572769.1572792
research-article

Understanding the efficiency of ray traversal on GPUs

Published: 01 August 2009

ABSTRACT

We discuss the mapping of elementary ray tracing operations---acceleration structure traversal and primitive intersection---onto wide SIMD/SIMT machines. Our focus is on NVIDIA GPUs, but some of the observations should be valid for other wide machines as well. While several fast GPU tracing methods have been published, very little is actually understood about their performance. Nobody knows whether the methods are anywhere near the theoretically obtainable limits, and if not, what might be causing the discrepancy. We study this question by comparing the measurements against a simulator that tells the upper bound of performance for a given kernel. We observe that previously known methods are a factor of 1.5--2.5X off from theoretical optimum, and most of the gap is not explained by memory bandwidth, but rather by previously unidentified inefficiencies in hardware work distribution. We then propose a simple solution that significantly narrows the gap between simulation and measurement. This results in the fastest GPU ray tracer to date. We provide results for primary, ambient occlusion and diffuse interreflection rays.
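
To make the idea of an upper bound on SIMT performance concrete, the following toy sketch estimates lane utilization when rays with different traversal lengths share a warp. It is not the simulator described in the abstract. The function name `simd_utilization`, the per-ray step counts, and the cost model (a warp keeps issuing until its slowest ray finishes) are all illustrative assumptions.

```python
from typing import List

WARP_SIZE = 32  # lanes per warp on NVIDIA GPUs


def simd_utilization(steps_per_ray: List[int], warp_size: int = WARP_SIZE) -> float:
    """Return the fraction of issued lane-steps that do useful work.

    Assumed toy model: rays are assigned to warps in order, and each warp
    keeps issuing instructions until its longest-running ray is done, so
    lanes holding shorter rays sit idle for the remainder.
    """
    useful = 0  # lane-steps that advance some ray
    issued = 0  # lane-steps the hardware spends, idle lanes included
    for start in range(0, len(steps_per_ray), warp_size):
        warp = steps_per_ray[start:start + warp_size]
        useful += sum(warp)
        # A partially filled warp still occupies all lanes.
        issued += max(warp) * warp_size
    return useful / issued if issued else 1.0


if __name__ == "__main__":
    coherent = [10] * 32                # every ray takes the same number of steps
    divergent = [10] * 16 + [30] * 16   # half the rays take three times longer
    print(simd_utilization(coherent))
    print(simd_utilization(divergent))
```

Under this model a perfectly coherent warp is fully utilized, while the divergent warp wastes a third of its lane-steps. A real bound would also have to account for memory behavior and for how work is distributed to warps, which this sketch ignores.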


Published in

HPG '09: Proceedings of the Conference on High Performance Graphics 2009
August 2009
185 pages
ISBN: 9781605586038
DOI: 10.1145/1572769

Copyright © 2009 ACM


Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

Overall Acceptance Rate: 15 of 44 submissions, 34%
