research-article

MergeTree: A Fast Hardware HLBVH Constructor for Animated Ray Tracing

Authors:
Timo Viitanen

Tampere University of Technology, Finland

Tampere University of Technology, Finland

0000-0003-1082-9587
View Profile

,
Matias Koskela

Tampere University of Technology, Finland

Tampere University of Technology, Finland
View Profile

,
Pekka Jääskeläinen

Tampere University of Technology, Finland

Tampere University of Technology, Finland
View Profile

,
Heikki Kultala

Tampere University of Technology, Finland

Tampere University of Technology, Finland
View Profile

,
Jarmo Takala

Tampere University of Technology, Finland

Tampere University of Technology, Finland
View Profile

Authors Info & Claims

ACM Transactions on Graphics Volume 36 Issue 5Article No.: 169pp 1–14https://doi.org/10.1145/3132702

Published:11 October 2017Publication History

ACM Transactions on Graphics

Abstract

Ray tracing is a computationally intensive rendering technique traditionally used in offline high-quality rendering. Powerful hardware accelerators have been recently developed that put real-time ray tracing even in the reach of mobile devices. However, rendering animated scenes remains difficult, as updating the acceleration trees for each frame is a memory-intensive process. This article proposes MergeTree, the first hardware architecture for Hierarchical Linear Bounding Volume Hierarchy (HLBVH) construction, designed to minimize memory traffic. For evaluation, the hardware constructor is synthesized on a 28nm process technology. Compared to a state-of-the-art binned surface area heuristic sweep (SAH) builder, the present work speeds up construction by a factor of 5, reduces build energy by a factor of 3.2, and memory traffic by a factor of 3. A software HLBVH builder on a graphics processing unit (GPU) requires 3.3 times more memory traffic. To take tree quality into account, a rendering accelerator is modeled alongside the builder. Given the use of a toplevel build to improve tree quality, the proposed builder reduces system energy per frame by an average 41% with primary rays and 13% with diffuse rays. In large ( > 500K triangles) scenes, the difference is more pronounced, 62% and 35%, respectively.

Supplemental Material

tog36-5-a169-viitanen.mp4

mp4

242.6 MB

Download

References

Attila T. Áfra and László Szirmay-Kalos. 2014. Stackless multi-BVH Traversal for CPU, MIC and GPU ray tracing. Comput. Graph. Forum 33, 1 (2014), 129--140. Google ScholarDigital Library
Alok Aggarwal and Jeffrey Scott Vitter. 1988. The input/output complexity of sorting and related problems. Commun. ACM 31, 9 (1988), 1116--1127. Google ScholarDigital Library
Timo Aila and Tero Karras. 2010. Architecture considerations for tracing incoherent rays. In Proceedings of High Performance Graphics. 113--122.Google Scholar
Timo Aila and Samuli Laine. 2009. Understanding the efficiency of ray traversal on GPUs. In Proceedings of High Performance Graphics. 145--149. Google ScholarDigital Library
Tomas Akenine-Möller and Jacob Strom. 2008. Graphics processing units for handhelds. Proc. IEEE 96, 5 (2008), 779--789. Google ScholarCross Ref
Ciprian Apetrei. 2014. Fast and simple agglomerative LBVH construction. In Proceedings of the Computer Graphics and Visual Computing Conference (CGVC’14).Google Scholar
Rakesh D. Barve, Edward F. Grove, and Jeffrey Scott Vitter. 1997. Simple randomized mergesort on parallel disks. Parallel Comput. 23, 4 (1997), 601--631. Google ScholarDigital Library
Ranjita Bhagwan and Bill Lin. 2000. Fast and scalable priority queue architecture for high-speed network switches. In Proceedings of the Annual Joint Conference of the IEEE Computer and Communications Societies, Vol. 2. 538--547. Google ScholarCross Ref
Jared Casper and Kunle Olukotun. 2014. Hardware acceleration of database operations. In Proceedings of the ACM/SIGDA International Symposium on Field-Programmable Gate Arrays. 151--160. Google ScholarDigital Library
Karthik Chandrasekar, Christian Weis, Yonghui Li, Benny Akesson, Norbert Wehn, and Kees Goossens. 2012. DRAMPower: Open-source DRAM power 8 energy estimation tool. Retrieved February 30, 2017 from http://www.drampower.info.Google Scholar
Erwin Coumans. 2017. Bullet physics library. Retrieved March 6, 2017 from http://www.bulletphysics.org.Google Scholar
Leonardo R. Domingues and Helio Pedrini. 2015. Bounding volume hierarchy optimization through agglomerative treelet restructuring. In Proceedings of High Performance Graphics. 13--20. Google ScholarDigital Library
Michael Doyle, Colin Fowler, and Michael Manzke. 2013. A hardware unit for fast SAH-optimized BVH construction. ACM Trans. Graph. 32, 4 (2013), 139:1--10.Google ScholarDigital Library
Michael Doyle, Ciaran Tuohy, and Michael Manzke. 2017. Evaluation of a BVH construction accelerator architecture for high-quality visualization. IEEE Trans. Multi-Scale Comput. Syst. Early access. Retrieved from http://ieeexplore.ieee.org/abstract/document/7903616/.Google Scholar
Sameh Galal and Mark Horowitz. 2011. Energy-efficient floating-point unit design. IEEE Trans. on Comput. 60, 7 (2011), 913--922. Google ScholarDigital Library
Per Ganestam and Michael Doggett. 2016. SAH guided spatial split partitioning for fast BVH construction. Comput. Graph. Forum 35, 2 (2016), 285--293. Google ScholarDigital Library
Kirill Garanzha, Jacopo Pantaleoni, and David McAllister. 2011a. Simpler and faster HLBVH with work queues. In Proceedings of High Performance Graphics. 59--64. Google ScholarDigital Library
Kirill Garanzha, Simon Premože, Alexander Bely, and Vladimir Galaktionov. 2011b. Grid-based SAH BVH construction on a GPU. Vis. Comput. 27, 6--8 (2011), 697--706.Google ScholarDigital Library
Jeffrey Goldsmith and John Salmon. 1987. Automatic creation of object hierarchies for ray tracing. IEEE Comput. Graph. Appl. 7, 5 (1987), 14--20. Google ScholarDigital Library
Aggelos Ioannou and Manolis G. H. Katevenis. 2007. Pipelined heap (priority queue) management for advanced scheduling in high-speed networks. IEEE/ACM Trans. Netw. 15, 2 (2007), 450--461. Google ScholarDigital Library
Ilan Kadar and Ohad Ben-Shahar. 2013. SceneNet: A perceptual ontology database for scene understanding. J. Vis. 13, 9 (2013), 1310--1310. Google ScholarCross Ref
Tero Karras. 2012. Maximizing parallelism in the construction of BVHs, octrees, and k-d trees. In Proceedings of High Performance Graphics. 33--37.Google Scholar
Tero Karras and Timo Aila. 2013. Fast parallel construction of high-quality bounding volume hierarchies. In Proceedings of High Performance Graphics. 89--99. Google ScholarDigital Library
Stephen W. Keckler, William J. Dally, Brucek Khailany, Michael Garland, and David Glasco. 2011. GPUs and the future of parallel computing. IEEE Micro 31, 5 (2011), 7--17. Google ScholarDigital Library
Sean Keely. 2014. Reduced precision hardware for ray tracing. In Proceedings of High Performance Graphics. 29--40.Google ScholarDigital Library
Yoongu Kim, Weikun Yang, and Onur Mutlu. 2015. Ramulator: A fast and extensible DRAM simulator. IEEE Comput. Arch. Lett. PP, 99 (2015), 1--1.Google Scholar
Dirk Koch and Jim Torresen. 2011. FPGASort: A high performance sorting architecture exploiting run-time reconfiguration on FPGAs for large problem sorting. In Proceedings of the ACM/SIGDA International Symposium on Field Programmable Gate Arrays. 45--54. Google ScholarDigital Library
Daniel Kopta, Konstantin Shkurko, J. Spjut, Erik Brunvand, and Al Davis. 2015. Memory considerations for low energy ray tracing. Comput. Graph. Forum 34, 1 (2015), 47--59. Google ScholarDigital Library
Samuli Laine, Tero Karras, and Timo Aila. 2013. Megakernels considered harmful: Wavefront path tracing on GPUs. In Proceedings of High Performance Graphics. 137--143.Google ScholarDigital Library
Christian Lauterbach, Michael Garland, Shubhabrata Sengupta, David Luebke, and Dinesh Manocha. 2009. Fast BVH construction on GPUs. Comput. Graph. Forum 28, 2 (2009), 375--384. Google ScholarCross Ref
Jaedon Lee, Won-Jong Lee, Youngsam Shin, Seokjoong Hwang, Soojung Ryu, and Jeongwook Kim. 2014. Two-AABB traversal for mobile real-time ray tracing. In Proceedings of the SIGGRAPH Asia Symposium on Mobile Graphics and Interactive Applications 14. Google ScholarDigital Library
Won-Jong Lee, Youngsam Shin, Jaedon Lee, Jin-Woo Kim, Jae-Ho Nah, Seokyoon Jung, Shihwa Lee, Hyun-Sang Park, and Tack-Don Han. 2013. SGRT: A mobile GPU architecture for real-time ray tracing. In Proceedings of High Performance Graphics. 109--119. Google ScholarDigital Library
Wei Liu and Alberto Nannarelli. 2012. Power efficient division and square root unit. IEEE Trans. Comput. 61, 8 (2012), 1059--1070. Google ScholarDigital Library
Xingyu Liu, Yangdong Deng, Yufei Ni, and Zonghui Li. 2015. FastTree: A hardware KD-tree construction acceleration engine for real-time ray tracing. In Proceedings of the Design, Automation 8 Test in Europe Conference 8 Exhibition. 1595--1598.Google ScholarCross Ref
Jan Lucas, Sohan Lal, Michael Andersch, Mauricio Alvarez-Mesa, and Ben Juurlink. 2013. How a single chip causes massive power bills GPUSimPow: A GPGPU power simulator. In Proceedings of the IEEE International Symposium on Performance Analysis and System Software. 97--106. Google ScholarCross Ref
J. David MacDonald and Kellogg S. Booth. 1990. Heuristics for ray tracing using space subdivision. Vis. Comput. 6, 3 (1990), 153--166. Google ScholarDigital Library
Morgan McGuire. 2011. Computer graphics archive. Retrieved Feb 30, 2017 from http://graphics.cs.williams.edu/data/meshes.xml.Google Scholar
Naveen Muralimanohar, Rajeev Balasubramonian, and Norman P. Jouppi. 2009. CACTI 6.0: A Tool to Model Large Caches. Technical Report. HP Laboratories. 22--31 pages.Google Scholar
J. Nah, H. Kwon, D. Kim, C. Jeong, J. Park, T. Han, D. Manocha, and W. Park. 2014. RayCore: A ray-tracing hardware architecture for mobile devices. ACM Trans. Graph. 33, 5 (2014), 162:1--15.Google ScholarDigital Library
Jae-Ho Nah, Jin-Woo Kim, Junho Park, Won-Jong Lee, Jeong-Soo Park, Seok-Yoon Jung, Woo-Chan Park, Dinesh Manocha, and Tack-Don Han. 2015. HART: A hybrid architecture for ray tracing animated scenes. IEEE Trans. Vis. Comput. Graph. 21, 3 (2015), 389--401. Google ScholarDigital Library
Jae-Ho Nah, Jeong-Soo Park, Chanmin Park, Jin-Woo Kim, Yun-Hye Jung, Woo-Chan Park, and Tack-Don Han. 2011. T8I engine: Traversal and intersection engine for hardware accelerated ray tracing. ACM Trans. Graph. 30, 6 (2011), 160.Google ScholarDigital Library
Jacopo Pantaleoni and David Luebke. 2010. HLBVH: Hierarchical LBVH construction for real-time ray tracing of dynamic geometry. In Proceedings of High Performance Graphics. 87--95.Google Scholar
Anuj Pathania, Alexandru Eugen Irimiea, Alok Prakash, and Tulika Mitra. 2015. Power-performance modelling of mobile gaming workloads on heterogeneous MPSoCs. In Proceedings of the Design Automation Conference. 201.Google ScholarDigital Library
PowerVR. 2015. PowerVR Ray Tracing. Retrieved Feb 30, 2017 from https://imgtec.com/powervr/ray-tracing/.Google Scholar
Maxim Shevtsov, Alexei Soupikov, Alexander Kapustin, and Nizhniy Novorod. 2007. Ray-triangle intersection algorithm for modern CPU architectures. In Proceedings of GraphiCon, Vol. 2007. 33--39.Google Scholar
Hojun Shim, Nachyuck Chang, and Massoud Pedram. 2004. A compressed frame buffer to reduce display power consumption in mobile systems. In Proceedings of the Asia and South Pacific Design Automation Conference. 819--824.Google Scholar
Josef Spjut, Andrew Kensler, Daniel Kopta, and Erik Brunvand. 2009. TRaX: A multicore hardware architecture for real-time ray tracing. Trans. Comput.-Aid. Des. Integr. Circ. Syst. 28, 12 (2009), 1802--1815. Google ScholarDigital Library
Joseph Spjut, Daniel Kopta, Erik Brunvand, and Al Davis. 2012. A mobile accelerator architecture for ray tracing. In Proceedings of the Workshop on SoCs, Heterogeneous Architectures and Workloads.Google Scholar
Timo Viitanen, Matias Koskela, Pekka Jääskeläinen, Heikki Kultala, and Jarmo Takala. 2015. MergeTree: A HLBVH constructor for mobile systems. In Proceedings of SIGGRAPH Asia, Technical Briefs. 12. Google ScholarDigital Library
Timo Viitanen, Matias Koskela, Pekka Jääskeläinen, and Jarmo Takala. 2016. Multi bounding volume hierarchies for ray tracing pipelines. In Proceedings of SIGGRAPH Asia, Technical Briefs. 8. Google ScholarDigital Library
Ingo Wald. 2007. On fast construction of SAH-based bounding volume hierarchies. In Proceedings of the IEEE Symposium on Interactive Ray Tracing. 33--40. Google ScholarDigital Library
Ingo Wald, Solomon Boulos, and Peter Shirley. 2007. Ray tracing deformable scenes using dynamic bounding volume hierarchies. ACM Trans. Graph. 26, 1 (2007), 6.Google ScholarDigital Library
Sven Woop, Jörg Schmittler, and Philipp Slusallek. 2005. RPU: A programmable ray processing unit for realtime ray tracing. ACM Trans. Graph. 24, 3 (2005), 434--444. Google ScholarDigital Library

Index Terms

MergeTree: A Fast Hardware HLBVH Constructor for Animated Ray Tracing
1. Computing methodologies
  1. Computer graphics
    1. Graphics systems and interfaces
      1. Graphics processors
    2. Rendering
      1. Ray tracing

Recommendations

RayCore: A Ray-Tracing Hardware Architecture for Mobile Devices

We present RayCore, a mobile ray-tracing hardware architecture. RayCore facilitates high-quality rendering effects, such as reflection, refraction, and shadows, on mobile devices by performing real-time Whitted ray tracing. RayCore consists of two major ...
Read More
A hardware unit for fast SAH-optimised BVH construction

Ray-tracing algorithms are known for producing highly realistic images, but at a significant computational cost. For this reason, a large body of research exists on various techniques for accelerating these costly algorithms. One approach to achieving ...
Read More
PLOCTree: A Fast, High-Quality Hardware BVH Builder

In the near future, GPUs are expected to have hardware support for real-time ray tracing in order to, e.g., help render complex lighting effects in video games and enable photorealistic augmented reality. One challenge in real-time ray tracing is ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

Published in
ACM Transactions on Graphics Volume 36, Issue 5
October 2017
161 pages
ISSN:0730-0301
EISSN:1557-7368
DOI:10.1145/3127587
Editor:
Kavita Bala
Cornell University
Issue’s Table of Contents
Copyright © 2017 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 11 October 2017
- Accepted: 1 June 2017
- Received: 1 March 2017
Published in tog Volume 36, Issue 5

Permissions
Request permissions about this article.
Request Permissions

Check for updates
Author Tags
BVH
HLBVH
Ray tracing
bounding volume hierarchy
ray-tracing hardware
Qualifiers
- research-article
- Research
- Refereed
Conference
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 11
  Total Citations
  View Citations
- 475
  Total Downloads
- Downloads (Last 12 months)53
- Downloads (Last 6 weeks)1
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

MergeTree: A Fast Hardware HLBVH Constructor for Animated Ray Tracing

ACM Transactions on Graphics

Abstract

Supplemental Material

References

Cited By

Index Terms

Recommendations

RayCore: A Ray-Tracing Hardware Architecture for Mobile Devices

A hardware unit for fast SAH-optimised BVH construction

PLOCTree: A Fast, High-Quality Hardware BVH Builder

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Funding Sources

Other Metrics

Article Metrics

Other Metrics

Cited By

PDF Format

eReader

Digital Edition

Caption

MergeTree: A Fast Hardware HLBVH Constructor for Animated Ray Tracing

ACM Transactions on Graphics

Abstract

Supplemental Material

References

Cited By

Index Terms

Recommendations

RayCore: A Ray-Tracing Hardware Architecture for Mobile Devices

A hardware unit for fast SAH-optimised BVH construction

PLOCTree: A Fast, High-Quality Hardware BVH Builder

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Funding Sources

Article Metrics

Other Metrics

PDF Format

eReader

Digital Edition

Share this Publication link

Share on Social Media