DOI: 10.1145/3626495.3626500
Research Article · Open Access

LFSphereNet: Real Time Spherical Light Field Reconstruction from a Single Omnidirectional Image

Published: 30 November 2023

ABSTRACT

Recent developments in immersive imaging technologies have enabled improved telepresence applications. Being commercially mature, omnidirectional (360-degree) content provides full vision around the camera with three degrees of freedom (3DoF). With real-time immersive telepresence applications in mind, this paper investigates how a single omnidirectional image (ODI) can be used to extend 3DoF to 6DoF. To achieve this, we propose a fully learning-based method for spherical light field reconstruction from a single ODI. The proposed LFSphereNet utilizes two networks: the first learns to reconstruct the light field in cubemap projection (CMP) format, given the six cube faces of an omnidirectional image and the corresponding cube face positions as input. The cubemap format implies a linear re-projection, which is better suited to a neural network. The second network refines the reconstructed cubemaps in equirectangular projection (ERP) format by removing cubemap border artifacts. With an appropriate cost function, the network implicitly learns the geometric features for both translation and zooming. Furthermore, it runs with very low inference time, which enables real-time applications. We demonstrate that LFSphereNet outperforms state-of-the-art approaches in both quality and speed when tested on different synthetic and real-world scenes. The proposed method represents a significant step towards real-time immersive remote telepresence experiences.
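The first stage described above consumes the six cube faces of the input ODI, which requires re-projecting the equirectangular image onto the cube. The following is a minimal sketch of that generic gnomonic (ERP-to-cubemap) re-projection with nearest-neighbour sampling; the function name, face naming, and axis conventions are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def erp_to_cube_face(erp, face, face_size):
    """Sample one cube face from an equirectangular (ERP) image.

    erp: H x W x 3 array. face: one of 'front', 'back', 'left',
    'right', 'up', 'down'. Generic re-projection sketch only.
    """
    H, W = erp.shape[:2]
    # Pixel-centre grid on the face plane, in [-1, 1]
    coords = (np.arange(face_size) + 0.5) / face_size * 2 - 1
    u, v = np.meshgrid(coords, coords)
    ones = np.ones_like(u)
    # 3D ray direction per face pixel (x right, y down, z forward)
    dirs = {
        'front': (u, v, ones),    'back':  (-u, v, -ones),
        'right': (ones, v, -u),   'left':  (-ones, v, u),
        'up':    (u, -ones, v),   'down':  (u, ones, -v),
    }
    x, y, z = dirs[face]
    # Convert ray directions to spherical coordinates
    lon = np.arctan2(x, z)                                 # [-pi, pi]
    lat = np.arcsin(y / np.sqrt(x**2 + y**2 + z**2))       # [-pi/2, pi/2]
    # Map spherical coordinates to ERP pixel indices (nearest neighbour)
    j = ((lon / (2 * np.pi) + 0.5) * W).astype(int) % W
    i = ((lat / np.pi + 0.5) * H).astype(int).clip(0, H - 1)
    return erp[i, j]
```

A practical pipeline would call this once per face and feed the six `face_size × face_size` crops (plus the face positions) to the first network; bilinear instead of nearest-neighbour sampling would reduce aliasing.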


Published in:
CVMP '23: Proceedings of the 20th ACM SIGGRAPH European Conference on Visual Media Production
November 2023, 112 pages
ISBN: 9798400704260
DOI: 10.1145/3626495

Copyright © 2023 Owner/Author. This work is licensed under a Creative Commons Attribution 4.0 International License.

Publisher: Association for Computing Machinery, New York, NY, United States


Qualifiers: research-article, refereed limited

Overall acceptance rate: 40 of 67 submissions, 60%