ABSTRACT
With recent advances in technology, augmented reality systems and autonomous vehicles have attracted considerable interest from academia and industry. Both areas rely on understanding scene geometry, which usually requires depth map estimation. However, on systems with limited computational resources, such as smartphones or autonomous robots, estimating high-resolution dense depth maps can be challenging. In this paper, we study the problem of semi-dense depth map interpolation together with low-resolution depth map upsampling. We present an end-to-end learnable residual convolutional neural network architecture that achieves fast interpolation of semi-dense depth maps with different sparse depth distributions: uniform, sparse grid, and along intensity image gradients. We also propose a loss function that combines the classical mean squared error with a perceptual loss widely used in intensity image super-resolution and style transfer tasks. We show that, with some modifications, this architecture can also be used for depth map super-resolution. Finally, we evaluate our results on both synthetic and real data, and consider applications in autonomous vehicles and in creating AR/MR video games.
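The three sparse depth distributions named in the abstract (uniform, sparse grid, and along intensity image gradients) can be illustrated with a minimal sketch. It is not the authors' sampling procedure: the function names, the forward-difference gradient, the threshold rule, and the use of 0.0 to mark missing depth are all assumptions made for illustration.

```python
import random


def uniform_mask(height, width, keep_fraction, seed=0):
    """Keep each pixel independently with probability keep_fraction (assumed rule)."""
    rng = random.Random(seed)
    return [[rng.random() < keep_fraction for _ in range(width)]
            for _ in range(height)]


def grid_mask(height, width, step):
    """Keep only pixels lying on a regular grid with the given spacing."""
    return [[(y % step == 0 and x % step == 0) for x in range(width)]
            for y in range(height)]


def gradient_mask(intensity, threshold):
    """Keep pixels whose intensity gradient magnitude exceeds threshold.

    Uses forward differences clamped at the image border; this is an
    illustrative choice, not the gradient operator from the paper.
    """
    height, width = len(intensity), len(intensity[0])
    mask = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            gx = intensity[y][min(x + 1, width - 1)] - intensity[y][x]
            gy = intensity[min(y + 1, height - 1)][x] - intensity[y][x]
            mask[y][x] = (gx * gx + gy * gy) ** 0.5 > threshold
    return mask


def apply_mask(depth, mask, missing=0.0):
    """Build a semi-dense depth map: keep depth where mask is True, else `missing`."""
    return [[d if m else missing for d, m in zip(depth_row, mask_row)]
            for depth_row, mask_row in zip(depth, mask)]


if __name__ == "__main__":
    # Toy 3x4 scene: constant depth, intensity image with one vertical edge.
    depth = [[5.0] * 4 for _ in range(3)]
    intensity = [[0.0, 0.0, 1.0, 1.0] for _ in range(3)]
    semi_dense = apply_mask(depth, gradient_mask(intensity, threshold=0.5))
    for row in semi_dense:
        print(row)
```

Each mask selects which ground-truth depth values remain visible; an interpolation network of the kind described above would then be asked to fill in the remaining pixels.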