ABSTRACT
High-speed video is now common in consumer-grade cameras. Augmenting these videos with a corresponding depth stream will enable new multimedia applications, such as 3D slow-motion video. In this paper, we present a hybrid camera system that combines a high-speed color camera with a depth sensor, e.g., a Kinect depth sensor, to generate an RGB+depth stream that is both high-speed and high-resolution. Simply interpolating the low-speed depth frames is not satisfactory: interpolation artifacts and loss of surface detail are often visible. We have developed a novel framework that utilizes both shading constraints within each frame and optical flow constraints between neighboring frames. More specifically, we present (a) an effective method to find the intrinsic images to allow more accurate normal estimation; and (b) an optimization-based framework to estimate the high-resolution/high-speed depth stream, taking into consideration temporal smoothness and shading/depth consistency. We evaluated our holistic framework on both synthetic and real sequences, and it showed superior performance to the previous state of the art.
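To illustrate why interpolating the low-speed depth frames falls short, the following minimal sketch blends two depth frames per pixel. It is only the naive baseline that the abstract argues against, not the proposed framework. The function name `interpolate_depth`, the list-of-rows frame layout, and the toy depth values are assumptions made for illustration.

```python
def interpolate_depth(d0, d1, n_between):
    """Linearly blend two depth frames (lists of rows) into n_between in-between frames.

    Hypothetical helper for illustration only; it ignores scene motion entirely.
    """
    frames = []
    for k in range(1, n_between + 1):
        t = k / (n_between + 1)  # blend weight strictly between 0 and 1
        frames.append([[(1.0 - t) * a + t * b for a, b in zip(r0, r1)]
                       for r0, r1 in zip(d0, d1)])
    return frames


# Toy scene: a depth edge (near = 1.0 m, far = 3.0 m) moves one pixel to the right.
d0 = [[1.0, 1.0, 3.0, 3.0]]
d1 = [[1.0, 1.0, 1.0, 3.0]]
mid = interpolate_depth(d0, d1, n_between=1)[0]
print(mid)  # [[1.0, 1.0, 2.0, 3.0]]
```

The third pixel of the in-between frame is 2.0 m, a depth that belongs to neither the near nor the far surface. This kind of ghost value at moving depth edges is one example of the interpolation artifacts mentioned above.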
Index Terms
- High-speed Depth Stream Generation from a Hybrid Camera