Abstract
We contribute a new pipeline for live multi-view performance capture that generates temporally coherent, high-quality reconstructions in real time. Our algorithm supports both incremental reconstruction, which improves the surface estimate over time, and parameterization of the nonrigid scene motion. Our approach is highly robust to both large frame-to-frame motion and topology changes, allowing us to reconstruct extremely challenging scenes. We demonstrate advantages over related real-time techniques that either deform an online-generated template or continually fuse depth data nonrigidly into a single reference model. Finally, we show geometric reconstruction results on par with offline methods that require orders of magnitude more processing time and many more RGBD cameras.
Index Terms
- Fusion4D: real-time performance capture of challenging scenes