Fusion4D: real-time performance capture of challenging scenes

Authors:
Mingsong Dou, Sameh Khamis, Yury Degtyarev, Philip Davidson, Sean Ryan Fanello, Adarsh Kowdle, Sergio Orts Escolano, Christoph Rhemann, David Kim, Jonathan Taylor, Pushmeet Kohli, Vladimir Tankovich, Shahram Izadi

Microsoft Research

ACM Transactions on Graphics, Volume 35, Issue 4, Article No. 114, pp. 1–13. https://doi.org/10.1145/2897824.2925969

Published: 11 July 2016

Abstract

We contribute a new pipeline for live multi-view performance capture, generating temporally coherent, high-quality reconstructions in real time. Our algorithm supports both incremental reconstruction, which improves the surface estimate over time, and parameterization of the nonrigid scene motion. Our approach is highly robust to both large frame-to-frame motion and topology changes, allowing us to reconstruct extremely challenging scenes. We demonstrate advantages over related real-time techniques that either deform an online-generated template or continually fuse depth data nonrigidly into a single reference model. Finally, we show geometric reconstruction results on par with offline methods that require orders of magnitude more processing time and many more RGBD cameras.

Supplemental Material

a114.mp4 (mp4, 323.4 MB)

a114-dou-supp.zip (zip, 554.9 MB): supplemental files.

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Image and video acquisition
        Motion capture

Published in

ACM Transactions on Graphics Volume 35, Issue 4
July 2016
1396 pages
ISSN: 0730-0301
EISSN: 1557-7368
DOI: 10.1145/2897824

Copyright © 2016 Owner/Author
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
4D reconstruction
multi-view
nonrigid
real-time
Qualifiers
- research-article