Research article
DOI: 10.1145/1399504.1360697

Performance capture from sparse multi-view video

Published: 01 August 2008

Abstract

This paper proposes a new marker-less approach to capturing human performances from multi-view video. Our algorithm jointly reconstructs spatio-temporally coherent geometry, motion, and textural surface appearance of actors performing complex and rapid moves. Furthermore, since our algorithm is purely mesh-based and makes as few prior assumptions as possible about the type of subject being tracked, it can even capture performances of people wearing wide apparel, such as a dancer wearing a skirt. To this end, our method efficiently and effectively combines the power of surface- and volume-based shape deformation techniques with a new mesh-based analysis-through-synthesis framework. This framework extracts motion constraints from the video and makes the laser scan of the tracked subject mimic the recorded performance. Small-scale, time-varying shape detail is also recovered by applying model-guided multi-view stereo to refine the model surface. Our method delivers captured performance data at a high level of detail, is highly versatile, and is applicable to many complex types of scenes that could not be handled by alternative marker-based or marker-free recording techniques.
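
The core idea described in the abstract, propagating sparse, image-derived motion constraints over a scanned mesh via a deformation solve, can be illustrated with a minimal sketch. The snippet below is not the authors' implementation: it uses a plain uniform-Laplacian least-squares deformation (in the spirit of the linear variational surface deformation methods the paper builds on), and the toy mesh, handle indices, and target positions are made-up placeholders.

```python
# Minimal sketch (assumption, not the paper's method): sparse 3D position
# constraints, of the kind one might extract from multi-view video, are
# propagated over a whole mesh by a linear Laplacian deformation solve.
import numpy as np

def uniform_laplacian(num_vertices, edges):
    """Build the uniform graph Laplacian L of a mesh from its edge list."""
    L = np.zeros((num_vertices, num_vertices))
    for i, j in edges:
        L[i, j] -= 1.0
        L[j, i] -= 1.0
        L[i, i] += 1.0
        L[j, j] += 1.0
    return L

def deform(vertices, edges, handle_ids, handle_targets, weight=10.0):
    """Least-squares Laplacian editing: preserve the differential coordinates
    of the rest shape while softly pulling handle vertices to their targets."""
    n = len(vertices)
    L = uniform_laplacian(n, edges)
    delta = L @ vertices                      # differential (Laplacian) coordinates
    # Stack the Laplacian rows and the weighted positional-constraint rows.
    C = np.zeros((len(handle_ids), n))
    for row, vid in enumerate(handle_ids):
        C[row, vid] = weight
    A = np.vstack([L, C])
    b = np.vstack([delta, weight * handle_targets])
    # One linear least-squares solve handles x, y, z simultaneously.
    new_vertices, *_ = np.linalg.lstsq(A, b, rcond=None)
    return new_vertices

if __name__ == "__main__":
    # Toy "mesh": a unit square made of two triangles (4 vertices, 5 edges).
    V = np.array([[0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0]], dtype=float)
    E = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
    # Pretend image-derived constraints: pin vertex 0, lift vertex 2 by 0.5.
    targets = np.array([[0, 0, 0], [1, 1, 0.5]], dtype=float)
    print(deform(V, E, handle_ids=[0, 2], handle_targets=targets).round(3))
```

In the actual system, the constraints would come from multi-view image and silhouette cues, and the deformation combines volumetric and surface-based terms; the sketch only shows how a handful of 3D constraints can drive an entire mesh through a single linear solve.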

Supplementary Material

FLV File (23.flv)
MOV File (a98-de_aguilar.mov)



Information & Contributors

Published In

SIGGRAPH '08: ACM SIGGRAPH 2008 papers
August 2008
887 pages
ISBN: 9781450301121
DOI: 10.1145/1399504
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 August 2008


Author Tags

  1. marker-less scene reconstruction
  2. multi-view video analysis
  3. performance capture

Qualifiers

  • Research-article

Conference

SIGGRAPH '08

Acceptance Rates

SIGGRAPH '08 paper acceptance rate: 90 of 518 submissions (17%)
Overall acceptance rate: 1,822 of 8,601 submissions (21%)


Cited By

  • (2024) ST-4DGS: Spatial-Temporally Consistent 4D Gaussian Splatting for Efficient Dynamic Scene Rendering. ACM SIGGRAPH 2024 Conference Papers, 10.1145/3641519.3657520, 1-11. Online publication date: 13-Jul-2024.
  • (2024) Factorized Motion Fields for Fast Sparse Input Dynamic View Synthesis. ACM SIGGRAPH 2024 Conference Papers, 10.1145/3641519.3657498, 1-12. Online publication date: 13-Jul-2024.
  • (2024) Ultra Inertial Poser: Scalable Motion Capture and Tracking from Sparse Inertial Sensors and Ultra-Wideband Ranging. ACM SIGGRAPH 2024 Conference Papers, 10.1145/3641519.3657465, 1-11. Online publication date: 13-Jul-2024.
  • (2024) Neural Novel Actor: Learning a Generalized Animatable Neural Representation for Human Actors. IEEE Transactions on Visualization and Computer Graphics 30(8), 5719-5732, 10.1109/TVCG.2023.3305433. Online publication date: Aug-2024.
  • (2024) InterGen: Diffusion-Based Multi-human Motion Generation Under Complex Interactions. International Journal of Computer Vision 132(9), 3463-3483, 10.1007/s11263-024-02042-6. Online publication date: 26-Mar-2024.
  • (2024) InstantGeoAvatar: Effective Geometry and Appearance Modeling of Animatable Avatars from Monocular Video. Computer Vision – ACCV 2024, 255-277, 10.1007/978-981-96-0960-4_16. Online publication date: 8-Dec-2024.
  • (2024) LiveHPS++: Robust and Coherent Motion Capture in Dynamic Free Environment. Computer Vision – ECCV 2024, 127-144, 10.1007/978-3-031-73397-0_8. Online publication date: 3-Nov-2024.
  • (2024) MIGS: Multi-Identity Gaussian Splatting via Tensor Decomposition. Computer Vision – ECCV 2024, 388-408, 10.1007/978-3-031-72691-0_22. Online publication date: 3-Nov-2024.
  • (2023) SAILOR: Synergizing Radiance and Occupancy Fields for Live Human Performance Capture. ACM Transactions on Graphics 42(6), 1-15, 10.1145/3618370. Online publication date: 5-Dec-2023.
  • (2023) MP-NeRF: Neural Radiance Fields for Dynamic Multi-person synthesis from Sparse Views. Computer Graphics Forum 41(8), 317-325, 10.1111/cgf.14646. Online publication date: 20-Mar-2023.
