Reconstructing detailed dynamic face geometry from monocular video

Published: 01 November 2013

Abstract

Detailed facial performance geometry can be reconstructed using dense camera and light setups in controlled studios. However, a wide range of important applications cannot employ these approaches, including all movie productions shot from a single principal camera; post-production of such footage requires dynamic monocular face capture for appearance modification. We present a new method for capturing face geometry from monocular video. Our approach captures detailed, dynamic, spatio-temporally coherent 3D face geometry without the need for markers. It works under uncontrolled lighting, and it successfully reconstructs expressive motion including high-frequency face detail such as folds and laugh lines. After simple manual initialization, the capturing process is fully automatic, which makes it versatile, lightweight and easy to deploy. Our approach tracks accurate sparse 2D features between automatically selected key frames to animate a parametric blend shape model, which is further refined in pose, expression and shape by temporally coherent optical flow and photometric stereo. We demonstrate performance capture results for long and complex face sequences captured indoors and outdoors, and we exemplify the relevance of our approach as an enabling technology for model-based face editing in movies and video, such as adding new facial textures. Our method is also a step towards enabling everyone to do facial performance capture with a single affordable camera.
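To make the pipeline's first stage concrete, the sketch below fits the weights of a linear blend shape model to sparse 2D landmarks under an orthographic camera. This is a minimal illustration under stated assumptions, not the paper's solver: the names `neutral`, `deltas`, and `landmark_ids` are hypothetical placeholders filled with synthetic data, rigid head pose is assumed known and fixed, and the paper's subsequent refinement by temporally coherent optical flow and photometric stereo is omitted entirely.

```python
import numpy as np

# Hypothetical dimensions: V mesh vertices, K blend shapes, L tracked 2D landmarks.
V, K, L = 5000, 30, 66

rng = np.random.default_rng(0)
neutral = rng.standard_normal((V, 3))          # neutral face mesh (V x 3)
deltas = 0.1 * rng.standard_normal((K, V, 3))  # per-shape vertex offsets (synthetic stand-in)
landmark_ids = rng.choice(V, size=L, replace=False)  # mesh vertices tracked as landmarks

def blendshape_vertices(w):
    """Linear blend shape model: neutral + sum_k w[k] * deltas[k]."""
    return neutral + np.tensordot(w, deltas, axes=1)

def fit_weights(x2d, lam=1e-3):
    """Least-squares fit of blend shape weights to sparse 2D landmarks.
    Assumes an orthographic camera aligned with the mesh (projection
    simply drops z); lam is a Tikhonov regularizer that keeps the
    recovered expression close to neutral when landmarks are noisy."""
    # Projected landmark positions are linear in the weights:
    #   x2d ~ neutral[ids, :2] + sum_k w[k] * deltas[k, ids, :2]
    A = deltas[:, landmark_ids, :2].reshape(K, -1).T   # (2L, K)
    b = (x2d - neutral[landmark_ids, :2]).ravel()      # (2L,)
    return np.linalg.solve(A.T @ A + lam * np.eye(K), A.T @ b)

# Synthetic round trip: project known weights, then recover them.
w_true = rng.uniform(0.0, 1.0, K)
x2d = blendshape_vertices(w_true)[landmark_ids, :2]
w_est = fit_weights(x2d)
print("max weight error:", np.abs(w_est - w_true).max())
```

On real footage, `x2d` would come from the tracked sparse 2D features between key frames, and the rigid head pose would be estimated jointly with the expression weights rather than assumed fixed.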




Published in

ACM Transactions on Graphics, Volume 32, Issue 6
November 2013
671 pages
ISSN: 0730-0301
EISSN: 1557-7368
DOI: 10.1145/2508363

Copyright © 2013 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery, New York, NY, United States
