Abstract
We present a technique to automatically animate a still portrait, making it possible for the subject in the photo to come to life and express various emotions. We use a driving video (of a different subject) and develop a method to transfer the expressiveness of the subject in the driving video to the target portrait. In contrast to previous work that requires an input video of the target face to reenact a facial performance, our technique uses only a single target image. We animate the target image through 2D warps that imitate the facial transformations in the driving video. As warps alone do not carry the full expressiveness of the face, we add fine-scale dynamic details that commonly accompany facial expressions, such as creases and wrinkles. Furthermore, we hallucinate regions that are hidden in the input target face, most notably the inner mouth. Our technique gives rise to reactive profiles, in which people in still images can automatically interact with their viewers. We demonstrate our technique on numerous still portraits from the internet.
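To make the core warping idea concrete, below is a minimal sketch, assuming dlib's 68-point landmark predictor and scikit-image's piecewise-affine warp. It is an illustration of the general landmark-driven 2D warping principle only, not the authors' implementation: it simply adds the driving face's landmark displacements to the target's landmarks and warps, and it omits the fine-scale detail transfer and mouth-interior hallucination stages described above. All image file names are hypothetical placeholders.

```python
# Sketch: drive a single target portrait with landmark displacements
# taken from a driving video frame, then apply a piecewise-affine warp.
# NOT the paper's implementation; a real system would also normalize
# the driving landmarks for scale and head pose before transfer.
import dlib
import numpy as np
from skimage import io
from skimage.transform import PiecewiseAffineTransform, warp

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def landmarks(img):
    # 68 facial landmarks of the first detected face, as a (68, 2) array
    # in (x, y) order; raises IndexError if no face is found.
    rect = detector(img, 1)[0]
    pts = predictor(img, rect).parts()
    return np.array([(p.x, p.y) for p in pts], dtype=np.float64)

def animate_frame(target, tgt_pts, drv_pts0, drv_pts):
    # Displace the target's landmarks by the driving face's motion
    # relative to its first (rest) frame.
    new_pts = tgt_pts + (drv_pts - drv_pts0)
    h, w = target.shape[:2]
    # Anchor the image corners so the background stays fixed and the
    # triangulation covers the whole image.
    corners = np.array([[0, 0], [w - 1, 0], [0, h - 1], [w - 1, h - 1]],
                       dtype=np.float64)
    src = np.vstack([new_pts, corners])  # positions in the output image
    dst = np.vstack([tgt_pts, corners])  # positions in the input image
    tform = PiecewiseAffineTransform()
    tform.estimate(src, dst)             # warp() uses this as the inverse map
    return warp(target, tform, output_shape=(h, w))

target = io.imread("portrait.jpg")            # placeholder paths
frame0 = io.imread("driving_frame_000.jpg")
frame_t = io.imread("driving_frame_042.jpg")
out = animate_frame(target, landmarks(target),
                    landmarks(frame0), landmarks(frame_t))
io.imsave("animated_frame.jpg", (out * 255).astype(np.uint8))
```

Estimating the transform from the new landmark positions back to the original ones lets warp() use it directly as an inverse map, so face features appear at their displaced locations; the corner anchors keep everything outside the landmark hull in place. The full technique additionally adds the dynamic creases and wrinkles and synthesizes the hidden mouth interior, which a warp alone cannot produce.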
Supplemental Material
Supplemental material is available for download.