DOI: 10.5555/1218064.1218100

Simulating speech with a physics-based facial muscle model

Published: 02 September 2006

Abstract

We present a physically based system for creating animations of novel words and phrases from text and audio input, based on the analysis of motion-captured speech examples. Leading image-based techniques achieve photo-real quality, yet lack versatility, especially with regard to interactions with the environment. Data-driven approaches that use motion capture to deform a three-dimensional surface often lack any anatomical or physically based structure, limiting their accuracy and realism. In contrast, muscle-driven physics-based facial animation systems can trivially integrate external interacting objects and have the potential to produce very realistic animations, provided the underlying model and simulation framework are faithful to the anatomy of the face and the physics of facial tissue deformation. We start with a high-resolution, anatomically accurate flesh and muscle model built for a specific subject. We then translate a motion-captured training set of speech examples into muscle activation signals, and segment those into intervals corresponding to individual phonemes. Finally, these samples are used to synthesize novel words and phrases. The versatility of our approach is illustrated by combining this novel speech content with various facial expressions, as well as with interactions with external objects.
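The pipeline sketched in the abstract (per-phoneme muscle activation segments extracted from captured speech, then recombined to form novel words) can be illustrated schematically. The snippet below is a minimal sketch under stated assumptions, not the paper's implementation: `synthesize_word`, `library`, and `BLEND` are hypothetical names, each segment is assumed to be a frames-by-muscles activation array, and a simple linear cross-fade stands in for whatever coarticulation handling the full system performs.

```python
# Hypothetical sketch: concatenate per-phoneme muscle activation segments
# into one activation track, cross-fading at segment boundaries.
import numpy as np

BLEND = 5  # frames to cross-fade between adjacent phoneme segments


def synthesize_word(phonemes, library):
    """Look up one activation segment per phoneme and blend them in time.

    library: dict mapping phoneme -> (frames x muscles) activation array,
    e.g. segments cut from motion-captured training speech.
    """
    segments = [library[p] for p in phonemes]
    out = segments[0]
    for seg in segments[1:]:
        n = min(BLEND, len(out), len(seg))
        w = np.linspace(0.0, 1.0, n)[:, None]       # fade-in weights per frame
        overlap = (1 - w) * out[-n:] + w * seg[:n]  # cross-faded boundary region
        out = np.concatenate([out[:-n], overlap, seg[n:]])
    return out
```

The resulting activation track would then drive the quasistatic flesh simulation, which is where this approach differs from purely surface-based concatenation: the same muscle signals remain valid when expressions or external contacts are layered on top.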



Published In

SCA '06: Proceedings of the 2006 ACM SIGGRAPH/Eurographics Symposium on Computer Animation
September 2006, 370 pages
ISBN: 3905673347
Publisher: Eurographics Association, Goslar, Germany

Acceptance Rates

Overall acceptance rate: 183 of 487 submissions, 38%
