Gesture modeling and animation based on a probabilistic re-creation of speaker style

Published: 20 March 2008

Abstract

Animated characters that move and gesticulate appropriately with spoken text are useful in a wide range of applications. Unfortunately, this class of movement is very difficult to generate, even more so when a unique, individual movement style is required. We present a system that, with a focus on arm gestures, is capable of producing full-body gesture animation for given input text in the style of a particular performer. Our process starts with video of a person whose gesturing style we wish to animate. A tool-assisted annotation process is performed on the video, from which a statistical model of the person's particular gesturing style is built. Using this model and input text tagged with theme, rheme and focus, our generation algorithm creates a gesture script. As opposed to isolated singleton gestures, our gesture script specifies a stream of continuous gestures coordinated with speech. This script is passed to an animation system, which enhances the gesture description with additional detail. It then generates either kinematic or physically simulated motion based on this description. The system is capable of generating gesture animations for novel text that are consistent with a given performer's style, as was successfully validated in an empirical user study.
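To make the generation step concrete, here is a minimal Python sketch of one way a speaker-specific statistical model could drive gesture selection: a frequency table over gesture lexemes, conditioned on each phrase's information-structure tag (theme, rheme, or focus), sampled to produce a gesture script. This is an invented illustration, not the authors' actual model; all lexeme names, counts, and function names are assumptions, and the paper's real system additionally handles continuous gesture streams, timing, and form detail.

```python
import random

# Hypothetical speaker-style model: for each information-structure tag,
# the observed counts of gesture lexemes in the annotated video.
# All lexemes and counts here are invented for illustration.
STYLE_MODEL = {
    "theme": {"rest": 6, "beat": 3, "cup": 1},
    "rheme": {"beat": 5, "wipe": 3, "point": 2},
    "focus": {"point": 7, "raised_index": 2, "beat": 1},
}

def sample_gesture(tag, rng):
    """Draw one gesture lexeme for a phrase with the given tag,
    proportionally to its frequency in the speaker's annotated data."""
    counts = STYLE_MODEL[tag]
    lexemes = list(counts)
    weights = [counts[g] for g in lexemes]
    return rng.choices(lexemes, weights=weights, k=1)[0]

def generate_gesture_script(tagged_phrases, seed=0):
    """Map a sequence of (phrase, tag) pairs to a gesture script:
    one (phrase, tag, gesture) triple per phrase."""
    rng = random.Random(seed)
    return [(phrase, tag, sample_gesture(tag, rng))
            for phrase, tag in tagged_phrases]

script = generate_gesture_script(
    [("the main idea", "theme"),
     ("is quite simple", "rheme"),
     ("simple", "focus")])
for phrase, tag, gesture in script:
    print(f"{tag:>5}: '{phrase}' -> {gesture}")
```

Seeding the generator makes a run reproducible while still reflecting the modeled frequency distribution; resampling with different seeds yields stylistically consistent but varied gesture streams, in the spirit of the probabilistic re-creation described above.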




    Published In

    ACM Transactions on Graphics, Volume 27, Issue 1
    March 2008
    135 pages
    ISSN: 0730-0301
    EISSN: 1557-7368
    DOI: 10.1145/1330511
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 20 March 2008
    Accepted: 01 November 2007
    Revised: 01 May 2007
    Received: 01 August 2006
    Published in TOG Volume 27, Issue 1

    Author Tags

    1. Human modeling
    2. character animation
    3. gesture

    Qualifiers

    • Research-article
    • Research
    • Refereed


    Cited By

    • Selecting Iconic Gesture Forms Based on Typical Entity Images. Journal of Information Processing 32, 196-205 (2024). DOI: 10.2197/ipsjjip.32.196
    • Semantic Gesticulator: Semantics-Aware Co-Speech Gesture Synthesis. ACM Transactions on Graphics 43, 4, 1-17 (19 Jul 2024). DOI: 10.1145/3658134
    • Actor Takeover of Animated Characters. 2024 IEEE Conference on Virtual Reality and 3D User Interfaces Abstracts and Workshops (VRW), 1134-1135 (16 Mar 2024). DOI: 10.1109/VRW62533.2024.00361
    • Music-stylized hierarchical dance synthesis with user control. Virtual Reality & Intelligent Hardware 6, 5, 339-357 (Oct 2024). DOI: 10.1016/j.vrih.2024.06.004
    • Dual-Path Transformer-Based GAN for Co-speech Gesture Synthesis. International Journal of Social Robotics (13 May 2024). DOI: 10.1007/s12369-024-01136-y
    • ASAP for multi-outputs: auto-generating storyboard and pre-visualization with virtual actors based on screenplay. Multimedia Tools and Applications (3 Aug 2024). DOI: 10.1007/s11042-024-19904-3
    • Zero-shot style transfer for gesture animation driven by text and speech using adversarial disentanglement of multimodal style encoding. Frontiers in Artificial Intelligence 6 (12 Jun 2023). DOI: 10.3389/frai.2023.1142997
    • Data-Driven Communicative Behaviour Generation: A Survey. ACM Transactions on Human-Robot Interaction (16 Aug 2023). DOI: 10.1145/3609235
    • GestureDiffuCLIP: Gesture Diffusion Model with CLIP Latents. ACM Transactions on Graphics 42, 4, 1-18 (26 Jul 2023). DOI: 10.1145/3592097
    • Large language models in textual analysis for gesture selection. Proceedings of the 25th International Conference on Multimodal Interaction, 378-387 (9 Oct 2023). DOI: 10.1145/3577190.3614158
