DOI: 10.1145/1180995.1181051

Recognizing gaze aversion gestures in embodied conversational discourse

Published: 02 November 2006

Abstract

Eye gaze offers several key cues about conversational discourse during face-to-face interaction between people. While a large body of research documents the use of gaze in human-to-human interaction and in animating realistic embodied avatars, recognition of conversational eye gestures, i.e. distinct eye-movement patterns relevant to discourse, has received less attention. We analyze eye gestures during interaction with an animated embodied agent and propose a non-intrusive, vision-based approach to estimating eye gaze and recognizing eye gestures. In our user study, participants avert their gaze (with "look-away" or "thinking" gestures) during periods of cognitive load. Using our approach, an agent can visually differentiate whether a user is thinking about a response or waiting for the agent or robot to take its turn.
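The page does not reproduce the method itself, but the core idea in the abstract (treating sustained gaze aversion, as opposed to brief saccades, as a discourse-relevant signal) is easy to illustrate. Below is a minimal, hypothetical sketch, not the authors' algorithm: it assumes an upstream vision-based tracker that emits per-frame (yaw, pitch) gaze angles relative to the agent, and the constants FPS, ON_AXIS_DEG, and MIN_AVERSION_S are invented for the example.

```python
"""Minimal sketch (hypothetical, not the paper's method): detect
sustained gaze-aversion episodes in a per-frame gaze-direction track.

Assumed input: an upstream tracker yields (yaw, pitch) gaze angles in
degrees per video frame, with (0, 0) meaning the user looks straight
at the agent. All thresholds below are invented for illustration.
"""

from dataclasses import dataclass
from typing import List, Tuple

FPS = 30                # assumed camera frame rate
ON_AXIS_DEG = 8.0       # gaze within this cone counts as "looking at agent"
MIN_AVERSION_S = 0.5    # shorter off-axis runs are treated as saccades


@dataclass
class Episode:
    start_s: float  # episode start, in seconds
    end_s: float    # episode end, in seconds


def find_aversion_episodes(gaze: List[Tuple[float, float]]) -> List[Episode]:
    """Return contiguous off-axis runs that last at least MIN_AVERSION_S."""
    episodes: List[Episode] = []
    run_start = None
    for i, (yaw, pitch) in enumerate(gaze):
        averted = max(abs(yaw), abs(pitch)) > ON_AXIS_DEG
        if averted and run_start is None:
            run_start = i                     # an off-axis run begins
        elif not averted and run_start is not None:
            if (i - run_start) / FPS >= MIN_AVERSION_S:
                episodes.append(Episode(run_start / FPS, i / FPS))
            run_start = None                  # run ended (kept or discarded)
    # close a run that extends to the end of the track
    if run_start is not None and (len(gaze) - run_start) / FPS >= MIN_AVERSION_S:
        episodes.append(Episode(run_start / FPS, len(gaze) / FPS))
    return episodes


if __name__ == "__main__":
    # 1 s of mutual gaze, 1 s of upward "thinking" aversion, 0.5 s back:
    track = [(0.0, 0.0)] * FPS + [(5.0, 15.0)] * FPS + [(0.0, 0.0)] * (FPS // 2)
    for ep in find_aversion_episodes(track):
        print(f"gaze aversion from {ep.start_s:.2f}s to {ep.end_s:.2f}s")
```

In a full recognizer of the kind the abstract describes, episodes like these would then be classified, e.g. by the direction and dynamics of the off-axis run, to distinguish deliberate "thinking" or "look-away" gestures from ordinary gaze shifts toward on-screen content.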


Published In

ICMI '06: Proceedings of the 8th international conference on Multimodal interfaces
November 2006
404 pages
ISBN: 1-59593-541-X
DOI: 10.1145/1180995
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org

Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

Published: 02 November 2006


Author Tags

  1. aversion gestures
  2. embodied conversational agent
  3. eye gaze tracking
  4. eye gestures
  5. human-computer interaction
  6. turn-taking

Qualifiers

  • Article

Conference

ICMI '06

Acceptance Rates

Overall Acceptance Rate: 453 of 1,080 submissions, 42%

Bibliometrics & Citations

Article Metrics

  • Downloads (Last 12 months): 16
  • Downloads (Last 6 weeks): 2
Reflects downloads up to 09 Feb 2025

Cited By
  • (2024) Predictability of Understanding in Explanatory Interactions Based on Multimodal Cues. Proceedings of the 26th International Conference on Multimodal Interaction, 449–458. https://doi.org/10.1145/3678957.3685741
  • (2024) S3: Speech, Script and Scene driven Head and Eye Animation. ACM Transactions on Graphics, 43(4), 1–12. https://doi.org/10.1145/3658172
  • (2023) Explainable Models for Predicting Interlocutors' Subjective Impressions Based on Functional Head Movement Features. Transactions of the Japanese Society for Artificial Intelligence, 38(3), H-M74_1–13. https://doi.org/10.1527/tjsai.38-3_H-M74
  • (2023) Identifying Interlocutors' Behaviors and its Timings Involved with Impression Formation from Head-Movement Features and Linguistic Features. Proceedings of the 25th International Conference on Multimodal Interaction, 336–344. https://doi.org/10.1145/3577190.3614124
  • (2023) Boundary Conditions for Human Gaze Estimation on A Social Robot using State-of-the-Art Models. 2023 32nd IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), 1486–1493. https://doi.org/10.1109/RO-MAN57019.2023.10309348
  • (2023) Synchronized Colored Petri Net Based Multimodal Modeling and Real-Time Recognition of Conversational Spatial Deictic Gestures. Intelligent Computing, 1227–1246. https://doi.org/10.1007/978-3-031-37963-5_85
  • (2022) HeadSee: Device-free head gesture recognition with commodity RFID. Peer-to-Peer Networking and Applications, 15(3), 1357–1369. https://doi.org/10.1007/s12083-021-01126-1
  • (2022) The Multimodal Performance of Conversational Humor. Online publication date: 31-Mar-2022
  • (2021) Prediction of Interlocutors' Subjective Impressions Based on Functional Head-Movement Features in Group Meetings. Proceedings of the 2021 International Conference on Multimodal Interaction, 352–360. https://doi.org/10.1145/3462244.3479930
  • (2021) Examining the Use of Nonverbal Communication in Virtual Agents. International Journal of Human–Computer Interaction, 37(17), 1648–1673. https://doi.org/10.1080/10447318.2021.1898851
