DOI: 10.1145/1180995.1181051

Recognizing gaze aversion gestures in embodied conversational discourse

Published: 02 November 2006

Abstract

Eye gaze offers several key cues about conversational discourse during face-to-face interaction between people. While a large body of research documents the use of gaze in human-to-human interaction and in animating realistic embodied avatars, recognition of conversational eye gestures, i.e. distinct eye-movement patterns relevant to discourse, has received less attention. We analyze eye gestures during interaction with an animated embodied agent and propose a non-intrusive, vision-based approach to estimating eye gaze and recognizing eye gestures. In our user study, participants avert their gaze (with "look-away" or "thinking" gestures) during periods of cognitive load. Using our approach, an agent can visually differentiate whether a user is thinking about a response or waiting for the agent or robot to take its turn.
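The page does not reproduce the method itself, but the core idea in the abstract (treating sustained gaze aversion, as opposed to brief saccades, as a discourse-relevant signal) is easy to illustrate. Below is a minimal, hypothetical sketch, not the authors' algorithm: it assumes an upstream vision-based tracker that emits per-frame (yaw, pitch) gaze angles relative to the agent, and the constants FPS, ON_AXIS_DEG, and MIN_AVERSION_S are invented for the example.

```python
"""Minimal sketch (hypothetical, not the paper's method): detect
sustained gaze-aversion episodes in a per-frame gaze-direction track.

Assumed input: an upstream tracker yields (yaw, pitch) gaze angles in
degrees per video frame, with (0, 0) meaning the user looks straight
at the agent. All thresholds below are invented for illustration.
"""

from dataclasses import dataclass
from typing import List, Tuple

FPS = 30                # assumed camera frame rate
ON_AXIS_DEG = 8.0       # gaze within this cone counts as "looking at agent"
MIN_AVERSION_S = 0.5    # shorter off-axis runs are treated as saccades


@dataclass
class Episode:
    start_s: float  # episode start, in seconds
    end_s: float    # episode end, in seconds


def find_aversion_episodes(gaze: List[Tuple[float, float]]) -> List[Episode]:
    """Return contiguous off-axis runs that last at least MIN_AVERSION_S."""
    episodes: List[Episode] = []
    run_start = None
    for i, (yaw, pitch) in enumerate(gaze):
        averted = max(abs(yaw), abs(pitch)) > ON_AXIS_DEG
        if averted and run_start is None:
            run_start = i                     # an off-axis run begins
        elif not averted and run_start is not None:
            if (i - run_start) / FPS >= MIN_AVERSION_S:
                episodes.append(Episode(run_start / FPS, i / FPS))
            run_start = None                  # run ended (kept or discarded)
    # close a run that extends to the end of the track
    if run_start is not None and (len(gaze) - run_start) / FPS >= MIN_AVERSION_S:
        episodes.append(Episode(run_start / FPS, len(gaze) / FPS))
    return episodes


if __name__ == "__main__":
    # 1 s of mutual gaze, 1 s of upward "thinking" aversion, 0.5 s back:
    track = [(0.0, 0.0)] * FPS + [(5.0, 15.0)] * FPS + [(0.0, 0.0)] * (FPS // 2)
    for ep in find_aversion_episodes(track):
        print(f"gaze aversion from {ep.start_s:.2f}s to {ep.end_s:.2f}s")
```

In a full recognizer of the kind the abstract describes, episodes like these would then be classified, e.g. by the direction and dynamics of the off-axis run, to distinguish deliberate "thinking" or "look-away" gestures from ordinary gaze shifts toward on-screen content.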


Published In

ICMI '06: Proceedings of the 8th international conference on Multimodal interfaces
November 2006
404 pages
ISBN: 1-59593-541-X
DOI: 10.1145/1180995
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org

Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

Published: 02 November 2006


Author Tags

  1. aversion gestures
  2. embodied conversational agent
  3. eye gaze tracking
  4. eye gestures
  5. human-computer interaction
  6. turn-taking

Qualifiers

  • Article

Conference

ICMI '06

Acceptance Rates

Overall Acceptance Rate: 453 of 1,080 submissions, 42%

Bibliometrics & Citations

Article Metrics

  • Downloads (Last 12 months): 16
  • Downloads (Last 6 weeks): 2
Reflects downloads up to 09 Feb 2025

Cited By
  • (2024) Predictability of Understanding in Explanatory Interactions Based on Multimodal Cues. Proceedings of the 26th International Conference on Multimodal Interaction, 449–458. https://doi.org/10.1145/3678957.3685741
  • (2024) S3: Speech, Script and Scene driven Head and Eye Animation. ACM Transactions on Graphics, 43(4), 1–12. https://doi.org/10.1145/3658172
  • (2023) Explainable Models for Predicting Interlocutors' Subjective Impressions Based on Functional Head Movement Features. Transactions of the Japanese Society for Artificial Intelligence, 38(3), H-M74_1–13. https://doi.org/10.1527/tjsai.38-3_H-M74
  • (2023) Identifying Interlocutors' Behaviors and its Timings Involved with Impression Formation from Head-Movement Features and Linguistic Features. Proceedings of the 25th International Conference on Multimodal Interaction, 336–344. https://doi.org/10.1145/3577190.3614124
  • (2023) Boundary Conditions for Human Gaze Estimation on A Social Robot using State-of-the-Art Models. 2023 32nd IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), 1486–1493. https://doi.org/10.1109/RO-MAN57019.2023.10309348
  • (2023) Synchronized Colored Petri Net Based Multimodal Modeling and Real-Time Recognition of Conversational Spatial Deictic Gestures. Intelligent Computing, 1227–1246. https://doi.org/10.1007/978-3-031-37963-5_85
  • (2022) HeadSee: Device-free head gesture recognition with commodity RFID. Peer-to-Peer Networking and Applications, 15(3), 1357–1369. https://doi.org/10.1007/s12083-021-01126-1
  • (2022) The Multimodal Performance of Conversational Humor. Online publication date: 31-Mar-2022
  • (2021) Prediction of Interlocutors' Subjective Impressions Based on Functional Head-Movement Features in Group Meetings. Proceedings of the 2021 International Conference on Multimodal Interaction, 352–360. https://doi.org/10.1145/3462244.3479930
  • (2021) Examining the Use of Nonverbal Communication in Virtual Agents. International Journal of Human–Computer Interaction, 37(17), 1648–1673. https://doi.org/10.1080/10447318.2021.1898851
