ABSTRACT
Eye gaze offers several key cues about conversational discourse during face-to-face interaction between people. While a large body of research documents the use of gaze in human-to-human interaction and in animating realistic embodied avatars, recognition of conversational eye gestures (distinct eye movement patterns relevant to discourse) has received less attention. We analyze eye gestures during interaction with an animated embodied agent and propose a non-intrusive, vision-based approach to estimating eye gaze and recognizing eye gestures. In our user study, human participants averted their gaze (i.e., with "look-away" or "thinking" gestures) during periods of cognitive load. Using our approach, an agent can visually differentiate whether a user is thinking about a response or is waiting for the agent or robot to take its turn.
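To make the recognition idea concrete, the abstract's gaze-aversion detection step could be caricatured as segmenting sustained off-target gaze from a per-frame gaze-angle stream. This is a minimal illustrative sketch, not the paper's actual method: the `segment_aversions` function, the `(yaw, pitch)` input format, and both thresholds are hypothetical, and a real system would use a trained tracker and classifier rather than fixed angle cutoffs.

```python
# Hypothetical sketch: flag sustained gaze aversions in a stream of per-frame
# gaze angles (degrees away from the agent's face), assuming an upstream
# vision-based tracker supplies (yaw, pitch) estimates. Thresholds are
# illustrative only.

AVERSION_DEG = 10.0   # gaze further than this from the agent counts as averted
MIN_FRAMES = 5        # aversion must persist this long to filter out noise


def segment_aversions(gaze_angles, threshold=AVERSION_DEG, min_frames=MIN_FRAMES):
    """Return (start, end) frame-index pairs of sustained gaze aversions."""
    segments, start = [], None
    for i, (yaw, pitch) in enumerate(gaze_angles):
        averted = max(abs(yaw), abs(pitch)) > threshold
        if averted and start is None:
            start = i                       # aversion begins
        elif not averted and start is not None:
            if i - start >= min_frames:     # keep only sustained aversions
                segments.append((start, i))
            start = None
    if start is not None and len(gaze_angles) - start >= min_frames:
        segments.append((start, len(gaze_angles)))
    return segments
```

An agent could treat a detected segment that begins shortly after it asks a question as a "thinking" gesture, versus steady mutual gaze as the user waiting for the agent to take its turn.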
Index Terms
- Recognizing gaze aversion gestures in embodied conversational discourse