ABSTRACT
Acknowledging an interruption with a nod of the head is a natural and intuitive communication gesture that can be performed without significantly disturbing a primary interface activity. In this paper we describe vision-based head gesture recognition techniques and their use for common user interface commands. We explore two prototype perceptual interface components that use detected head gestures for dialog box confirmation and document browsing, respectively. Tracking is performed using stereo-based alignment, and recognition proceeds using a trained discriminative classifier. An additional context learning component is described, which exploits interface context to obtain robust performance. User studies with prototype recognition components indicate quantitative and qualitative benefits of gesture-based confirmation over conventional alternatives.
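To make the recognition pipeline concrete, the sketch below is a minimal, illustrative stand-in rather than the paper's implementation: it assumes a hypothetical head tracker that emits per-frame rotation velocities (pitch and yaw), summarizes short windows of those velocities into simple energy features, and trains a perceptron as the discriminative classifier (the paper's tracker is stereo-based and its classifier may differ). All function names and the synthetic data are assumptions for illustration only.

```python
# Illustrative sketch (not the paper's implementation): a tiny
# discriminative classifier for head-nod detection. We assume a head
# tracker emits per-frame rotation velocities (pitch, yaw); windows of
# those velocities are summarized into features and fed to a perceptron.
import math
import random

def window_features(pitch_vel, yaw_vel):
    """Summarize a window of rotation velocities.

    Nods show high pitch-velocity energy; shakes show high yaw energy;
    an idle head shows little of either.
    """
    pitch_energy = sum(v * v for v in pitch_vel) / len(pitch_vel)
    yaw_energy = sum(v * v for v in yaw_vel) / len(yaw_vel)
    return [pitch_energy, yaw_energy, 1.0]  # last entry is a bias term

def train_perceptron(samples, labels, epochs=50, lr=0.1):
    """Train a linear classifier: label +1 = nod, -1 = not a nod."""
    w = [0.0] * 3
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            score = sum(wi * xi for wi, xi in zip(w, x))
            if y * score <= 0:  # misclassified: nudge the boundary
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
    return w

def is_nod(w, pitch_vel, yaw_vel):
    x = window_features(pitch_vel, yaw_vel)
    return sum(wi * xi for wi, xi in zip(w, x)) > 0

# Synthetic training data: nods = oscillating pitch, idle = small noise.
random.seed(0)
def nod_window():
    return ([math.sin(t) for t in range(10)],            # pitch oscillates
            [random.gauss(0, 0.05) for _ in range(10)])  # yaw stays quiet
def idle_window():
    return ([random.gauss(0, 0.05) for _ in range(10)],
            [random.gauss(0, 0.05) for _ in range(10)])

windows = [nod_window() for _ in range(20)] + [idle_window() for _ in range(20)]
labels = [1] * 20 + [-1] * 20
features = [window_features(p, y) for p, y in windows]
w = train_perceptron(features, labels)

p, y = nod_window()
print(is_nod(w, p, y))  # prints True: a nod-like window classifies as a nod
```

The design choice mirrors the abstract's split between tracking and recognition: the classifier sees only pose-velocity features, so any tracker producing rotation velocities could feed it. The paper's separate context-learning component would further gate these decisions using interface state (e.g. whether a dialog box is open).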