DOI: 10.1145/1111449.1111464

Head gesture recognition in intelligent interfaces: the role of context in improving recognition

Published: 29 January 2006

ABSTRACT

Acknowledging an interruption with a nod of the head is a natural and intuitive communication gesture which can be performed without significantly disturbing a primary interface activity. In this paper we describe vision-based head gesture recognition techniques and their use for common user interface commands. We explore two prototype perceptual interface components which use detected head gestures for dialog box confirmation and document browsing, respectively. Tracking is performed using stereo-based alignment, and recognition proceeds using a trained discriminative classifier. An additional context learning component is described, which exploits interface context to obtain robust performance. User studies with prototype recognition components indicate quantitative and qualitative benefits of gesture-based confirmation over conventional alternatives.
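The abstract outlines a two-part pipeline: a discriminative classifier scores head-rotation measurements from the stereo tracker, and an interface-context component biases the decision (e.g., a nod is far more plausible right after a dialog box appears). Below is a minimal Python sketch of that fusion idea, assuming a linear classifier over a window of angular velocities and a single dialog-box-visibility context feature; the function names, weights, and fusion rule are illustrative assumptions, not the authors' released implementation.

```python
import numpy as np

def gesture_score(velocity_window: np.ndarray,
                  weights: np.ndarray,
                  bias: float) -> float:
    """Linear discriminative score over a window of (pitch, yaw)
    angular velocities; positive values favor 'head nod'.
    (Sketch only: the paper trains a discriminative classifier,
    but these weights are stand-ins.)"""
    return float(np.dot(weights, velocity_window.ravel()) + bias)

def contextual_decision(vision_score: float,
                        dialog_box_visible: bool,
                        context_weight: float = 0.5) -> bool:
    """Fuse the vision score with an interface-context prior:
    shift the decision threshold when the interface state makes
    a nod likely. The additive fusion here is an assumption."""
    prior = context_weight if dialog_box_visible else -context_weight
    return vision_score + prior > 0.0

# Hypothetical usage: a 10-frame window of (pitch, yaw) velocities
# from a head tracker, with placeholder trained weights.
window = np.random.randn(10, 2) * 0.1
w = np.random.randn(20) * 0.01
print(contextual_decision(gesture_score(window, w, 0.0),
                          dialog_box_visible=True))
```

The point of the sketch is the division of labor: the vision score depends only on tracked head motion, while interface context only moves the decision boundary, so the same recognizer can serve both the dialog-confirmation and document-browsing components described above.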


Published in

IUI '06: Proceedings of the 11th International Conference on Intelligent User Interfaces
January 2006
392 pages
ISBN: 1595932879
DOI: 10.1145/1111449

            Copyright © 2006 ACM

            Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

            Publisher

            Association for Computing Machinery

            New York, NY, United States

            Publication History

            • Published: 29 January 2006


            Qualifiers

            • Article

            Acceptance Rates

Overall acceptance rate: 746 of 2,811 submissions, 27%
