
A study of manual gesture-based selection for the PEMMI multimodal transport management interface

Published: 04 October 2005
DOI: 10.1145/1088463.1088510

Abstract

Operators of traffic control rooms are often required to respond quickly to critical incidents using a complex array of keyboards, mice, very large screen monitors and other peripheral equipment. To support the aim of finding more natural interfaces for this challenging application, this paper presents PEMMI (Perceptually Effective Multimodal Interface), a transport management system control prototype that takes video-based manual gesture and speech recognition as inputs. A specific theme within this research is determining the optimum strategy for gesture input, in terms of both single-point input selection and suitable multimodal feedback for selection. Users were found to prefer larger selection areas for targets in gesture interfaces, and to select within 44% of a target's selection radius. The minimum effective target size for 'device-free' gesture interfaces was found to be 80 pixels (on a 1280x1024 screen). This paper also shows that feedback on gesture input via large screens is enhanced by the use of both audio and visual cues to guide the user's multimodal input. Audio feedback in particular improved user response time by an average of 20% over existing gesture selection strategies for multimodal tasks.
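As a concrete illustration of the target-size finding, the short Python sketch below implements a circular hit test sized to the reported 80-pixel minimum on a 1280x1024 display. This is a minimal sketch, not code from the paper: the function names (min_target_radius, is_selected) are hypothetical, and the linear scaling to other resolutions is an assumption for illustration only.

import math

# The paper reports a minimum effective target size of 80 pixels for
# 'device-free' gesture selection, measured on a 1280x1024 display.
MIN_TARGET_DIAMETER_PX = 80
REFERENCE_WIDTH_PX = 1280

def min_target_radius(screen_width_px: int) -> float:
    # Scale the reported minimum to other horizontal resolutions.
    # Linear scaling is an assumption for illustration, not a result
    # from the paper.
    return (MIN_TARGET_DIAMETER_PX / 2) * (screen_width_px / REFERENCE_WIDTH_PX)

def is_selected(pointer_xy, target_xy, radius_px) -> bool:
    # Circular hit test: accept the selection when the tracked pointer
    # falls within the target's selection radius. The paper found users
    # tend to select well inside this radius (within about 44% of it).
    dx = pointer_xy[0] - target_xy[0]
    dy = pointer_xy[1] - target_xy[1]
    return math.hypot(dx, dy) <= radius_px

if __name__ == "__main__":
    r = min_target_radius(1280)                    # 40.0 px on the reference screen
    print(is_selected((660, 530), (640, 512), r))  # True: pointer is ~26.9 px from centre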




Published In

ICMI '05: Proceedings of the 7th international conference on Multimodal interfaces
October 2005
344 pages
ISBN:1595930280
DOI:10.1145/1088463

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. manual gesture
  2. multimodal fusion
  3. multimodal interaction
  4. multimodal output generation
  5. speech

Qualifiers

  • Article

Conference

ICMI05

Acceptance Rates

Overall Acceptance Rate 453 of 1,080 submissions, 42%


Cited By

  • (2017) Analysis on the Role of Transport Equipment Standardization in the Development of Multimodal Transport. International Symposium for Intelligent Transportation and Smart City (ITASC) 2017 Proceedings, 139-148. DOI: 10.1007/978-981-10-3575-3_15. Online publication date: 7-Apr-2017.
  • (2016) Effects of auditory, haptic and visual feedback on performing gestures by gaze or by hand. Behaviour & Information Technology, 35(12), 1044-1062. DOI: 10.1080/0144929X.2016.1194477. Online publication date: 1-Dec-2016.
  • (2007) An FPGA-based smart camera for gesture recognition in HCI applications. Proceedings of the 8th Asian Conference on Computer Vision - Volume Part I, 718-727. DOI: 10.5555/1775614.1775699. Online publication date: 18-Nov-2007.
  • (2007) An input-parsing algorithm supporting integration of deictic gesture in natural language interface. Proceedings of the 12th International Conference on Human-Computer Interaction: Intelligent Multimodal Interaction Environments, 206-215. DOI: 10.5555/1769590.1769613. Online publication date: 22-Jul-2007.
  • (2007) Multimodal Human-Machine Interface and User Cognitive Load Measurement. IFAC Proceedings Volumes, 40(16), 200-205. DOI: 10.3182/20070904-3-KR-2922.00035. Online publication date: 2007.
  • (2007) Multimedia Reasoning with Natural Language Support. Proceedings of the International Conference on Semantic Computing, 413-420. DOI: 10.1109/ICSC.2007.61. Online publication date: 17-Sep-2007.
  • (2007) Multimodal user interface for traffic incident management in control room. IET Intelligent Transport Systems, 1(1), 27. DOI: 10.1049/iet-its:20060038. Online publication date: 2007.
  • (2007) An FPGA-Based Smart Camera for Gesture Recognition in HCI Applications. Computer Vision - ACCV 2007, 718-727. DOI: 10.1007/978-3-540-76386-4_68. Online publication date: 2007.
  • (2007) An Input-Parsing Algorithm Supporting Integration of Deictic Gesture in Natural Language Interface. Human-Computer Interaction. HCI Intelligent Multimodal Interaction Environments, 206-215. DOI: 10.1007/978-3-540-73110-8_22. Online publication date: 2007.
  • (2006) QuickFusion. Proceedings of the 2005 NICTA-HCSNet Multimodal User Interaction Workshop - Volume 57, 51-54. DOI: 10.5555/1151804.1151813. Online publication date: 1-Apr-2006.
