DOI: 10.1145/1647314.1647343

Fusion engines for multimodal input: a survey

Published: 02 November 2009

Abstract

Fusion engines are fundamental components of multimodal interactive systems: they interpret input streams whose meaning can vary according to the context, task, user and time. Other surveys have considered multimodal interactive systems as a whole; we focus more closely on the design, specification, construction and evaluation of fusion engines. We first introduce some terminology and set out the major challenges that fusion engines are intended to solve. A history of past work in the field of fusion engines is then presented using the BRETAM model, and these approaches to fusion are classified according to the types of application, the fusion principles and the temporal aspects. Finally, the challenges for future work in the field of fusion engines are set out. These include software frameworks, quantitative evaluation, machine learning and adaptation.
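The survey's central object, a fusion engine that combines complementary input streams under temporal constraints, can be illustrated with a minimal sketch. The code below is not from the paper: the class names, the 500 ms fusion window and the event format are illustrative assumptions, in the spirit of pairing a spoken command with a pointing gesture ("put that there").

```python
from dataclasses import dataclass

# Hypothetical sketch of temporal fusion: events from two modalities are
# fused when their timestamps fall within a common time window.
WINDOW_MS = 500  # assumed fusion window, in milliseconds

@dataclass
class InputEvent:
    modality: str   # e.g. "speech" or "gesture"
    content: str    # recognized token or pointed-at position
    t_ms: int       # timestamp in milliseconds

class FusionEngine:
    def __init__(self, window_ms=WINDOW_MS):
        self.window_ms = window_ms
        self.pending = []  # events awaiting a complementary partner

    def feed(self, event):
        """Return a fused (gesture, speech) pair if one completes, else None."""
        for other in self.pending:
            if (other.modality != event.modality
                    and abs(other.t_ms - event.t_ms) <= self.window_ms):
                self.pending.remove(other)
                # order the pair deterministically by modality name
                pair = sorted([other, event], key=lambda e: e.modality)
                return (pair[0].content, pair[1].content)
        self.pending.append(event)
        return None

engine = FusionEngine()
engine.feed(InputEvent("speech", "put that there", t_ms=100))      # held pending
fused = engine.feed(InputEvent("gesture", "point(420,310)", t_ms=350))
# the gesture arrives 250 ms after the speech, inside the window, so they fuse
```

Real fusion engines surveyed in the paper go well beyond this: they weigh ambiguity, context and user models, not just timestamp proximity.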




Published In

ICMI-MLMI '09: Proceedings of the 2009 International Conference on Multimodal Interfaces
November 2009
374 pages
ISBN:9781605587721
DOI:10.1145/1647314

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. fusion engine
  2. interaction techniques
  3. multimodal interfaces

Qualifiers

  • Research-article

Conference

ICMI-MLMI '09

Acceptance Rates

Overall Acceptance Rate 453 of 1,080 submissions, 42%


Article Metrics

  • Downloads (Last 12 months): 50
  • Downloads (Last 6 weeks): 6
Reflects downloads up to 17 Feb 2025


Cited By

  • (2024) ADAS Alarm Sound Design for Autonomous Vehicles Based on Local Optimization: A Case Study in Shanghai, China. Applied Sciences 14(22), 10733. DOI: 10.3390/app142210733. Online: 20-Nov-2024
  • (2024) Towards a Conceptual Model of Users' Expectations of an Autonomous In-Vehicle Multimodal Experience. Human Behavior and Emerging Technologies 2024, 1-14. DOI: 10.1155/2024/7418597. Online: 14-Mar-2024
  • (2024) Interactive Output Modalities Design for Enhancement of User Trust Experience in Highly Autonomous Driving. International Journal of Human–Computer Interaction, 1-19. DOI: 10.1080/10447318.2024.2375697. Online: 10-Jul-2024
  • (2024) User Interaction Mode Selection and Preferences in Different Driving States of Automotive Intelligent Cockpit. Design, User Experience, and Usability, 262-274. DOI: 10.1007/978-3-031-61353-1_18. Online: 15-Jun-2024
  • (2023) Literature Reviews in HCI: A Review of Reviews. Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, 1-24. DOI: 10.1145/3544548.3581332. Online: 19-Apr-2023
  • (2022) Multimodal Natural Human–Computer Interfaces for Computer-Aided Design: A Review Paper. Applied Sciences 12(13), 6510. DOI: 10.3390/app12136510. Online: 27-Jun-2022
  • (2022) Acoustic-Based Automatic Addressee Detection for Technical Systems: A Review. Frontiers in Computer Science 4. DOI: 10.3389/fcomp.2022.831784. Online: 14-Jul-2022
  • (2022) Enhancing interaction of people with quadriplegia. Proceedings of the 15th International Conference on PErvasive Technologies Related to Assistive Environments, 223-229. DOI: 10.1145/3529190.3529218. Online: 29-Jun-2022
  • (2022) Reducing the Cognitive Load of Playing a Digital Tabletop Game with a Multimodal Interface. Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems, 1-13. DOI: 10.1145/3491102.3502062. Online: 29-Apr-2022
  • (2022) A Case Study on the Rapid Development of Natural and Synergistic Multimodal Interfaces for XR Use-Cases. Extended Abstracts of the 2022 CHI Conference on Human Factors in Computing Systems, 1-8. DOI: 10.1145/3491101.3503552. Online: 27-Apr-2022
