Context based multimodal fusion

Published: 13 October 2004
DOI: 10.1145/1027933.1027977

Abstract

We present a generic approach to multimodal fusion which we call context based multimodal integration. Key to this approach is that every multimodal input event is interpreted and enriched with respect to its local turn context. This local turn context comprises all previously recognized input events and the dialogue state that both belong to the same user turn. We show that a production rule system is an elegant way to handle this context based multimodal integration, and we describe a first implementation of the so-called PATE system. Finally, we present results from a first evaluation of this approach as part of a human-factors experiment with the COMIC system.
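To make the mechanism concrete, the following is a minimal Python sketch of the idea, under stated assumptions: it is not PATE's actual interface (PATE operates on typed feature structures with rule activations, which this sketch approximates with plain dicts), and all names here (InputEvent, TurnContext, resolve_deixis, integrate) are hypothetical. Each incoming event is matched by production rules against the local turn context, i.e. the events recognized earlier in the same turn plus the dialogue state, enriched if a rule fires, and then added to that context.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

@dataclass
class InputEvent:
    modality: str     # e.g. "speech" or "pen"
    content: dict     # recognizer output, e.g. {"act": "select"}
    timestamp: float

@dataclass
class TurnContext:
    """Local turn context: all events recognized so far in the current
    user turn, plus the dialogue state for that turn."""
    events: list[InputEvent] = field(default_factory=list)
    dialogue_state: dict = field(default_factory=dict)

# A production rule fires when its condition matches the new event
# against the turn context; it then returns an enriched event.
Rule = Callable[[InputEvent, TurnContext], Optional[InputEvent]]

def resolve_deixis(event: InputEvent, ctx: TurnContext) -> Optional[InputEvent]:
    """Illustrative rule: bind a deictic speech act ("select THAT") to
    the referent of the most recent pen gesture in the same turn."""
    if event.modality != "speech" or not event.content.get("deictic"):
        return None
    gestures = [e for e in ctx.events if e.modality == "pen"]
    if not gestures:
        return None
    enriched = dict(event.content, referent=gestures[-1].content.get("object"))
    return InputEvent(event.modality, enriched, event.timestamp)

def integrate(event: InputEvent, ctx: TurnContext, rules: list[Rule]) -> InputEvent:
    """Context based integration: enrich the event via the first
    matching rule, then add it to the local turn context."""
    for rule in rules:
        enriched = rule(event, ctx)
        if enriched is not None:
            event = enriched
            break
    ctx.events.append(event)
    return event

if __name__ == "__main__":
    ctx = TurnContext(dialogue_state={"task": "design"})
    integrate(InputEvent("pen", {"object": "tile-17"}, 0.2), ctx, [resolve_deixis])
    cmd = integrate(InputEvent("speech", {"act": "select", "deictic": True}, 0.5),
                    ctx, [resolve_deixis])
    print(cmd.content)  # {'act': 'select', 'deictic': True, 'referent': 'tile-17'}
```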



Published In

ICMI '04: Proceedings of the 6th international conference on Multimodal interfaces
October 2004
368 pages
ISBN: 1581139950
DOI: 10.1145/1027933
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. fusion
  2. multimodal dialogue systems
  3. multimodal integration
  4. speech and pen input


Acceptance Rates

Overall Acceptance Rate: 453 of 1,080 submissions, 42%


