DOI: 10.1145/1647314.1647344
Research article, ICMI-MLMI conference proceedings

A fusion framework for multimodal interactive applications

Published: 02 November 2009

Abstract

This research proposes a multimodal fusion framework for high-level data fusion between two or more modalities. The framework takes as input low-level features extracted from different system devices, then analyses them to identify the intrinsic meanings these data carry. Extracted meanings are compared with one another to detect complementarities, ambiguities, and inconsistencies, so that the user's intention when interacting with the system is better understood. The whole fusion life cycle is described and evaluated in an office-environment scenario in which two co-workers interact by voice and movement, both of which may reveal their intentions. Fusion in this case focuses on combining modalities to capture context and thereby enhance the user experience.
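The comparison of extracted meanings described above can be sketched as a minimal decision rule. This is an illustrative assumption, not the paper's actual implementation: the `Meaning` class, its fields, and the 0.5 confidence threshold are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Meaning:
    modality: str      # e.g. "speech" or "movement" (illustrative labels)
    subject: str       # who performed the action
    action: str        # the intention interpreted from low-level features
    confidence: float  # recognizer confidence in [0, 1]

def fuse(a: Meaning, b: Meaning) -> str:
    """Compare two extracted meanings, mirroring the abstract's idea of
    detecting complementarities, ambiguities and inconsistencies."""
    if a.action == b.action:
        # Both modalities point to the same intention: complementary evidence.
        return "complementary"
    if min(a.confidence, b.confidence) < 0.5:
        # Disagreement with a weakly recognized meaning: intention stays ambiguous.
        return "ambiguous"
    # Confident disagreement: the modalities contradict each other.
    return "inconsistent"

speech = Meaning("speech", "co-worker-1", "open-document", 0.9)
gesture = Meaning("movement", "co-worker-1", "open-document", 0.8)
print(fuse(speech, gesture))  # prints "complementary"
```

In a full system each branch would trigger a different dialogue strategy, for example asking the user for confirmation in the ambiguous case.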

Published In

ICMI-MLMI '09: Proceedings of the 2009 International Conference on Multimodal Interfaces
November 2009, 374 pages
ISBN: 9781605587721
DOI: 10.1145/1647314
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. context-sensitive interaction
  2. multi-modal fusion
  3. speech recognition

Acceptance Rates

Overall Acceptance Rate 453 of 1,080 submissions, 42%

Cited By

  • (2024) Adapting Web Applications for AR/VR using Distributed User Interfaces and Meta-UIs. Procedia Computer Science 246(C), 762–771. doi:10.1016/j.procs.2024.09.495
  • (2018) Mass-Computer Interaction for Thousands of Users and Beyond. Extended Abstracts of the 2018 CHI Conference on Human Factors in Computing Systems, 1–6. doi:10.1145/3170427.3188465
  • (2017) Semantic Entity-Component State Management Techniques to Enhance Software Quality for Multimodal VR-Systems. IEEE Transactions on Visualization and Computer Graphics 23(4), 1342–1351. doi:10.1109/TVCG.2017.2657098
  • (2015) Multimodal Systems: An Excursus of the Main Research Questions. On the Move to Meaningful Internet Systems: OTM 2015 Workshops, 546–558. doi:10.1007/978-3-319-26138-6_59
  • (2014) Review Article. Pattern Recognition Letters 36, 189–195. doi:10.1016/j.patrec.2013.07.003
  • (2010) A survey and analysis of frameworks and framework issues for information fusion applications. Proceedings of the 5th International Conference on Hybrid Artificial Intelligence Systems, Part I, 14–23. doi:10.1007/978-3-642-13769-3_2
