DOI: 10.1145/1719970.1719989

Usage patterns and latent semantic analyses for task goal inference of multimodal user interactions

Published: 07 February 2010

Abstract

This paper describes our work on usage pattern analysis and the development of a latent semantic analysis framework for interpreting multimodal user input consisting of speech and pen gestures. We have designed and collected a multimodal corpus of navigational inquiries. Each modality carries semantics related to the domain-specific task goal, and each inquiry is manually annotated with a task goal based on these semantics. Multimodal input usually has a simpler syntactic structure than unimodal input, and the order of semantic constituents differs between the two; we therefore propose to derive the latent semantics of multimodal inputs using latent semantic modeling (LSM). To achieve this, we parse the recognized Chinese spoken input for spoken locative references (SLRs) and align each SLR with its corresponding pen gesture(s). We then characterize each cross-modal integration pattern as a 3-tuple multimodal term consisting of the SLR, the pen gesture type, and their temporal relation. The inquiry-by-multimodal-term matrix is decomposed using singular value decomposition (SVD) to derive the latent semantics automatically. Task goal inference based on these latent semantics achieves 99% accuracy on a disjoint test set.
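
To make the pipeline above concrete, the following is a minimal sketch of the latent semantic modeling step, assuming each inquiry has already been reduced to a bag of 3-tuple multimodal terms (spoken locative reference, pen gesture type, temporal relation). The term strings, task-goal labels, and the nearest-neighbour inference rule are illustrative assumptions, not the authors' implementation.

    # Minimal LSM sketch (illustrative; not the authors' code).
    # Each inquiry is a bag of 3-tuple multimodal terms encoded as
    # "SLR|gesture_type|temporal_relation" strings (made up for this example).
    import numpy as np

    train_inquiries = [
        (["this_place|POINT|overlap", "here|POINT|pen_precedes"], "SEARCH_NEARBY"),
        (["from_here|POINT|overlap", "to_there|POINT|speech_precedes"], "ROUTE_QUERY"),
        (["this_area|CIRCLE|overlap"], "AREA_SEARCH"),
    ]

    # 1. Build the inquiry-by-multimodal-term count matrix.
    vocab = sorted({t for terms, _ in train_inquiries for t in terms})
    col = {t: i for i, t in enumerate(vocab)}

    def term_vector(terms):
        v = np.zeros(len(vocab))
        for t in terms:
            if t in col:
                v[col[t]] += 1.0
        return v

    X = np.vstack([term_vector(terms) for terms, _ in train_inquiries])

    # 2. Truncated SVD yields the latent semantic space; the training
    #    inquiries are represented by the rows of Uk.
    k = 2
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    Uk, sk, Vtk = U[:, :k], s[:k], Vt[:k, :]

    # 3. Fold a new inquiry into the latent space (q V_k S_k^{-1}) and infer
    #    its task goal from the most similar training inquiry (cosine similarity).
    def infer_goal(terms):
        q = term_vector(terms) @ Vtk.T / sk
        sims = Uk @ q / (np.linalg.norm(Uk, axis=1) * np.linalg.norm(q) + 1e-12)
        return train_inquiries[int(np.argmax(sims))][1]

    print(infer_goal(["from_here|POINT|overlap"]))  # -> "ROUTE_QUERY"

The fold-in step is the standard LSA projection of an unseen inquiry onto the truncated singular vectors; the paper's actual task goal classifier may differ from the simple nearest-neighbour rule used here.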


Cited By

  • (2010) Goal detection from natural language queries. Proceedings of the Natural Language Processing and Information Systems, and 15th International Conference on Applications of Natural Language to Information Systems, pp. 157-168. DOI: 10.5555/1894525.1894547. Online publication date: 23-Jun-2010.
  • (2010) Goal Detection from Natural Language Queries. Natural Language Processing and Information Systems, pp. 157-168. DOI: 10.1007/978-3-642-13881-2_16. Online publication date: 2010.

Published In

IUI '10: Proceedings of the 15th international conference on Intelligent user interfaces
February 2010
460 pages
ISBN:9781605585154
DOI:10.1145/1719970
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 07 February 2010

Author Tags

  1. latent semantic modeling
  2. multimodal input
  3. pen gesture
  4. singular value decomposition
  5. spoken input
  6. task goal inference

Qualifiers

  • Research-article

Conference

IUI '10

Acceptance Rates

Overall Acceptance Rate 746 of 2,811 submissions, 27%
