DOI: 10.1145/1180995.1181039
Article

Prototyping novel collaborative multimodal systems: simulation, data collection and analysis tools for the next decade

Published: 02 November 2006

Abstract

To support research and development of next-generation multimodal interfaces for complex collaborative tasks, a comprehensive new infrastructure has been created for collecting and analyzing time-synchronized audio, video, and pen-based data during multi-party meetings. Such an infrastructure must be unobtrusive; it must capture rich data from multiple information sources with high temporal fidelity, so that simulation-driven studies of natural human-human-computer interaction can be both collected and annotated; and it must be flexibly extensible to facilitate exploratory research. This paper describes both the infrastructure put in place to record, encode, play back, and annotate the meeting-related media data, and the simulation environment used to prototype novel system concepts.



Published In

ICMI '06: Proceedings of the 8th International Conference on Multimodal Interfaces
November 2006, 404 pages
ISBN: 159593541X
DOI: 10.1145/1180995

Publisher

Association for Computing Machinery, New York, NY, United States



Author Tags

  1. annotation tools
  2. data collection infrastructure
  3. meeting
  4. multi-party
  5. multimodal interfaces
  6. simulation studies
  7. synchronized media


Acceptance Rates

Overall acceptance rate: 453 of 1,080 submissions, 42%


Cited By

  • (2022) A Comprehensive Literature Review on Children’s Databases for Machine Learning Applications. IEEE Access, 10: 12262--12285. DOI: 10.1109/ACCESS.2022.3146008
  • (2021) I Know What You Know: What Hand Movements Reveal about Domain Expertise. ACM Transactions on Interactive Intelligent Systems, 11(1): 1--26. DOI: 10.1145/3423049
  • (2019) Dynamic Adaptive Gesturing Predicts Domain Expertise in Mathematics. 2019 International Conference on Multimodal Interaction, 105--113. DOI: 10.1145/3340555.3353726
  • (2018) Dynamic Handwriting Signal Features Predict Domain Expertise. ACM Transactions on Interactive Intelligent Systems, 8(3): 1--21. DOI: 10.1145/3213309
  • (2018) Multimodal learning analytics. The Handbook of Multimodal-Multisensor Interfaces, 331--374. DOI: 10.1145/3107990.3108003
  • (2017) Multimodal speech and pen interfaces. The Handbook of Multimodal-Multisensor Interfaces, 403--447. DOI: 10.1145/3015783.3015795
  • (2016) Optimal Modality Selection for Cooperative Human–Robot Task Completion. IEEE Transactions on Cybernetics, 46(12): 3388--3400. DOI: 10.1109/TCYB.2015.2506985
  • (2015) The Paradigm Shift to Multimodality in Contemporary Computer Interfaces. Synthesis Lectures on Human-Centered Informatics, 8(3): 1--243. DOI: 10.2200/S00636ED1V01Y201503HCI030
  • (2015) Spoken Interruptions Signal Productive Problem Solving and Domain Expertise in Mathematics. Proceedings of the 2015 ACM on International Conference on Multimodal Interaction, 311--318. DOI: 10.1145/2818346.2820743
  • (2015) Tool Design Jam. Proceedings of the 2015 Annual Symposium on Computer-Human Interaction in Play, 827--831. DOI: 10.1145/2793107.2810263
