Article
DOI: 10.1145/1180995.1181049

Toward open-microphone engagement for multiparty interactions

Published: 02 November 2006

Abstract

There is currently considerable interest in developing new open-microphone engagement techniques for speech and multimodal interfaces that perform robustly in complex mobile and multiparty field environments. State-of-the-art audio-visual open-microphone engagement systems aim to eliminate the need for explicit user engagement by processing more implicit cues that a user is addressing the system, which lowers the user's cognitive load. This is an especially important consideration for mobile and educational interfaces, given the higher load imposed by explicit system engagement. In the present research, longitudinal data were collected from six triads of high-school students who engaged in peer tutoring on math problems with the aid of a simulated computer assistant. Results revealed that speech amplitude was 3.25 dB higher when users addressed the computer rather than a human peer when no lexical marker of the intended interlocutor was present, and 2.4 dB higher across all data. These basic results were replicated for both matched and adjacent utterances to computer versus human partners. With respect to dialogue style, speakers did not direct a higher ratio of commands to the computer, although such dialogue differences have been assumed in prior work. These results reveal that amplitude is a powerful cue marking a speaker's intended addressee, which should be leveraged to design more effective microphone engagement during computer-assisted multiparty interactions.
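As a rough illustration of how the amplitude finding could drive open-microphone engagement, the sketch below computes RMS amplitude in dB for 16-bit PCM samples and flags an utterance as computer-directed when it exceeds the speaker's human-directed baseline by roughly the 2.4 dB margin reported above. This is not the paper's implementation: the function names, the full-scale dB reference, and the exact threshold are assumptions made for illustration only.

```python
import math

def rms_db(samples, full_scale=32768.0):
    """RMS amplitude of 16-bit PCM samples, in dB relative to full scale."""
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    # Floor the argument to avoid log10(0) on silent input.
    return 20.0 * math.log10(max(rms, 1e-9) / full_scale)

def likely_computer_directed(utterance_db, human_baseline_db, threshold_db=2.4):
    """Hypothetical decision rule: treat an utterance as system-directed when
    it is at least `threshold_db` louder than the speaker's human-directed
    baseline (threshold inspired by, not prescribed by, the reported 2.4 dB)."""
    return utterance_db - human_baseline_db >= threshold_db
```

In practice such a rule would be one feature among several (lexical markers, gaze, head pose), with the per-speaker baseline estimated from recent human-directed turns.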



Published In

ICMI '06: Proceedings of the 8th international conference on Multimodal interfaces
November 2006, 404 pages
ISBN: 1-59593-541-X
DOI: 10.1145/1180995
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery, New York, NY, United States


Author Tags

  1. collaborative peer tutoring
  2. computer-supported collaborative work
  3. dialogue style
  4. intended addressee
  5. multimodal interaction
  6. open-microphone engagement
  7. spoken amplitude
  8. user communication modeling

Conference

ICMI '06
Overall Acceptance Rate: 453 of 1,080 submissions, 42%


Cited By

  • (2024) The dynamics of human–robot trust attitude and behavior — Exploring the effects of anthropomorphism and type of failure. Computers in Human Behavior, 150 (108008). DOI: 10.1016/j.chb.2023.108008
  • (2021) Prosodic Differences in Human- and Alexa-Directed Speech, but Similar Local Intelligibility Adjustments. Frontiers in Communication, 6. DOI: 10.3389/fcomm.2021.675704
  • (2016) Optimal Modality Selection for Cooperative Human–Robot Task Completion. IEEE Transactions on Cybernetics, 46(12), 3388-3400. DOI: 10.1109/TCYB.2015.2506985
  • (2015) Spoken Interruptions Signal Productive Problem Solving and Domain Expertise in Mathematics. Proceedings of the 2015 ACM International Conference on Multimodal Interaction, 311-318. DOI: 10.1145/2818346.2820743
  • (2014) Written Activity, Representations and Fluency as Predictors of Domain Expertise in Mathematics. Proceedings of the 16th International Conference on Multimodal Interaction, 10-17. DOI: 10.1145/2663204.2663245
  • (2013) Identifying the Addressee using Head Orientation and Speech Information in Multiparty Human-Agent Conversations. Transactions of the Japanese Society for Artificial Intelligence, 28(2), 149-159. DOI: 10.1527/tjsai.28.149
  • (2013) Written and multimodal representations as predictors of expertise and problem-solving success in mathematics. Proceedings of the 15th ACM International Conference on Multimodal Interaction, 599-606. DOI: 10.1145/2522848.2533793
  • (2013) Problem solving, domain expertise and learning. Proceedings of the 15th ACM International Conference on Multimodal Interaction, 569-574. DOI: 10.1145/2522848.2533791
  • (2013) Multimodal learning analytics. Proceedings of the 15th ACM International Conference on Multimodal Interaction, 563-568. DOI: 10.1145/2522848.2533790
  • (2013) Implementation and evaluation of a multimodal addressee identification mechanism for multiparty conversation systems. Proceedings of the 15th ACM International Conference on Multimodal Interaction, 35-42. DOI: 10.1145/2522848.2522872
