Article

Using model trees for evaluating dialog error conditions based on acoustic information

Authors:

Abe Kazemzadeh,

Shrikanth Narayanan

HCM '06: Proceedings of the 1st ACM international workshop on Human-centered multimedia

Pages 109 - 114

https://doi.org/10.1145/1178745.1178763

Published: 27 October 2006

Abstract

This paper examines the use of model trees to evaluate user utterances as responses to system error in dialogs from the Communicator 2000 corpus. The features used by the model trees are limited to those that can be obtained automatically through acoustic measurements, specifically features derived from pitch and energy. The curve of model tree output versus dialog turn is interpreted as a measure of the user's level of activation in the dialog. We test the premise that user response to error at the utterance level is related to user satisfaction at the dialog level. We investigate evaluation tasks at two levels: at the utterance level, we apply the model tree output to detecting response to error; at the dialog level, we analyze how the model tree output relates to estimates of user satisfaction. For the former, we achieve 65% precision and 63% recall; for the latter, our predictions show a significant correlation of 0.48 with user surveys.
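The two evaluation measures named in the abstract can be illustrated with a short sketch. The Python below is not the authors' code. It assumes that a model tree has already produced one real-valued score per utterance, and it shows how precision and recall could be computed by thresholding those scores, and how a Pearson correlation could be computed between per-dialog scores and survey ratings. The function names, the threshold, and the toy numbers are all hypothetical.

```python
from math import sqrt


def precision_recall(scores, labels, threshold):
    """Threshold real-valued scores and compare them with binary labels.

    scores: hypothetical per-utterance model tree outputs.
    labels: 1 if the utterance is a response to error, else 0.
    """
    predicted = [s >= threshold for s in scores]
    tp = sum(1 for p, y in zip(predicted, labels) if p and y)
    fp = sum(1 for p, y in zip(predicted, labels) if p and not y)
    fn = sum(1 for p, y in zip(predicted, labels) if not p and y)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined for constant input; 0.0 is a convention here
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Toy, made-up utterance-level data (not from the Communicator corpus).
    utterance_scores = [0.9, 0.2, 0.7, 0.4, 0.8, 0.1]
    error_response = [1, 0, 1, 1, 0, 0]
    p, r = precision_recall(utterance_scores, error_response, threshold=0.5)
    print(f"precision={p:.2f} recall={r:.2f}")

    # Toy, made-up dialog-level data: one aggregate score and one survey rating per dialog.
    dialog_scores = [0.2, 0.5, 0.8, 0.4]
    survey_ratings = [4.0, 3.0, 1.0, 3.5]
    print(f"correlation={pearson(dialog_scores, survey_ratings):.2f}")
```

How per-utterance scores are aggregated into a per-dialog value, and how the threshold is chosen, are left open here because the abstract does not specify them.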


Cited By

A. Kazemzadeh (2011). Toward a computational approach for natural language description of emotions. Proceedings of the 4th International Conference on Affective Computing and Intelligent Interaction, Volume Part II, pp. 216-223. Online publication date: 9 October 2011. https://dl.acm.org/doi/10.5555/2062850.2062875

A. Kazemzadeh (2011). Toward a Computational Approach for Natural Language Description of Emotions. Affective Computing and Intelligent Interaction, pp. 216-223. Online publication date: 2011. https://doi.org/10.1007/978-3-642-24571-8_23


Published In

HCM '06: Proceedings of the 1st ACM international workshop on Human-centered multimedia

October 2006

138 pages

ISBN: 1595935002

DOI: 10.1145/1178745

General Chairs:
Daniel Gatica-Perez (IDIAP Research Institute, Switzerland),
Alejandro Jaimes (FXPAL Japan, Fuji Xerox, Co., Ltd., Japan),
Nicu Sebe (University of Amsterdam, The Netherlands)

Copyright © 2006 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States


Conference

MM06

Sponsor:

MM06: The 14th ACM International Conference on Multimedia 2006

October 27, 2006

Santa Barbara, California, USA
