tutorial

Speech-based Interaction: Myths, Challenges, and Opportunities

Authors:
Cosmin Munteanu, University of Toronto Mississauga, Mississauga, ON, Canada
Gerald Penn, University of Toronto, Toronto, ON, Canada

IUI '15: Proceedings of the 20th International Conference on Intelligent User Interfaces, March 2015, Pages 437–438. https://doi.org/10.1145/2678025.2716263

Published: 18 March 2015

ABSTRACT

HCI research has long been dedicated to facilitating information transfer between humans and machines better and more naturally. Unfortunately, humans' most natural form of communication, speech, is also one of the most difficult modalities for machines to understand, despite being, and perhaps because it is, the highest-bandwidth communication channel we possess. While significant research effort, from engineering to linguistics to the cognitive sciences, has been spent on improving machines' ability to understand speech, the HCI community has been relatively timid in embracing this modality as a central focus of research. This can be attributed in part to the relatively discouraging levels of accuracy in understanding speech, in contrast with often-unfounded claims of success from industry, but also to the intrinsic difficulty of designing and especially evaluating speech and natural language interfaces.

The goal of this course is to inform the IUI community of the current state of speech and natural language research, to dispel some of the myths surrounding speech-based interaction, and to provide an opportunity for researchers and practitioners to learn more about how speech recognition and speech synthesis work, what their limitations are, and how they could be used to enhance current interaction paradigms. Through this, we hope that IUI researchers and general HCI, UI, and UX practitioners will learn how to combine recent advances in speech processing with user-centred principles in designing more usable and useful speech-based interactive systems.

Published in
IUI '15: Proceedings of the 20th International Conference on Intelligent User Interfaces
March 2015
480 pages
ISBN: 9781450333061
DOI: 10.1145/2678025
General Chairs:
Oliver Brdiczka, Vectra Networks, Inc.
Polo Chau, Georgia Tech
Program Chairs:
Giuseppe Carenini, University of British Columbia
Shimei Pan, University of Maryland
Per Ola Kristensson, University of Cambridge
Copyright © 2015 Owner/Author
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
automatic speech recognition
multimodal interfaces
speech synthesis
speech-based interaction
text-to-speech
Qualifiers
- tutorial
Acceptance Rates
IUI '15 Paper Acceptance Rate: 47 of 205 submissions, 23%
Overall Acceptance Rate: 746 of 2,811 submissions, 27%