DOI: 10.1145/2678025.2716263
tutorial

Speech-based Interaction: Myths, Challenges, and Opportunities

Published: 18 March 2015

ABSTRACT

HCI research has long been dedicated to facilitating information transfer between humans and machines in better and more natural ways. Unfortunately, humans' most natural form of communication, speech, is also one of the most difficult modalities for machines to understand, despite, and perhaps because of, being the highest-bandwidth communication channel we possess. While significant research efforts, from engineering to linguistics to the cognitive sciences, have been spent on improving machines' ability to understand speech, the HCI community has been relatively timid in embracing this modality as a central focus of research. This can be attributed in part to the relatively discouraging levels of accuracy in understanding speech, in contrast with often-unfounded claims of success from industry, but also to the intrinsic difficulty of designing and especially evaluating speech and natural language interfaces.

The goal of this course is to inform the IUI community of the current state of speech and natural language research, to dispel some of the myths surrounding speech-based interaction, and to provide an opportunity for researchers and practitioners to learn more about how speech recognition and speech synthesis work, what their limitations are, and how they could be used to enhance current interaction paradigms. Through this, we hope that IUI researchers and general HCI, UI, and UX practitioners will learn how to combine recent advances in speech processing with user-centred principles in designing more usable and useful speech-based interactive systems.


          • Published in

            IUI '15: Proceedings of the 20th International Conference on Intelligent User Interfaces
            March 2015
            480 pages
            ISBN:9781450333061
            DOI:10.1145/2678025

            Copyright © 2015 Owner/Author

            Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

            Publisher

            Association for Computing Machinery

            New York, NY, United States




            Acceptance Rates

            IUI '15 Paper Acceptance Rate: 47 of 205 submissions, 23%
            Overall Acceptance Rate: 746 of 2,811 submissions, 27%
