
Towards efficient human-machine speech communication: The Speech Graffiti project

Published: 01 February 2005

Abstract

This research investigates the design and performance of the Speech Graffiti interface for spoken interaction with simple machines. Speech Graffiti is a standardized interface designed to address issues inherent in the current state-of-the-art in spoken dialog systems such as high word-error rates and the difficulty of developing natural language systems. This article describes the general characteristics of Speech Graffiti, provides examples of its use, and describes other aspects of the system such as the development toolkit. We also present results from a user study comparing Speech Graffiti with a natural language dialog system. These results show that users rated Speech Graffiti significantly better in several assessment categories. Participants completed approximately the same number of tasks with both systems, and although Speech Graffiti users often took more turns to complete tasks than natural language interface users, they completed tasks in slightly less time.
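The abstract describes Speech Graffiti as a standardized, keyword-based alternative to unrestricted natural language dialog. As a rough illustration of how such a restricted interface might be interpreted, the sketch below assumes comma-separated phrases, a small set of universal keywords, and a "&lt;slot&gt; is &lt;value&gt;" convention; these particulars are illustrative assumptions, not the project's actual specification.

```python
# Hypothetical sketch of a standardized slot-value command parser in the
# spirit of Speech Graffiti. The keyword set and phrase grammar here are
# assumptions made for illustration only.

UNIVERSAL_KEYWORDS = {"options", "repeat", "start over", "where was i", "goodbye"}

def parse_utterance(utterance: str) -> dict:
    """Split an utterance into universal keywords and slot-value pairs.

    Phrases are separated by commas; a slot-value pair uses the assumed
    form '<slot> is <value>'. Unrecognized phrases are ignored here,
    where a real system would instead trigger a help prompt.
    """
    result = {"keywords": [], "slots": {}}
    for phrase in (p.strip().lower() for p in utterance.split(",")):
        if not phrase:
            continue
        if phrase in UNIVERSAL_KEYWORDS:
            result["keywords"].append(phrase)
        elif " is " in phrase:
            slot, value = phrase.split(" is ", 1)
            result["slots"][slot.strip()] = value.strip()
    return result

print(parse_utterance("theater is the manor, title is vertigo, options"))
```

Because every phrase must match either a universal keyword or the fixed slot-value pattern, a recognizer for such an interface can use a small, closed grammar, which is one plausible route to the lower word-error rates the abstract contrasts with natural language systems.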



Published In

ACM Transactions on Speech and Language Processing, Volume 2, Issue 1 (February 2005), 101 pages.
ISSN: 1550-4875; EISSN: 1550-4883; DOI: 10.1145/1075389

Publisher

Association for Computing Machinery, New York, NY, United States


Author Tags

  1. Human-computer interaction
  2. Speech recognition
  3. Spoken dialog systems



Cited By

  • (2024) A Fusion of EMG and IMU for an Augmentative Speech Detection and Recognition System. IEEE Access 12, 14027--14039. DOI: 10.1109/ACCESS.2024.3356597
  • (2023) Conversational agents on smartphones and the web. In Digital Therapeutics for Mental Health and Addiction, 99--112. DOI: 10.1016/B978-0-323-90045-4.00010-1
  • (2020) Taylor-AMS features and deep convolutional neural network for converting nonaudible murmur to normal speech. Computational Intelligence 36, 3, 940--963. DOI: 10.1111/coin.12281
  • (2019) Jointly Learning of Visual and Auditory: A New Approach for RS Image and Audio Cross-Modal Retrieval. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 12, 11, 4644--4654. DOI: 10.1109/JSTARS.2019.2949220
  • (2018) Safety First: Conversational Agents for Health Care. In Studies in Conversational UX Design, 33--57. DOI: 10.1007/978-3-319-95579-7_3
  • (2016) Is Spoken Language All-or-Nothing? Implications for Future Speech-Based Human-Machine Interaction. In Dialogues with Social Robots, 281--291. DOI: 10.1007/978-981-10-2585-3_22
  • (2014) Towards a user experience design framework for adaptive spoken dialogue in automotive contexts. In Proceedings of the 19th International Conference on Intelligent User Interfaces, 305--310. DOI: 10.1145/2557500.2557506
  • (2014) Low latency parameter generation for real-time speech synthesis system. In 2014 IEEE International Conference on Multimedia and Expo (ICME), 1--6. DOI: 10.1109/ICME.2014.6890197
  • (2010) Selecting Help Messages by Using Robust Grammar Verification for Handling Out-of-Grammar Utterances in Spoken Dialogue Systems. IEICE Transactions on Information and Systems E93-D, 12, 3359--3367. DOI: 10.1587/transinf.E93.D.3359
  • (2010) A unit selection text-to-speech synthesis system optimized for use with screen readers. IEEE Transactions on Consumer Electronics 56, 3, 1890--1897. DOI: 10.1109/TCE.2010.5606343
