DOI: 10.1145/1357054.1357117

Collaborative editing for improved usefulness and usability of transcript-enhanced webcasts

Published: 06 April 2008

Abstract

One challenge in facilitating skimming or browsing through archives of online webcast lecture recordings is the lack of text transcripts of the recorded lectures. Ideally, transcripts would be obtained through Automatic Speech Recognition (ASR). However, in realistic lecture conditions, current ASR systems can only deliver a Word Error Rate (WER) of around 45%, above the accepted threshold of 25%. In this paper, we present the iterative design of a webcast extension that engages users in collaboratively editing the imperfect ASR-produced transcripts in a wiki-like manner, and show that this is a feasible solution for improving the quality of lecture transcripts. We also present the findings of a field study, carried out in a real lecture environment, investigating how students use and edit the transcripts.
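
For readers unfamiliar with the metric, Word Error Rate (WER) is conventionally defined as the number of word substitutions, deletions, and insertions needed to turn the recognizer output into the reference transcript, divided by the number of words in the reference. The sketch below illustrates that standard definition only; it is not the evaluation code used in the paper. The function name `word_error_rate`, the constant `ACCEPTED_WER_THRESHOLD`, and the sample sentences are assumptions made for illustration.

```python
# Illustrative sketch of the standard WER definition; not the paper's code.
ACCEPTED_WER_THRESHOLD = 0.25  # the 25% threshold quoted in the abstract


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] is the word-level edit distance between ref[:i-1] and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical reference and ASR output, made up for this example.
    wer = word_error_rate("a b c d", "a x c")
    print(f"WER = {wer:.2f}, above threshold: {wer > ACCEPTED_WER_THRESHOLD}")
```

Under this definition, a transcript edited by users can be rescored against a manually produced reference to check whether its WER has dropped below the threshold.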

      Published In

      CHI '08: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems
      April 2008
      1870 pages
      ISBN: 9781605580111
      DOI: 10.1145/1357054

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. automatic speech recognition
      2. field study
      3. navigational tools
      4. text transcripts
      5. webcasting
      6. wiki

      Qualifiers

      • Research-article

      Conference

      CHI '08

      Acceptance Rates

      CHI '08 Paper Acceptance Rate 157 of 714 submissions, 22%;
      Overall Acceptance Rate 6,199 of 26,314 submissions, 24%

      Cited By

      • (2024) Record, Transcribe, Share: An Accessible Open-Source Video Platform for Deaf and Hard of Hearing Viewers. Proceedings of the 26th International ACM SIGACCESS Conference on Computers and Accessibility, pp. 1-6. DOI: 10.1145/3663548.3688495. Online publication date: 27-Oct-2024
      • (2023) Accuracy of AI-generated Captions With Collaborative Manual Corrections in Real-Time. Extended Abstracts of the 2023 CHI Conference on Human Factors in Computing Systems, pp. 1-7. DOI: 10.1145/3544549.3585724. Online publication date: 19-Apr-2023
      • (2023) Automatic prediction of rejected edits in Stack Overflow. Empirical Software Engineering, 28:1. DOI: 10.1007/s10664-022-10242-2. Online publication date: 1-Jan-2023
      • (2021) Rollback Edit Inconsistencies in Developer Forum. 2021 IEEE/ACM 18th International Conference on Mining Software Repositories (MSR), pp. 380-391. DOI: 10.1109/MSR52588.2021.00050. Online publication date: May-2021
      • (2020) How Do Users Revise Answers on Technical Q&A Websites? A Case Study on Stack Overflow. IEEE Transactions on Software Engineering, 46:9, pp. 1024-1038. DOI: 10.1109/TSE.2018.2874470. Online publication date: 1-Sep-2020
      • (2017) Leveraging Complementary Contributions of Different Workers for Efficient Crowdsourcing of Video Captions. Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems, pp. 4617-4626. DOI: 10.1145/3025453.3026032. Online publication date: 2-May-2017
      • (2015) Speech-based Interaction. Proceedings of the 20th International Conference on Intelligent User Interfaces, pp. 437-438. DOI: 10.1145/2678025.2716263. Online publication date: 18-Mar-2015
      • (2015) A framework of human-based speech transcription with a speech chunking front-end. 2015 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA), pp. 125-128. DOI: 10.1109/APSIPA.2015.7415486. Online publication date: Dec-2015
      • (2015) Efficiency and usability study of innovative computer-aided transcription strategies for video lecture repositories. Speech Communication, 74:C, pp. 65-75. DOI: 10.1016/j.specom.2015.09.006. Online publication date: 1-Nov-2015
      • (2014) An empirical simulation-based study of real-time speech translation for multilingual global project teams. Proceedings of the 8th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, pp. 1-9. DOI: 10.1145/2652524.2652537. Online publication date: 18-Sep-2014
