DOI: 10.1145/1873951.1873986
research-article

TalkMiner: a lecture webcast search engine

Published: 25 October 2010

Abstract

This paper describes the design and implementation of a search engine for lecture webcasts. A searchable text index lets users locate material within lecture videos found on a variety of websites, such as YouTube and Berkeley webcasts. The index is built from the words on the presentation slides that appear in the video, along with any associated metadata, such as the title and abstract, when available.
The video is analyzed to identify a set of distinct slide images. OCR and lexical processing are then applied to these images to generate a list of indexable terms.
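The step from OCR text to indexable terms can be illustrated with a minimal sketch. It assumes the OCR output for each distinct slide is already available as a plain string. The stop-word list, the minimum term length, and the function names `extract_terms`, `build_index`, and `search` are hypothetical choices made for illustration, not the lexical rules or index used by the system described here.

```python
import re
from collections import defaultdict

# Hypothetical stop-word list and minimum term length; the abstract does not
# specify the lexical rules actually applied.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "for", "is", "on"}
MIN_TERM_LENGTH = 2


def extract_terms(ocr_text):
    """Lowercase OCR output, split on non-alphanumerics, drop noise tokens."""
    tokens = re.findall(r"[a-z0-9]+", ocr_text.lower())
    return [t for t in tokens if len(t) >= MIN_TERM_LENGTH and t not in STOP_WORDS]


def build_index(slides):
    """Map each term to the sorted list of slide numbers whose text contains it.

    `slides` is a list of OCR text strings, one per distinct slide image.
    """
    index = defaultdict(set)
    for slide_number, text in enumerate(slides):
        for term in extract_terms(text):
            index[term].add(slide_number)
    return {term: sorted(numbers) for term, numbers in index.items()}


def search(index, query):
    """Return the slide numbers that contain every term in the query."""
    terms = extract_terms(query)
    if not terms:
        return []
    hits = set(index.get(terms[0], []))
    for term in terms[1:]:
        hits &= set(index.get(term, []))
    return sorted(hits)
```

Because each term maps to slide numbers rather than to whole videos, a query hit can point to a position inside a lecture, which matches the stated goal of locating material within a video.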
Several problems were discovered when trying to identify distinct slides in the video stream. For example, picture-in-picture compositing of a speaker and a presentation slide, switching cameras, and slide builds confuse basic frame-differencing algorithms for extracting keyframe slide images. Algorithms are described that improve slide identification.
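For context, the kind of basic frame differencing that these cases confuse can be sketched as follows. This is a baseline under stated assumptions, not the improved algorithms of the paper: frames are modeled as small grids of grayscale values (lists of lists of integers), and the function names and both thresholds are hypothetical.

```python
def changed_fraction(frame_a, frame_b, pixel_threshold=30):
    """Fraction of pixels whose grayscale value differs by more than the threshold."""
    total = 0
    changed = 0
    for row_a, row_b in zip(frame_a, frame_b):
        for a, b in zip(row_a, row_b):
            total += 1
            if abs(a - b) > pixel_threshold:
                changed += 1
    return changed / total if total else 0.0


def detect_keyframes(frames, change_threshold=0.1):
    """Return the indices of frames that start a new segment.

    Frame 0 is always a keyframe. A later frame is a keyframe when it differs
    from the previous frame in more than `change_threshold` of its pixels.
    """
    keyframes = [0] if frames else []
    for i in range(1, len(frames)):
        if changed_fraction(frames[i - 1], frames[i]) > change_threshold:
            keyframes.append(i)
    return keyframes
```

A baseline like this has no notion of what a slide is. A cut to a camera shot of the speaker changes nearly every pixel and is reported as a keyframe, while a slide build that adds a small amount of text can stay below the threshold and go unreported.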
A prototype system was built to test the algorithms and the utility of the search engine. Users can browse lists of lectures, browse the slides in a specific lecture, or play the lecture video. Over 10,000 lecture videos have been indexed from a variety of sources. A public website will be published in mid-2010 that allows users to experiment with the search engine.


Published In

MM '10: Proceedings of the 18th ACM international conference on Multimedia
October 2010
1836 pages
ISBN: 9781605589336
DOI: 10.1145/1873951
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. search
  2. video

Qualifiers

  • Research-article

Conference

MM '10: ACM Multimedia Conference
October 25 - 29, 2010
Firenze, Italy

Acceptance Rates

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
