Article

A multilingual, multimodal digital video library system

Authors: Michael R. Lyu, Edward Yau, Sam Sze (all The Chinese University of Hong Kong)

JCDL '02: Proceedings of the 2nd ACM/IEEE-CS joint conference on Digital libraries, July 2002, Pages 145–153
https://doi.org/10.1145/544220.544248

Published: 14 July 2002

ABSTRACT

This paper presents the iVIEW system, a multi-lingual, multi-modal digital video content management system for intelligent searching and access of English and Chinese video content. iVIEW supports full-content indexing, searching, and retrieval of multi-lingual text, audio, and video material. It combines image processing techniques for analyzing scenes and scene changes, speech processing techniques for transcribing audio signals, and multi-lingual natural language processing techniques for determining word relevance. iVIEW can host multi-lingual content and supports multi-modal search. It enables content developers to perform multi-modal information processing of rich video media and to construct XML-based multimedia representations that enhance multi-modal indexing and searching, so that end users can view multimedia content delivered flexibly and seamlessly across various browsing tools and devices.

Published in
JCDL '02: Proceedings of the 2nd ACM/IEEE-CS joint conference on Digital libraries
July 2002
448 pages
ISBN: 1581135130
DOI: 10.1145/544220
General Chair: William Hersh, Oregon Health & Science University
Program Chair: Gary Marchionini, University of North Carolina at Chapel Hill
Copyright © 2002 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher: Association for Computing Machinery, New York, NY, United States
Author Tags
applications
browser on mobile devices
middleware and browser interactions
multi-modal interactions
multimedia management and support
Acceptance Rates
JCDL '02 paper acceptance rate: 69 of 240 submissions (29%)
Overall acceptance rate: 415 of 1,482 submissions (28%)