skip to main content
10.1145/544220.544248acmconferencesArticle/Chapter ViewAbstractPublication PagesjcdlConference Proceedingsconference-collections
Article

A multilingual, multimodal digital video library system

Published:14 July 2002Publication History

ABSTRACT

This paper presents the iVIEW system, a multi-lingual, multi-modal digital video content management system for intelligent searching and access of English and Chinese video contents. iVIEW allows full content indexing, searching and retrieval of multi-lingual text, audio and video material. It consists image processing techniques for scenes and scene changes analyses, speech processing techniques for audio signal transcriptions, and multi-lingual natural language processing techniques for word relevance determination. iVIEW can host multi-lingual contents and allow multi-modal search. It facilitate content developers to perform multi-modal information processing of rich video media and to construct XML-based multimedia representation in enhancing multi-modal indexing and searching capabilities, so that the end users can enjoy viewing flexible and seamless delivery of multimedia contents in various browsing tools and devices.

References

  1. M. Christel, H.D. Wactlar, S. Stevens, R. Reddy, M. Mauldin, and T. Kanade, "Techniques for the Creation and Exploration of Digital Video Libraries" Multimedia Tools and Applications (Volume 2), Borko Furht, editor. Boston, MA: Kluwer Academic Publishers, 1996Google ScholarGoogle Scholar
  2. L.R. Rabiner, "A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition", Proc. IEEE, pp. 257--286, February 1989Google ScholarGoogle Scholar
  3. P.C. Woodland, T. Hain, S.E. Johnson, T.R. Niesler, A. Tuerk, and S.J. Young, "Experiments in broadcast news transcription", Proc. IEEE, pp. 909--912, vol. 2, May 1998Google ScholarGoogle ScholarCross RefCross Ref
  4. H.M. Meng, P.Y. Hui, "Spoken document retrieval for the languages of Hong Kong", Proceedings of 2001 International Symposium on Multimedia Processing, pp. 201--204, May 2001Google ScholarGoogle ScholarCross RefCross Ref
  5. D.M. Lovekin, R.E. Yantorno, K.R. Krishnamachari, "Developing usable speech criteria for speaker identification technology", Proc. IEEE, pp. 421--424, vol. 1, May 2001Google ScholarGoogle ScholarCross RefCross Ref
  6. S. Mori, C.Y. Suen, and K. Yamamoto, "Historical review of OCR research and development," Proceedings of the IEEE , Volume: 80, Issue: 7, July 1992, Page(s): 1029--1058Google ScholarGoogle ScholarCross RefCross Ref
  7. G. Nagy, "At the frontiers of OCR," Proceedings of the IEEE, Volume: 80, Issue: 7 , July 1992, Page(s): 1093--1100Google ScholarGoogle ScholarCross RefCross Ref
  8. Jaehwa Park, V. Govindaraju, and S.N. Srihari, "OCR in a hierarchical feature space," IEEE Transactions on Pattern Analysis and Machine Intelligence, Volume: 22 Issue: 4, April 2000, Page(s): 400--407 Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Yihong Xu and G. Nagy, "Prototype extraction and adaptive OCR," IEEE Transactions on Pattern Analysis and Machine Intelligence, Volume: 21 Issue: 12 , Dec. 1999, Page(s): 1280--1296 Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. M.D. Ganis, C.L. Wilson, and J.L. Blue, "Neural network-based systems for handprint OCR applications," IEEE Transactions on Image Processing, Volume: 7 Issue: 8 , Aug. 1998, Page(s): 1097--1112 Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Huiping Li, D. Doermann, and O. Kia, "Automatic text detection and tracking in digital video" IEEE Transactions on Image Processing, Volume: 9 Issue: 1 , Jan. 2000, Page(s): 147--156 Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. V. Wu, R. Manmatha, and E.M. Riseman, "Textfinder: an automatic system to detect and recognize text in images," IEEE Transactions on Pattern Analysis and Machine Intelligence, Volume: 21 Issue: 11 , Nov. 1999, Page(s): 1224--1229 Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. U. Gargi, D. Crandall, S. Antani, T. Gandhi, R. Keener, and R. Kasturi, "A system for automatic text detection in video," Proceedings of the Fifth International Conference on Document Analysis and Recognition (ICDAR '99) , 1999, Page(s): 29--32 Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. C. Garcia, and X. Apostolidis, "Text detection and segmentation in complex color images," Proceedings 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP '00, Volume: 4 , 2000, Page(s): 2326--2329 Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. L. Agnihotri, and N. Dimitrova, "Text detection for video analysis," Proceedings 1999 IEEE Workshop on Content-Based Access of Image and Video Libraries (CBAIVL '99), Page(s): 109--113 Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. M. A. Turk and A. P. Pentland, "Face Recognition Using Eigenfaces," Proceedings IEEE Computer Society Conf. Computer Vision and Pattern Recognition, Maui, Hawaii, 1991, pp. 586--591Google ScholarGoogle Scholar
  17. Tai Sing Lee, "Image Representation Using 2D Gabor Wavelets" IEEE Transactions on pattern analysis and machine intelligence, Vol. 18, No. 10, October 1996 Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Laurenz Wiskott, Jean-Marc Fellous, Norbert Krüger, and Christoph von der Malsburg, "Face Recognition by Elastic Bunch Graph Matching," IEEE Transactions on pattern analysis and machine intelligence, Vol. 19, No.7 July 1997 Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. {Belhumeur97} Peter N. Belhumeur, Joao P. Hespanha, and David J. Kriegman. "Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection," IEEE Transactions on pattern analysis and machine intelligence, VOL. 19, NO. 7, JULY 1997 Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. Jun Zhang, Yong Yan, and Martin Lades, "Face Recognition: Eigenface, Elastic Matching, and Neural Nets," Proceedings of the IEEE, Vol. 85, No. 9, September 1997Google ScholarGoogle Scholar
  21. M.N. Wallick, N. da Vitoria Lobo, M. Shah, "A system for placing videotaped and digital lectures on-line", Proceedings of 2001 International Symposium on Multimedia Processing, pp. 461--464, May 2001Google ScholarGoogle Scholar
  22. M. Viswanathan, H.S.M. Beigi, A. Tritschler, F. Maali,"Information access using speech, speaker and face recognition", IEEE International Conference on Multimedia and Expo, pp. 493--496, vol. 1, July-August 2000, New YorkGoogle ScholarGoogle ScholarCross RefCross Ref
  23. R. Houghton, "Named Faces: putting names to faces," IEEE Intelligent Systems, Volume: 14 Issue: 5 , Sept.-Oct. 1999, Page(s): 45--50 Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. H.D. Wactlar, T. Kanade, M.A. Smith, S.M. Stevens. "Intelligent Access to Digital Video: Informedia Project," IEEE Computer, volume 29, issue 5, pp. 46--52, May 1996 Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. M. Christel, A. Warmack, A. Hauptmann, and S. Crosby, "Adjustable Filmstrips and Skims as Abstractions for a Digital Video Library," IEEE Advances in Digital Libraries Conference 1999, Baltimore, MD. pp. 98--104, May 19--21, 1999 Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. M. Christel, A. Olligschlaeger, and C. Hung, "Interactive Maps for a Digital Video Library," IEEE Multimedia 7(1), pp. 60--67, 2000 Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. M. Christel, B. Maher, and A. Begun, "XSLT for Tailored Access to a Digital Video Library," Joint Conference on Digital Libraries (JCDL '01), Roanoke, VA, pp.290--299, June 24--28, 2001 Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. W. H. Cheung, M. R. Lyu, and K.W. Ng, "Integrating Digital Libraries by CORBA, XML and Servlet," Proceedings First ACM/IEEE-CS Joint Conference on Digital Libraries, Roanoke, Virginia, June 24--28 2001, pp.472 Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. H.D. Wactlar, "Informedia - Search and Summarization in the Video Medium," Imagina 2000 Conference, Monaco, January 31--February 2, 2000Google ScholarGoogle Scholar

Index Terms

  1. A multilingual, multimodal digital video library system

        Recommendations

        Comments

        Login options

        Check if you have access through your login credentials or your institution to get full access on this article.

        Sign in
        • Published in

          cover image ACM Conferences
          JCDL '02: Proceedings of the 2nd ACM/IEEE-CS joint conference on Digital libraries
          July 2002
          448 pages
          ISBN:1581135130
          DOI:10.1145/544220

          Copyright © 2002 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 14 July 2002

          Permissions

          Request permissions about this article.

          Request Permissions

          Check for updates

          Qualifiers

          • Article

          Acceptance Rates

          JCDL '02 Paper Acceptance Rate69of240submissions,29%Overall Acceptance Rate415of1,482submissions,28%

        PDF Format

        View or Download as a PDF file.

        PDF

        eReader

        View online with eReader.

        eReader