DOI: 10.1145/357744.357880

Spoken content metadata and MPEG-7

Published: 04 November 2000

ABSTRACT

The words spoken in an audio stream form an obvious descriptor, essential to most audio-visual metadata standards. When derived by automatic speech recognition (ASR) systems, however, the spoken content fits into neither the low-level (representative) nor the high-level (semantic) metadata category. This makes it difficult to create a representation that supports interoperability between different extraction and application utilities while remaining robust to the limitations of the extraction process. In this paper, we discuss the issues encountered in the design of the MPEG-7 spoken content descriptor and their applicability to other metadata standards.

Published in

MULTIMEDIA '00: Proceedings of the 2000 ACM workshops on Multimedia
November 2000, 248 pages
ISBN: 1581133111
DOI: 10.1145/357744

Copyright © 2000 ACM

Publisher: Association for Computing Machinery, New York, NY, United States
