DOI: 10.1145/2390214.2390220
research-article

Short user-generated videos classification using accompanied audio categories

Published: 02 November 2012

ABSTRACT

This paper investigates the classification of short user-generated videos (UGVs) using their accompanying audio. Short UGVs account for a large proportion of the UGVs on the Internet, and many of them are accompanied by single-category soundtracks. We define seven types of UGVs, each corresponding to one of seven audio categories. We investigate three modeling approaches for audio feature representation: single Gaussian (1G), Gaussian mixture model (GMM), and Bag-of-Audio-Word (BoAW). Classifiers are then trained with a Support Vector Machine (SVM), using a different distance measure for each of the three feature representations, to categorize the UGVs. The evaluation shows that these approaches are effective for categorizing short UGVs based on their audio tracks. The GMM representation with the approximated Bhattacharyya distance (ABD) performs best, and the BoAW representation with a chi-square kernel gives comparable results.
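The two distance measures named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the diagonal-covariance simplification, the function names, the `gamma` parameter, and the exponential mapping from distance to kernel value are all assumptions made for illustration. The closed-form Bhattacharyya distance shown here applies to a pair of single Gaussians (the 1G case); the approximation the authors use for GMMs is not specified on this page and is not reproduced.

```python
import math


def bhattacharyya_diag(mu1, var1, mu2, var2):
    """Bhattacharyya distance between two Gaussians with diagonal covariance.

    Assumption: each audio clip is summarized by a per-dimension mean and
    variance of its frame-level features (hypothetical setup).
    """
    dist = 0.0
    for m1, v1, m2, v2 in zip(mu1, var1, mu2, var2):
        v = 0.5 * (v1 + v2)  # averaged variance for this dimension
        dist += 0.125 * (m1 - m2) ** 2 / v  # mean-separation term
        dist += 0.5 * math.log(v / math.sqrt(v1 * v2))  # variance-mismatch term
    return dist


def chi_square_distance(h1, h2):
    """Chi-square distance between two histograms (e.g. audio-word counts)."""
    total = 0.0
    for a, b in zip(h1, h2):
        if a + b > 0:  # skip bins that are empty in both histograms
            total += (a - b) ** 2 / (a + b)
    return 0.5 * total


def distance_kernel(d, gamma=1.0):
    """Map a distance to a similarity via exp(-gamma * d).

    Assumption: gamma is a hypothetical scaling parameter that would be
    tuned on held-out data.
    """
    return math.exp(-gamma * d)


# Usage: similarity between two clips under each representation.
k_gauss = distance_kernel(bhattacharyya_diag([0.0, 1.0], [1.0, 2.0],
                                             [0.5, 1.5], [1.5, 1.0]))
k_boaw = distance_kernel(chi_square_distance([0.2, 0.5, 0.3],
                                             [0.1, 0.6, 0.3]))
```

A matrix of such kernel values between all pairs of training clips could then be passed to an SVM that accepts precomputed kernels.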


    • Published in

      AMVA '12: Proceedings of the 2012 ACM international workshop on Audio and multimedia methods for large-scale video analysis
      November 2012
      42 pages
      ISBN:9781450315852
      DOI:10.1145/2390214

      Copyright © 2012 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

      Publisher

      Association for Computing Machinery

      New York, NY, United States
