DOI: 10.1145/1459359.1459382
research-article

Combination of audio and lyrics features for genre classification in digital audio collections

Published: 26 October 2008

ABSTRACT

In many areas, multimedia technology has made its way into the mainstream. In the case of digital audio, this is manifested in the numerous online music stores that have turned into profitable businesses. The widespread user adoption of digital audio, on both home computers and mobile players, shows the size of this market. Thus, ways to automatically process and handle the growing private and commercial collections become increasingly important; with this comes a need to make music interpretable by computers. The most obvious representation of an audio file is its sound; there are, however, other ways of describing a song, for instance its lyrics, which describe songs in terms of content words. The lyrics of a piece of music may be orthogonal to its sound, and they differ greatly from other texts in their (rhyme) structure. Consequently, the exploitation of these properties has potential for typical music information retrieval tasks such as musical genre classification; so far, however, there is a lack of means to combine these modalities efficiently. In this paper, we present findings from investigating advanced lyrics features such as the frequency of certain rhyme patterns, several part-of-speech features, and statistical features such as words per minute (WPM). We further analyse to what extent a combination of these features with existing acoustic feature sets can be exploited for genre classification, and we provide experiments on two test collections.
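The abstract does not say how WPM is computed or how the lyrics and acoustic feature sets are combined. As a rough illustration only, the sketch below assumes that WPM is the number of whitespace-separated tokens divided by the track length in minutes, and that combination means simple concatenation of the two feature vectors. The function names and the sample values are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def words_per_minute(lyrics: str, duration_seconds: float) -> float:
    """Assumed WPM definition: whitespace-separated tokens per minute of audio."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    word_count = len(lyrics.split())
    return word_count / (duration_seconds / 60.0)


def combine_features(audio_features: Sequence[float],
                     lyrics_features: Sequence[float]) -> List[float]:
    """Assumed combination strategy: concatenate both vectors into one."""
    return list(audio_features) + list(lyrics_features)


# Hypothetical example: six words sung over a two-minute track.
lyrics = "la la la la la la"
wpm = words_per_minute(lyrics, duration_seconds=120.0)

# Hypothetical three-dimensional acoustic feature vector for the same track.
audio_vector = [0.12, 0.55, 0.31]
combined = combine_features(audio_vector, [wpm])
```

The combined vector could then be passed to any standard classifier. Concatenation is only one possible way to merge the two modalities and may not be the one evaluated in the paper.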


          Published in

          MM '08: Proceedings of the 16th ACM international conference on Multimedia
          October 2008
          1206 pages
          ISBN:9781605583037
          DOI:10.1145/1459359

          Copyright © 2008 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

          Publisher

          Association for Computing Machinery

          New York, NY, United States


          Acceptance Rates

          Overall Acceptance Rate: 995 of 4,171 submissions, 24%
