Towards musical query-by-semantic-description using the CAL500 data set
Source
Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
Amsterdam, The Netherlands
SESSION: Music retrieval
Pages: 439-446
Year of Publication: 2007
ISBN: 978-1-59593-597-7
Authors
Douglas Turnbull  University of California, San Diego, La Jolla, CA
Luke Barrington  University of California, San Diego, La Jolla, CA
David Torres  University of California, San Diego, La Jolla, CA
Gert Lanckriet  University of California, San Diego, La Jolla, CA
Sponsors
ACM: Association for Computing Machinery
SIGIR: ACM Special Interest Group on Information Retrieval
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1277741.1277817

ABSTRACT

Query-by-semantic-description (QBSD) is a natural paradigm for retrieving content from large databases of music. A major impediment to the development of good QBSD systems for music information retrieval has been the lack of a cleanly labeled, publicly available, heterogeneous data set of songs and associated annotations. We have collected the Computer Audition Lab 500-song (CAL500) data set by having humans listen to and annotate songs using a survey designed to capture 'semantic associations' between music and words. We adapt the supervised multi-class labeling (SML) model, which has shown good performance on the task of image retrieval, and use the CAL500 data to learn a model for music retrieval. The model parameters are estimated using the weighted mixture hierarchies expectation-maximization algorithm, which has been specifically designed to handle real-valued semantic associations between words and songs, rather than binary class labels. The output of the SML model, a vector of class-conditional probabilities, can be interpreted as a semantic multinomial distribution over a vocabulary. By also representing a semantic query as a query multinomial distribution, we can quickly rank-order the songs in a database based on the Kullback-Leibler divergence between the query multinomial and each song's semantic multinomial. Qualitative and quantitative results demonstrate that our SML model can both annotate a novel song with meaningful words and retrieve relevant songs given a multi-word, text-based query.
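The retrieval step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several details are assumptions made only for illustration: the query multinomial places uniform mass on the query words plus a small smoothing constant `eps`, the divergence is taken in the direction KL(query || song), and the vocabulary, song names, and probabilities are invented. Every song multinomial is assumed to be strictly positive.

```python
import math

def query_multinomial(query_words, vocabulary, eps=1e-6):
    """Assumed construction: equal mass on query words, eps on all others, normalized."""
    if not any(w in query_words for w in vocabulary):
        raise ValueError("query shares no words with the vocabulary")
    raw = [1.0 if w in query_words else eps for w in vocabulary]
    total = sum(raw)
    return [x / total for x in raw]

def kl_divergence(q, p):
    """KL(q || p) = sum_i q_i * log(q_i / p_i); terms with q_i == 0 contribute 0."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)

def rank_songs(query_words, vocabulary, song_multinomials):
    """Return song names ordered from smallest to largest divergence from the query."""
    q = query_multinomial(set(query_words), vocabulary)
    scored = [(kl_divergence(q, p), name) for name, p in song_multinomials.items()]
    return [name for _, name in sorted(scored)]

# Hypothetical four-word vocabulary and semantic multinomials (each sums to 1).
vocabulary = ["jazz", "piano", "aggressive", "mellow"]
songs = {
    "song_a": [0.40, 0.40, 0.05, 0.15],
    "song_b": [0.05, 0.10, 0.70, 0.15],
    "song_c": [0.20, 0.15, 0.10, 0.55],
}

ranking = rank_songs(["jazz", "piano"], vocabulary, songs)
print(ranking)
```

In this toy example the song whose multinomial puts the most mass on "jazz" and "piano" has the smallest divergence from the query and is ranked first.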

