A statistical approach to retrieval under user-dependent uncertainty in query-by-humming systems
Source: International Multimedia Conference
Proceedings of the 6th ACM SIGMM International Workshop on Multimedia Information Retrieval
New York, NY, USA
Session: Applications I
Pages: 113-118
Year of Publication: 2004
ISBN: 1-58113-940-3
Authors
Erdem Unal  University of Southern California, CA
Shrikanth S. Narayanan  University of Southern California, CA
Elaine Chew  University of Southern California, CA
Sponsors
SIGMULTIMEDIA: ACM Special Interest Group on Multimedia
ACM: Association for Computing Machinery
SIGGRAPH: ACM Special Interest Group on Computer Graphics and Interactive Techniques
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1026711.1026731

ABSTRACT

Robustly handling uncertainty in query formulation and search is one of the most challenging problems in multimedia information retrieval (MIR) systems. This paper presents a statistical approach to retrieval under uncertainty in Query by Humming (QBH) systems. The audio is transcribed directly into pitch and duration symbols. From the transcribed data vector, fingerprints are extracted, each carrying a fixed length of information from a characteristic local point of the hummed melody. Rather than matching the humming input as a whole, the search through the database uses these extracted fingerprints. The distance from each fingerprint to the original melodies in the database is calculated and converted to a probabilistic similarity measure. The melodies with the highest similarity measures are returned to the user as the most likely query results. The algorithm is tested with manually annotated data comprising 250 humming samples, in conjunction with a database of 200 pre-processed MIDI files. Retrieval accuracy is 94 percent for samples from subjects with some musical training or background, compared to 72 percent for samples from non-trained subjects. The results also show that extracting fingerprints at characteristic local points of the hummed tune is an effective and robust approach to search and retrieval under uncertainty.
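
The pipeline described in the abstract (extract fixed-length fingerprints at characteristic points, measure their distance to each database melody, convert distances to probabilistic similarities, rank) can be illustrated with a minimal Python sketch. The abstract does not specify these details, so everything concrete here is an assumption made for illustration: pitch-contour turning points stand in for "characteristic local points", fingerprints are windows of pitch intervals, the distance is Euclidean, and similarity is a normalized exp(-distance). Duration symbols are ignored, and all function names and parameters are hypothetical.

```python
import math


def local_extrema(pitches):
    """Indices where the pitch contour changes direction.

    Assumption: turning points stand in for 'characteristic local points'.
    """
    idx = []
    for i in range(1, len(pitches) - 1):
        left = pitches[i] - pitches[i - 1]
        right = pitches[i + 1] - pitches[i]
        if left * right < 0:
            idx.append(i)
    return idx


def fingerprints(pitches, half_width=2):
    """Fixed-length windows of pitch intervals centred on each turning point."""
    prints = []
    for i in local_extrema(pitches):
        lo, hi = i - half_width, i + half_width
        if lo < 0 or hi >= len(pitches):
            continue  # skip windows that would run off either end
        window = pitches[lo:hi + 1]
        # Intervals (differences) make the fingerprint transposition-invariant.
        prints.append([b - a for a, b in zip(window, window[1:])])
    return prints


def min_distance(fp, melody):
    """Smallest Euclidean distance from fp to any same-length interval run."""
    intervals = [b - a for a, b in zip(melody, melody[1:])]
    n = len(fp)
    best = math.inf
    for start in range(len(intervals) - n + 1):
        run = intervals[start:start + n]
        dist = math.sqrt(sum((x - y) ** 2 for x, y in zip(fp, run)))
        best = min(best, dist)
    return best


def rank(query, database, scale=1.0):
    """Return (probability, name) pairs, most likely melody first.

    Assumption: similarity is exp(-total distance / scale), normalized
    over the database so the scores sum to 1.
    """
    fps = fingerprints(query)
    scores = {}
    for name, melody in database.items():
        total = sum(min_distance(fp, melody) for fp in fps)
        scores[name] = math.exp(-total / scale)
    z = sum(scores.values())
    if z == 0:
        # No melody could be matched at all: fall back to a uniform ranking.
        return sorted((1.0 / len(scores), name) for name in scores)
    return sorted(((s / z, name) for name, s in scores.items()), reverse=True)
```

Given a database of MIDI note-number sequences such as `{"A": [60, 62, 64, 62, 60, 59, 60, 64, 67, 64], "B": [60, 60, 67, 67, 69, 69, 67, 65, 65, 64]}`, a query that hums melody A two semitones higher still ranks A first, because the interval-based fingerprints do not depend on the absolute key. A real system would also need to tolerate tempo variation and transcription errors, which this sketch does not attempt.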



