ABSTRACT
A Query by Humming system allows the user to find a song by humming part of the tune. No musical training is needed. Previous query by humming systems have not provided satisfactory results for various reasons. Some systems have low retrieval precision because they rely on melodic contour information from the hummed tune, which in turn relies on an error-prone note segmentation process. Some systems yield better precision by matching the melody directly from audio, but they are slow because of their extensive use of Dynamic Time Warping (DTW). Our approach improves both retrieval precision and speed compared to previous approaches. We treat music as a time series and exploit and improve well-developed techniques from time series databases to index the music for fast similarity queries. We improve on existing DTW indexing techniques by introducing the concept of envelope transforms, which gives a general guideline for extending existing dimensionality reduction methods to DTW indexes. The net result is high scalability. We confirm our claims through extensive experiments.
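To make the cost argument concrete, the following is a minimal sketch of the two standard building blocks the abstract refers to: DTW restricted to a warping band, and a cheap envelope-based lower bound of the kind commonly attributed to Keogh. It is not the paper's envelope transform or its index structure. The function names, the squared-difference local cost, the `band` parameter, and the sample pitch sequences are all assumptions chosen for illustration.

```python
import math


def dtw_distance(q, c, band):
    """Banded DTW between sequences q and c (illustrative sketch).

    Uses a squared-difference local cost and returns the square root of
    the accumulated cost. Cells with |i - j| > band are never visited.
    """
    n, m = len(q), len(c)
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0  # empty prefix aligned with empty prefix
    for i in range(1, n + 1):
        cur = [math.inf] * (m + 1)
        for j in range(max(1, i - band), min(m, i + band) + 1):
            cost = (q[i - 1] - c[j - 1]) ** 2
            cur[j] = cost + min(prev[j], prev[j - 1], cur[j - 1])
        prev = cur
    return math.sqrt(prev[m])


def envelope(q, band):
    """Per-position min and max of q over a window of +/- band samples."""
    n = len(q)
    lower = [min(q[max(0, i - band): i + band + 1]) for i in range(n)]
    upper = [max(q[max(0, i - band): i + band + 1]) for i in range(n)]
    return lower, upper


def envelope_lower_bound(q, c, band):
    """Lower bound on banded DTW for equal-length q and c.

    Only the parts of c that fall outside the envelope of q contribute,
    so this costs O(n) once the envelope is known, versus O(n * band)
    for the full DTW computation.
    """
    lower, upper = envelope(q, band)
    total = 0.0
    for x, lo, hi in zip(c, lower, upper):
        if x > hi:
            total += (x - hi) ** 2
        elif x < lo:
            total += (lo - x) ** 2
    return math.sqrt(total)


# Hypothetical pitch contours: the candidate is the query delayed by one step.
query = [0, 1, 2, 3, 2, 1, 0, 0]
candidate = [0, 0, 1, 2, 3, 2, 1, 0]
print(dtw_distance(query, candidate, band=1))
print(envelope_lower_bound(query, candidate, band=1))
```

In a filter-and-refine search, a candidate is passed to the full DTW computation only if its lower bound does not already exceed the best distance found so far. With `band=0` both functions reduce to the ordinary Euclidean distance.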