K-BOX: a query-by-singing based music retrieval system

Authors:
Dacheng Tao, Chinese University of Hong Kong, Shatin, Hong Kong
Hao Liu, Chinese University of Hong Kong, Shatin, Hong Kong
Xiaoou Tang, Chinese University of Hong Kong, Shatin, Hong Kong

MULTIMEDIA '04: Proceedings of the 12th annual ACM international conference on Multimedia, October 2004, Pages 464–467
https://doi.org/10.1145/1027527.1027639

Published: 10 October 2004

ABSTRACT

In this paper, we present an efficient query-by-singing based music retrieval system. We first combine multiple Support Vector Machines by classifier committee learning to segment the sentences of a song automatically. Many new methods for manipulating the Mel-Frequency Cepstral Coefficient (MFCC) matrix are studied and compared for optimal feature selection. Experiments show that, of the 13 coefficients, the 3rd is the most relevant to music comparison, and that the proposed simplified MFCC feature achieves a reasonable trade-off between accuracy and efficiency. To improve system efficiency, we reorganize the database with a new two-stage clustering scheme in both time space and feature space. We combine the K-means algorithm with a dynamic time warping similarity measurement for feature space clustering. We also propose a new method for model selection in the K-means algorithm. Experiments show that the proposed approach increases accuracy by more than 30 percent while making the average query more than 16 times faster.
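
The similarity measurement named in the abstract, dynamic time warping (DTW), can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the local cost or any warping constraints, so the sketch assumes an absolute-difference cost, no warping window, and one-dimensional inputs (for example, the trajectory of a single MFCC coefficient over time). The function names `dtw_distance` and `nearest_reference` are hypothetical.

```python
from math import inf


def dtw_distance(query, reference):
    """Textbook dynamic time warping cost between two 1-D sequences.

    Assumptions (not taken from the paper): absolute difference as the
    local cost and no warping-window constraint.
    """
    n, m = len(query), len(reference)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")
    # cost[i][j] holds the minimal accumulated cost of aligning
    # query[:i] with reference[:j]; row and column 0 are sentinels.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(query[i - 1] - reference[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # query frame repeats a reference frame
                cost[i][j - 1],      # reference frame repeats a query frame
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]


def nearest_reference(query, references):
    """Return the index of the reference with the smallest DTW cost."""
    return min(
        range(len(references)),
        key=lambda k: dtw_distance(query, references[k]),
    )
```

Because DTW tolerates local tempo differences, a sung query that holds one note longer than the stored sentence can still align at zero cost, which a frame-by-frame Euclidean comparison would penalize. In a clustered database, a function like `nearest_reference` would plausibly be applied first to cluster representatives and then to the members of the chosen cluster, although the abstract does not spell out that search procedure.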

Index Terms

1. Hardware
  1. Communication hardware, interfaces and storage
    1. Signal processing systems
  2. Robustness
    1. Hardware reliability
      1. Signal integrity and noise analysis


Published in
MULTIMEDIA '04: Proceedings of the 12th annual ACM international conference on Multimedia
October 2004
1028 pages
ISBN: 1581138938
DOI: 10.1145/1027527
General Chairs: Henning Schulzrinne (Columbia University), Nevenka Dimitrova (Philips Research)
Program Chairs: Angela Sasse (UCL), Sue Moon (KAIST), Rainer Lienhart (U Augsburg)
Copyright © 2004 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
MFCC
music clustering
music retrieval
music segmentation
Qualifiers
- Article
Conference

Acceptance Rates
Overall Acceptance Rate: 995 of 4,171 submissions, 24%