ABSTRACT
We focus on the problem of large-scale near duplicate image retrieval. Recent studies have shown that local image features, often referred to as key points, are effective for near duplicate image retrieval. The most popular approach for key point based image matching is the clustering-based bag-of-words model. It maps each key point to a visual word in a code-book that is constructed by a clustering algorithm, and represents each image by a histogram of visual words. Despite its success, there are two main shortcomings of the clustering-based bag-of-words model: (i) it is computationally expensive to cluster millions of key points into thousands of visual words; (ii) there is no theoretical analysis on the performance of the bag-of-words model. We propose a new scheme for key point quantization that addresses these shortcomings. Instead of clustering, the proposed scheme quantizes each key point into a binary vector using a collection of randomly generated hyper-spheres, and a bag-of-words model is constructed based on such randomized quantization.
Our theoretical analysis shows that the resulting image similarity provides an upper bound for the similarity based on the optimal partial matching between two sets of key points. Empirical study on a database of 100,000 images shows that the proposed scheme is not only more efficient but also more effective than the clustering-based approach for near duplicate image retrieval.
- ANSI/NIST-ITL 1-2007 standard: Data Format for the Interchange of Fingerprint, Facial,&Other Biometric Information. 2007.Google Scholar
- S. Boughhorbel, J.-P. Tarel, and F. Fleuret. Non-mercer kernels for svm object recognition. In British Machine Vision Conference, 2004.Google ScholarCross Ref
- O. Chum, J. Philbin, and A. Zisserman. Near duplicate image detection: min-hash and tf-idf weighting. In Proceedings of the British Machine Vision Conference, 2008.Google ScholarCross Ref
- G. Csurka, C. R. Dance, L. Fan, J. Willamowski, and C. Bray. Visual categorization with bags of keypoints. In Workshop on Statistical Learning in Computer Vision, ECCV, pages 1--22, 2004.Google Scholar
- M. Datar, N. Immorlica, P. Indyk, and V. S. Mirrokni. Locality-sensitive hashing scheme based on p-stable distributions. In Proceedings of the twentieth annual symposium on Computational geometry, 2004. Google ScholarDigital Library
- W. Dong, Z. Wang, M. Charikar, and K. Li. Efficiently matching sets of features with random histograms. In Proceeding of ACM Multimedia, pages 179--188, 2008. Google ScholarDigital Library
- R. O. Duda, P. E. Hart, and D. G. Stork. Pattern Classification. Wiley-Interscience Publication, 2000. Google ScholarDigital Library
- K. Grauman and T. Darrell. The pyramid match kernel: Efficient learning with sets of features. J. Mach. Learn. Res., 8:725--760, 2007. Google ScholarDigital Library
- W. Hoeffding. Probability inequalities for sums of bounded random variables. Journal of the American Statistical Association, 58(301):13--30, 1963.Google ScholarCross Ref
- Y. Ke, R. Sukthankar, and L. Huston. Efficient near-duplicate detection and sub-image retrieval. In Proceeding of ACM Multimedia, pages 869--876, 2004. Google ScholarDigital Library
- R. Kondor and T. Jebara. A kernel between sets of vectors. In Proceedings of the International Conference on Machine Learning, pages 361--368, 2003.Google Scholar
- J.-E. Lee, A. K. Jain, and R. Jin. Scars, marks and tattoos (SMT): Soft biometric for suspect and victim identification. In Biometric Consortium Conference and Technology Expo, 2008.Google ScholarCross Ref
- V. Lepetit, P. Lagger, and P. Fua. Randomized trees for real-time keypoint recognition. In CVPR, pages 775--781, 2005. Google ScholarDigital Library
- E. Lipton and J. Glanz. Limits of DNA research pushed to identify the dead of sept. 11. New York Times, April 2002.Google Scholar
- T. Liu, A. Moore, A. Gray, and K. Yang. An investigation of practical approximate nearest neighbor algorithms. In Neural Information Processing Systems, pages 825--832, 2004.Google Scholar
- T. Liu, C. Rosenberg, and H. Rowley. Clustering billions of images with large scale nearest neighbor search. In IEEE Workshop on Applications of Computer Vision (WACV), page 28, 2007. Google ScholarDigital Library
- D. Lowe. Distinctive image features form scale-invariant keypoints. In International Journal of Computer Vision, volume 60, pages 91--110, 2004. Google ScholarDigital Library
- S. Lyu. Mercer kernels for object recognition with local features. In CVPR, pages 223--229, 2005. Google ScholarDigital Library
- H. Moon and P. J. Phillips. Computational and performance aspects of pca--based face recognition algorithms. Perception, 30:303--321, 2001.Google ScholarCross Ref
- P. Moreno, P. Ho, and N. Vasconcelos. A kullback-leibler divergence based kernel for svm classification in multimedia applications. In Neural Information Processing Systems, 2003.Google Scholar
- M. Muja and D. G. Lowe. Fast approximate nearest neighbors with automatic algorithm configuration. In VISSAPP (1), pages 331--340, 2009.Google Scholar
- D. Nister and H. Stewenius. Scalable recognition with a vocabulary tree. In CVPR, pages 2161--2168, 2006. Google ScholarDigital Library
- T. Pavlidis. Limitations of cbir. In ICPR, 2008.Google Scholar
- J. Philbin, O. Chum, M. Isard, J. Sivic, and A. Zisserman. Object retrieval with large vocabularies and fast spatial matching. In CVPR, pages 1--8, 2007.Google ScholarCross Ref
- S. E. Robertson, S. Walker, and M. Hancock-Beaulieu. Okapi at trec-7. In Proceedings of the Seventh Text REtrieval Conference, 1998.Google Scholar
- G. Shakhnarovich, T. Darrell, and P. Indyk. Nearest-Neighbor Methods in Learning and Vision: Theory and Practice. MIT Press, 2006. Google ScholarDigital Library
- C. Silpa-Anan and R. Hartley. Localization using an imagemap. In Proceedings of the 2004 Australasian Conference on Robotics&Automation, 2004.Google Scholar
- C. Silpa-Anan and R. Hartley. Optimised kd-trees for fast image descriptor matching. In CVPR, pages 1--8, 2008.Google ScholarCross Ref
- J. Sivic and A. Zisserman. Video Google: A text retrieval approach to object matching in videos. In ICCV, pages 1470--1477, 2003. Google ScholarDigital Library
- C. Wallraven, B. Caputo, and A. Graf. Recognition with local features: the kernel recipe. In CVPR, pages 257--264, 2003. Google ScholarDigital Library
- B. Wang, Z. Li, M. Li, and W.-Y. Ma. Large-scale duplicate detection for web image search. In ICME, pages 353--356, 2006.Google ScholarCross Ref
- L. Wolf and A. Shashua. Learning over sets using kernel principal angles. Journal of Machine Learning Research, 4:913--931, 2003. Google ScholarDigital Library
Index Terms
- An efficient key point quantization algorithm for large scale image retrieval
Recommendations
Re-ranking algorithm using post-retrieval clustering for content-based image retrieval
In this paper, we propose a re-ranking algorithm using post-retrieval clustering for content-based image retrieval (CBIR). In conventional CBIR systems, it is often observed that images visually dissimilar to a query image are ranked high in retrieval ...
Clustering for photo retrieval at Image CLEF 2008
CLEF'08: Proceedings of the 9th Cross-language evaluation forum conference on Evaluating systems for multilingual and multimodal information accessThis paper presents the first participation of the University of Ottawa group in the Photo Retrieval task at Image CLEF 2008. Our system uses Lucene for text indexing and LIRE for image indexing. We experiment with several clustering methods in order to ...
Large scale partially duplicated web image retrieval
MM '10: Proceedings of the 18th ACM international conference on MultimediaThe state-of-the-art image retrieval approaches represent images with a high dimensional vector of visual words by quantizing local features, such as SIFT, in the descriptor space. The geometric clues among visual words in an image is usually ignored or ...
Comments