research-article

Near-duplicate keyframe retrieval by nonrigid image matching

Authors:

Steven C.H. Hoi,

Michael R. Lyu,

Shuicheng Yan

MM '08: Proceedings of the 16th ACM international conference on Multimedia

Pages 41 - 50

https://doi.org/10.1145/1459359.1459366

Published: 26 October 2008

Abstract

Near-duplicate image retrieval plays an important role in many real-world multimedia applications, but most previous approaches have limitations. For example, conventional appearance-based methods may suffer from illumination variations and occlusion, and local feature correspondence-based methods often do not consider local deformations or the spatial coherence between two point sets. In this paper, we propose a novel and effective Nonrigid Image Matching (NIM) approach to tackle the task of near-duplicate keyframe retrieval from real-world video corpora. In contrast to previous approaches, the NIM technique can recover an explicit mapping between two near-duplicate images with a few deformation parameters and effectively find the correct correspondences in noisy data. To make our technique applicable to large-scale applications, we suggest an effective multi-level ranking scheme that filters out irrelevant results in a coarse-to-fine manner. Within this ranking scheme, to overcome the challenge of an extremely small training set, we employ a semi-supervised learning method that uses unlabeled data to improve performance. To evaluate the effectiveness of our solution, we have conducted extensive experiments on two benchmark testbeds extracted from the TRECVID2003 and TRECVID2004 corpora. The promising results show that our proposed method is more effective than other state-of-the-art approaches for near-duplicate keyframe retrieval.
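The coarse-to-fine idea behind the multi-level ranking scheme can be illustrated with a minimal sketch. The abstract does not specify the levels, features, or cut-offs, so everything below is an assumption made for illustration: the keyframe representation (`hist`, `points`), the two scoring functions, and the `keep` sizes are hypothetical, and the set-overlap scorer is only a stand-in for an expensive matcher such as NIM, not the method itself.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical keyframe record: a cheap global feature ("hist") and a
# costlier set of local feature ids ("points"). Not the paper's representation.
Keyframe = Dict[str, object]
Scorer = Callable[[Keyframe, Keyframe], float]


def coarse_score(q: Keyframe, c: Keyframe) -> float:
    """Cheap global similarity: negative L1 distance between histograms."""
    return -sum(abs(a - b) for a, b in zip(q["hist"], c["hist"]))


def fine_score(q: Keyframe, c: Keyframe) -> float:
    """Stand-in for an expensive matcher: number of shared local features."""
    return float(len(set(q["points"]) & set(c["points"])))


def multi_level_rank(
    query: Keyframe,
    candidates: Sequence[Keyframe],
    levels: Sequence[Tuple[Scorer, int]],
) -> List[Keyframe]:
    """Apply scorers from cheapest to costliest, keeping the top `keep` each time."""
    survivors = list(candidates)
    for scorer, keep in levels:
        survivors = sorted(
            survivors, key=lambda c: scorer(query, c), reverse=True
        )[:keep]
    return survivors


query = {"id": "q", "hist": [0.5, 0.5], "points": {1, 2, 3, 4}}
corpus = [
    {"id": "a", "hist": [0.5, 0.5], "points": {1, 2}},
    {"id": "b", "hist": [0.4, 0.6], "points": {1, 2, 3, 4}},
    {"id": "c", "hist": [0.0, 1.0], "points": {1, 2, 3, 4}},
    {"id": "d", "hist": [0.7, 0.3], "points": {9}},
]

# Level 1 keeps the 3 best candidates by the cheap score;
# level 2 re-ranks only those survivors with the costlier score.
ranked = multi_level_rank(query, corpus, [(coarse_score, 3), (fine_score, 2)])
print([k["id"] for k in ranked])  # ['b', 'a']
```

In this toy run, candidate `c` is discarded at the coarse level even though it would score well at the fine level, which shows the trade-off such a scheme makes: the expensive matcher runs on far fewer candidates, at the risk of losing items that the cheap filter misjudges.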

Index Terms

Near-duplicate keyframe retrieval by nonrigid image matching
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision tasks

Published In

MM '08: Proceedings of the 16th ACM international conference on Multimedia

October 2008

1206 pages

ISBN:9781605583037

DOI:10.1145/1459359

General Chairs:
Abdulmotaleb EL Saddik, University of Ottawa
Son Vuong, University of British Columbia

Program Chairs:
Carsten Griwodz, University of Oslo
Alberto Del Bimbo, Università degli Studi di Firenze
K. Selcuk Candan, Arizona State University
Alejandro Jaimes, Telefonica R&D, Madrid, Spain

Copyright © 2008 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

MM08: ACM Multimedia Conference 2008

October 26 - 31, 2008

Vancouver, British Columbia, Canada

Acceptance Rates

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
