DOI: 10.1145/1459359.1459366
research-article

Near-duplicate keyframe retrieval by nonrigid image matching

Published: 26 October 2008

Abstract

Near-duplicate image retrieval plays an important role in many real-world multimedia applications, but most previous approaches have limitations. Conventional appearance-based methods may suffer from illumination variations and occlusion, and methods based on local feature correspondences often do not consider local deformations or the spatial coherence between two point sets. In this paper, we propose a novel and effective Nonrigid Image Matching (NIM) approach to near-duplicate keyframe retrieval from real-world video corpora. In contrast to previous approaches, NIM recovers an explicit mapping between two near-duplicate images with a few deformation parameters and effectively finds the correct correspondences in noisy data. To make the technique applicable to large-scale applications, we suggest a multi-level ranking scheme that filters out irrelevant results in a coarse-to-fine manner. Within this ranking scheme, to overcome the challenge of an extremely small training set, we employ a semi-supervised learning method that uses unlabeled data to improve performance. To evaluate our solution, we have conducted extensive experiments on two benchmark testbeds extracted from the TRECVID2003 and TRECVID2004 corpora. The promising results show that the proposed method is more effective than other state-of-the-art approaches for near-duplicate keyframe retrieval.
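The coarse-to-fine ranking idea mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function `multi_level_rank`, the toy data, and both scorers are hypothetical stand-ins. The cheap stage uses a global histogram intersection, and the expensive stage counts shared keypoint ids as a placeholder for a real matcher such as NIM. The point is only the control flow: each stage re-ranks the survivors of the previous one and keeps fewer of them.

```python
from typing import Callable, Dict, List, Set, Tuple

# A scorer takes (query_id, candidate_id) and returns a similarity (higher is better).
Scorer = Callable[[str, str], float]

# Hypothetical toy data: integer colour-histogram bins and keypoint ids per keyframe.
HIST: Dict[str, List[int]] = {
    "q": [5, 3, 2],
    "a": [5, 3, 2],
    "b": [4, 4, 2],
    "c": [0, 1, 9],
    "d": [5, 2, 3],
}
KEYPOINTS: Dict[str, Set[int]] = {
    "q": {1, 2, 3, 4},
    "a": {1, 2, 9},
    "b": {1, 2, 3, 4},
    "c": {1, 2, 3, 4},
    "d": {7, 8},
}


def coarse_score(query: str, cand: str) -> float:
    """Cheap global similarity: histogram intersection."""
    return float(sum(min(x, y) for x, y in zip(HIST[query], HIST[cand])))


def fine_score(query: str, cand: str) -> float:
    """Expensive local similarity (placeholder): number of shared keypoint ids."""
    return float(len(KEYPOINTS[query] & KEYPOINTS[cand]))


def multi_level_rank(
    query: str, candidates: List[str], stages: List[Tuple[Scorer, int]]
) -> List[str]:
    """Apply stages from cheapest to most expensive, keeping the top `keep` each time."""
    survivors = list(candidates)
    for scorer, keep in stages:
        ranked = sorted(survivors, key=lambda c: scorer(query, c), reverse=True)
        survivors = ranked[:keep]
    return survivors


# Stage 1 keeps 3 of 4 candidates cheaply; stage 2 re-ranks them and keeps 2.
stages = [(coarse_score, 3), (fine_score, 2)]
result = multi_level_rank("q", ["a", "b", "c", "d"], stages)
print(result)  # ['b', 'a']
```

In this toy run, candidate `c` never reaches the expensive scorer because the coarse stage discards it. That saves computation, but it also shows the trade-off of any cascade: the final ranking can be no better than the recall of the earlier filters.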


    Published In

    MM '08: Proceedings of the 16th ACM international conference on Multimedia
    October 2008
    1206 pages
    ISBN:9781605583037
    DOI:10.1145/1459359

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. image copy detection
    2. near-duplicate keyframe
    3. nonrigid image matching
    4. semi-supervised learning

    Qualifiers

    • Research-article

    Conference

    MM08: ACM Multimedia Conference 2008
    October 26 - 31, 2008
    Vancouver, British Columbia, Canada

    Acceptance Rates

    Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
