DOI: 10.1145/1180639.1180827
Article

Fast tracking of near-duplicate keyframes in broadcast domain with transitivity propagation

Published: 23 October 2006

Abstract

The identification of near-duplicate keyframe (NDK) pairs is useful for a variety of applications, such as news story threading and content-based video search. In this paper, we propose a novel approach for the discovery and tracking of NDK pairs and threads in the broadcast domain. Detecting NDKs in a large data set is challenging because, as the data set grows linearly, the computational cost grows quadratically, and so does the number of false alarms. This paper explores the symmetric and transitive nature of the near-duplicate relation for the effective detection and fast tracking of NDK pairs, based on the matching of local keypoints in frames. In the detection phase, we propose a robust measure, namely pattern entropy (PE), to measure the coherency of symmetric keypoint matching across the space of two keyframes. This measure is shown to be effective in discovering the NDK identity of a frame. In the tracking phase, NDK pairs and threads are rapidly propagated and linked by transitivity, without the need for detection. This step yields a significant boost in speed. We evaluate the proposed approach on a month of 2004 broadcast videos. The experimental results indicate that our approach outperforms other techniques in recall and precision by a large margin. In addition, by considering transitivity and the underlying distribution of NDK pairs over time, a speedup of 3 to 5 times is achieved while keeping performance close to the optimum obtained by exhaustive evaluation.
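The tracking idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes a union-find structure to hold threads, and the names `NDKThreads`, `track_pairs`, and `is_ndk` are hypothetical. The `is_ndk` callback stands in for whatever pairwise detector is used (in the paper, keypoint matching scored by pattern entropy). The sketch only skips pairs already implied by transitivity; it does not model the temporal distribution of NDK pairs that the abstract also mentions.

```python
from itertools import combinations


class NDKThreads:
    """Union-find over keyframe ids; each disjoint set is one NDK thread."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        # Register unseen keyframes as their own singleton thread.
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            # Path halving keeps later lookups short.
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def link(self, a, b):
        # Merge the threads containing a and b.
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra

    def same_thread(self, a, b):
        return self.find(a) == self.find(b)


def track_pairs(keyframes, is_ndk):
    """Link keyframes into threads, calling the detector only when needed.

    is_ndk(a, b) is a hypothetical, assumed-symmetric pairwise detector.
    Returns the thread structure and the number of detector calls made.
    """
    threads = NDKThreads()
    detector_calls = 0
    for a, b in combinations(keyframes, 2):
        if threads.same_thread(a, b):
            # Already implied by transitivity: a~c and c~b give a~b.
            continue
        detector_calls += 1
        if is_ndk(a, b):
            threads.link(a, b)
    return threads, detector_calls
```

For example, calling `track_pairs(["a1", "a2", "a3", "b1", "b2"], lambda a, b: a[0] == b[0])` with this toy detector never evaluates the pair `("a2", "a3")`, because both keyframes were already linked to `"a1"`. Note that a false positive from the detector would also be propagated, which is one reason a robust pairwise measure matters before transitivity is applied.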


Published In

MM '06: Proceedings of the 14th ACM international conference on Multimedia
October 2006
1072 pages
ISBN:1595934472
DOI:10.1145/1180639
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. keyframe tracking
  2. keypoint matching
  3. near-duplicate detection
  4. pattern entropy
  5. transitivity propagation

Qualifiers

  • Article

Conference

MM06: The 14th ACM International Conference on Multimedia 2006
October 23 - 27, 2006
Santa Barbara, CA, USA

Acceptance Rates

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

Article Metrics

  • Downloads (Last 12 months): 5
  • Downloads (Last 6 weeks): 0
Reflects downloads up to 10 Feb 2025

Cited By

  • (2023) Cross-media web video event mining based on multiple semantic-paths embedding. Neural Computing and Applications 36:2 (667-683). DOI: 10.1007/s00521-023-09050-6. Online publication date: 18-Oct-2023
  • (2020) A Novel Collaborative Optimization Framework for Web Video Event Mining Based on the Combination of Inaccurate Visual Similarity Detection Information and Sparse Textual Information. IEEE Access 8 (10516-10527). DOI: 10.1109/ACCESS.2020.2964714. Online publication date: 2020
  • (2020) Advance on large scale near-duplicate video retrieval. Frontiers of Computer Science 14:5. DOI: 10.1007/s11704-019-8229-7. Online publication date: 3-Jan-2020
  • (2019) Video Skimming. ACM Computing Surveys 52:5 (1-38). DOI: 10.1145/3347712. Online publication date: 13-Sep-2019
  • (2019) Integrating Image and Textual Information in Human–Robot Interactions for Children With Autism Spectrum Disorder. IEEE Transactions on Multimedia 21:3 (746-759). DOI: 10.1109/TMM.2018.2865828. Online publication date: Mar-2019
  • (2019) A hamming distance and fuzzy logic-based algorithm for P2P content distribution in enterprise networks. Peer-to-Peer Networking and Applications 12:5 (1323-1335). DOI: 10.1007/s12083-018-0711-8. Online publication date: 22-Feb-2019
  • (2017) Stochastic Multiview Hashing for Large-Scale Near-Duplicate Video Retrieval. IEEE Transactions on Multimedia 19:1 (1-14). DOI: 10.1109/TMM.2016.2610324. Online publication date: 1-Jan-2017
  • (2017) A fast near-duplicate keyframe detection method based on local features. 2017 IEEE 17th International Conference on Communication Technology (ICCT) (1544-1547). DOI: 10.1109/ICCT.2017.8359890. Online publication date: Oct-2017
  • (2017) Using correspondence analysis to select training set for multi-modal information data. Cluster Computing 21:1 (893-905). DOI: 10.1007/s10586-017-0945-x. Online publication date: 1-Jun-2017
  • (2016) Effective Multimodality Fusion Framework for Cross-Media Topic Detection. IEEE Transactions on Circuits and Systems for Video Technology 26:3 (556-569). DOI: 10.1109/TCSVT.2014.2347551. Online publication date: 1-Mar-2016
