DOI: 10.1145/1180639.1180855
Article

Automatic video annotation by semi-supervised learning with kernel density estimation

Published: 23 October 2006

Abstract

Insufficiency of labeled training data is a major obstacle for automatically annotating large-scale video databases with semantic concepts. Existing semi-supervised learning algorithms based on parametric models try to tackle this issue by incorporating the information in a large amount of unlabeled data. However, they rely on a "model assumption" that the assumed generative model is correct, which usually cannot be satisfied in automatic video annotation due to the large variations of video semantic concepts. In this paper, we propose a novel semi-supervised learning algorithm, named Semi-Supervised Learning by Kernel Density Estimation (SSLKDE), which is based on a non-parametric method and therefore avoids the "model assumption". While only labeled data are utilized in the classical Kernel Density Estimation (KDE) approach, in SSLKDE both labeled and unlabeled data are leveraged to estimate class conditional probability densities based on an extended form of KDE. We also investigate the connection between SSLKDE and existing graph-based semi-supervised learning algorithms. Experiments show that SSLKDE significantly outperforms existing supervised methods for video annotation.
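
For orientation, the classical KDE approach that the abstract contrasts with SSLKDE can be sketched as below. This is only an illustrative sketch of a supervised Parzen-window classifier, not the paper's algorithm: the Gaussian kernel, the fixed bandwidth, the empirical class priors, the function names, and the toy data are all assumptions made for this example. The abstract does not give the extended form of KDE that incorporates unlabeled data, so that extension is not attempted here.

```python
import math


def gaussian_kernel(x, xi, bandwidth):
    """Isotropic Gaussian kernel evaluated at x - xi for d-dimensional points."""
    d = len(x)
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, xi))
    norm = (2.0 * math.pi * bandwidth ** 2) ** (d / 2.0)
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2)) / norm


def class_conditional_density(x, samples, bandwidth):
    """Parzen-window estimate of p(x | class) from that class's labeled samples."""
    return sum(gaussian_kernel(x, s, bandwidth) for s in samples) / len(samples)


def kde_posteriors(x, labeled, bandwidth=1.0):
    """Posterior P(class | x) via Bayes' rule with empirical class priors.

    `labeled` maps each class label to a list of feature vectors.
    Only labeled data are used, as in the classical KDE approach.
    """
    total = sum(len(v) for v in labeled.values())
    joint = {
        c: (len(v) / total) * class_conditional_density(x, v, bandwidth)
        for c, v in labeled.items()
    }
    evidence = sum(joint.values())
    if evidence == 0.0:
        # Kernel underflow far from all data: fall back to the class priors.
        return {c: len(v) / total for c, v in labeled.items()}
    return {c: j / evidence for c, j in joint.items()}


# Toy data (hypothetical): two well-separated 2-D clusters.
labeled = {
    "concept": [(0.0, 0.0), (0.2, 0.1), (-0.1, 0.3)],
    "background": [(3.0, 3.0), (3.2, 2.9), (2.8, 3.1)],
}
post = kde_posteriors((0.1, 0.1), labeled, bandwidth=0.5)
predicted = max(post, key=post.get)
```

With few labeled samples per class, each class-conditional density rests on only a handful of kernels, which is the limitation that motivates also drawing on unlabeled data.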

    Published In

    MM '06: Proceedings of the 14th ACM international conference on Multimedia
    October 2006
    1072 pages
    ISBN:1595934472
    DOI:10.1145/1180639
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. SSLKDE
    2. kernel density estimation
    3. semi-supervised learning
    4. video annotation

    Qualifiers

    • Article

    Conference

    MM06
    MM06: The 14th ACM International Conference on Multimedia 2006
    October 23 - 27, 2006
    Santa Barbara, CA, USA

    Acceptance Rates

    Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

    Cited By

    • (2024) Label Expansion through Walking Trajectories for Wi-Fi CSI-Based Indoor Localization. 2024 IEEE 35th International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC), pp. 1-6. DOI: 10.1109/PIMRC59610.2024.10817203. Online publication date: 2-Sep-2024
    • (2024) Comparative analysis of manual and annotations for crowd assessment and classification using artificial intelligence. Data Science and Management. DOI: 10.1016/j.dsm.2024.04.001. Online publication date: Apr-2024
    • (2021) A Review of Methods for The Image Automatic Annotation. Journal of Physics: Conference Series, 1892:1 (012002). DOI: 10.1088/1742-6596/1892/1/012002. Online publication date: 1-Apr-2021
    • (2019) Weakly supervised segment annotation via expectation kernel density estimation. IET Computer Vision, 13:4 (435-441). DOI: 10.1049/iet-cvi.2018.5325. Online publication date: 8-May-2019
    • (2016) What Is Happening in the Video? Annotate Video by Sentence. IEEE Transactions on Circuits and Systems for Video Technology, 26:9 (1746-1757). DOI: 10.1109/TCSVT.2015.2475815. Online publication date: 1-Sep-2016
    • (2015) A Novel Statistical Approach for Image and Video Retrieval and Its Adaption for Active Learning. Proceedings of the 23rd ACM international conference on Multimedia, pp. 935-938. DOI: 10.1145/2733373.2806368. Online publication date: 13-Oct-2015
    • (2014) Stop-Frame Removal Improves Web Video Classification. Proceedings of International Conference on Multimedia Retrieval, pp. 499-502. DOI: 10.1145/2578726.2578803. Online publication date: 1-Apr-2014
    • (2014) Automatic image annotation by semi-supervised manifold kernel density estimation. Information Sciences: an International Journal, 281 (648-660). DOI: 10.1016/j.ins.2013.09.016. Online publication date: 1-Oct-2014
    • (2014) Typicality ranking. Multimedia Tools and Applications, 70:2 (647-660). DOI: 10.1007/s11042-011-0892-0. Online publication date: 1-May-2014
    • (2013) Multimedia encyclopedia construction by mining web knowledge. Signal Processing, 93:8 (2361-2368). DOI: 10.1016/j.sigpro.2012.06.028. Online publication date: 1-Aug-2013

    View Options

    Login options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Figures

    Tables

    Media

    Share

    Share

    Share this Publication link

    Share on social media