research-article

Annotation propagation in image databases using similarity graphs

Published: 27 December 2013

Abstract

The practicality of large-scale image indexing and querying methods depends crucially upon the availability of semantic information. The manual tagging of images with semantic information is in general very labor intensive, and existing methods for automated image annotation may not always yield accurate results. The aim of this paper is to reduce to a minimum the amount of human intervention required in the semantic annotation of images, while preserving a high degree of accuracy. Ideally, only one copy of each object of interest would be labeled manually, and the labels would then be propagated automatically to all other occurrences of the objects in the database. To this end, we propose an influence propagation strategy, SW-KProp, that requires no human intervention beyond the initial labeling of a subset of the images. SW-KProp distributes semantic information within a similarity graph defined on all images in the database: each image iteratively transmits its current label information to its neighbors, and then readjusts its own label according to the combined influences of its neighbors. SW-KProp influence propagation can be efficiently performed by means of matrix computations, provided that pairwise similarities of images are available. We also propose a variant of SW-KProp which enhances the quality of the similarity graph by selecting a reduced feature set for each prelabeled image and rebuilding its neighborhood. The performances of the SW-KProp method and its variant were evaluated against several competing methods on classification tasks for three image datasets: a handwritten digit dataset, a face dataset and a web image dataset. For the digit images, SW-KProp and its variant performed consistently better than the other methods tested. For the face and web images, SW-KProp outperformed its competitors for the case when the number of prelabeled images was relatively small. The performance was seen to improve significantly when the feature selection strategy was applied.
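
The iterative scheme described in the abstract can be illustrated with a small sketch. The abstract does not give the SW-KProp update rule, so the code below is not the authors' method: it is a generic graph label propagation in which each image's class scores are replaced by a blend of its neighbors' scores and its own manual label, F <- alpha * P * F + (1 - alpha) * Y, where P is the row-normalized similarity matrix and Y holds the initial labels. The function names, the damping factor `alpha`, the iteration count, and the toy similarity values are all assumptions made for illustration.

```python
def row_normalize(weights):
    """Scale each row of a non-negative weight matrix so that it sums to 1."""
    normalized = []
    for row in weights:
        total = sum(row)
        normalized.append([w / total if total > 0 else 0.0 for w in row])
    return normalized


def propagate_labels(weights, seed_labels, num_classes, alpha=0.8, iterations=50):
    """Generic graph label propagation (NOT the paper's exact SW-KProp update).

    weights     -- n x n list of pairwise image similarities (zero diagonal)
    seed_labels -- dict {image_index: class_index} for the prelabeled images
    alpha       -- assumed damping factor: share of a score taken from neighbors
    """
    n = len(weights)
    transition = row_normalize(weights)

    # One-hot matrix of the manual labels; rows of unlabeled images stay zero.
    initial = [[0.0] * num_classes for _ in range(n)]
    for index, label in seed_labels.items():
        initial[index][label] = 1.0

    scores = [row[:] for row in initial]
    for _ in range(iterations):
        updated = []
        for i in range(n):
            row = []
            for c in range(num_classes):
                # Combined influence of the neighbors of image i for class c.
                influence = sum(transition[i][j] * scores[j][c] for j in range(n))
                # Blend the neighbor influence with the original manual label.
                row.append(alpha * influence + (1 - alpha) * initial[i][c])
            updated.append(row)
        scores = updated

    # Each image takes the class with the largest accumulated score.
    return [max(range(num_classes), key=lambda c: scores[i][c]) for i in range(n)]


# Toy graph (made-up similarities): images 0-2 and 3-5 form two tight groups
# joined by one weak edge between images 2 and 3.
similarities = [
    [0.0, 0.9, 0.8, 0.0, 0.0, 0.0],
    [0.9, 0.0, 0.9, 0.0, 0.0, 0.0],
    [0.8, 0.9, 0.0, 0.1, 0.0, 0.0],
    [0.0, 0.0, 0.1, 0.0, 0.9, 0.8],
    [0.0, 0.0, 0.0, 0.9, 0.0, 0.9],
    [0.0, 0.0, 0.0, 0.8, 0.9, 0.0],
]

# Only one image per group is labeled by hand.
predicted = propagate_labels(similarities, {0: 0, 5: 1}, num_classes=2)
print(predicted)  # [0, 0, 0, 1, 1, 1]
```

In this toy case two manual labels are enough to annotate all six images, which is the behavior the abstract aims for. The explicit loops are kept for readability; in practice the same update is a single matrix product per iteration, which is why the approach lends itself to matrix computations.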

    • Published in

      ACM Transactions on Multimedia Computing, Communications, and Applications  Volume 10, Issue 1
      December 2013
      166 pages
      ISSN:1551-6857
      EISSN:1551-6865
      DOI:10.1145/2559928

      Copyright © 2013 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 27 December 2013
      • Accepted: 1 May 2013
      • Revised: 1 November 2012
      • Received: 1 August 2012
      Published in TOMM Volume 10, Issue 1

      Qualifiers

      • research-article
      • Research
      • Refereed
