Abstract
The practicality of large-scale image indexing and querying methods depends crucially on the availability of semantic information. Manually tagging images with semantic information is generally very labor intensive, and existing methods for automated image annotation do not always yield accurate results. The aim of this paper is to minimize the human intervention required in the semantic annotation of images while preserving a high degree of accuracy. Ideally, only one copy of each object of interest would be labeled manually, and the labels would then be propagated automatically to all other occurrences of that object in the database. To this end, we propose an influence propagation strategy, SW-KProp, that requires no human intervention beyond the initial labeling of a subset of the images. SW-KProp distributes semantic information within a similarity graph defined on all images in the database: each image iteratively transmits its current label information to its neighbors, and then readjusts its own label according to the combined influences of its neighbors. Provided that pairwise image similarities are available, SW-KProp influence propagation can be performed efficiently by means of matrix computations. We also propose a variant of SW-KProp that enhances the quality of the similarity graph by selecting a reduced feature set for each prelabeled image and rebuilding its neighborhood. The performance of SW-KProp and its variant was evaluated against several competing methods on classification tasks for three image datasets: a handwritten digit dataset, a face dataset, and a web image dataset. For the digit images, SW-KProp and its variant performed consistently better than the other methods tested. For the face and web images, SW-KProp outperformed its competitors when the number of prelabeled images was relatively small. Performance improved significantly when the feature selection strategy was applied.
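The propagation step described above (each image passes its label information to its neighbors, then readjusts its own label from their combined influence) can be illustrated with a generic graph-based label propagation loop written as matrix products. This is a minimal sketch and not the SW-KProp update itself, which the abstract does not specify: the function name `propagate_labels`, the damping weight `alpha`, the iteration count, and the toy similarity matrix are all assumptions made for illustration.

```python
import numpy as np


def propagate_labels(W, Y0, alpha=0.9, n_iter=50):
    """Generic iterative label propagation on a similarity graph.

    W     : (n, n) nonnegative pairwise similarity matrix.
    Y0    : (n, c) initial label scores; one-hot rows for prelabeled
            images, all-zero rows for unlabeled ones.
    alpha : hypothetical weight given to neighbor influence versus
            the initial labels (an assumption, not from the paper).
    """
    W = np.asarray(W, dtype=float)
    Y0 = np.asarray(Y0, dtype=float)

    # Row-normalize so each image receives a weighted average of
    # its neighbors' current label scores.
    row_sums = W.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0.0] = 1.0  # isolated images keep zero influence
    P = W / row_sums

    F = Y0.copy()
    for _ in range(n_iter):
        # Transmit scores to neighbors, then blend with the initial labels.
        F = alpha * (P @ F) + (1.0 - alpha) * Y0

    return F.argmax(axis=1), F


# Toy graph: two tight groups {0, 1, 2} and {3, 4, 5} joined by one weak edge.
W = np.array([
    [0.0, 1.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.1, 0.0, 0.0],
    [0.0, 0.0, 0.1, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 1.0, 0.0],
])

# Only image 0 (class 0) and image 5 (class 1) are prelabeled.
Y0 = np.zeros((6, 2))
Y0[0, 0] = 1.0
Y0[5, 1] = 1.0

pred, scores = propagate_labels(W, Y0)
print(pred.tolist())  # → [0, 0, 0, 1, 1, 1]
```

In this toy case a single labeled image per group is enough for the labels to reach every other member of the group, which mirrors the "label one copy, propagate to the rest" goal stated above. Blending with `Y0` at every step is one common design choice for keeping the prelabeled images anchored; other schemes clamp the labeled rows instead.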