ABSTRACT
To enable automatic multi-level image annotation, we have addressed two important, interrelated issues: (1) a more effective framework for image content representation and feature extraction to characterize the middle-level semantics of image contents; and (2) a new framework for hierarchical probabilistic image concept reasoning and detection. To address the first issue, salient objects are used as the semantic building blocks to characterize the middle-level semantics of image contents effectively while significantly reducing the image analysis cost. We have proposed three approaches to designing the detection functions for automatic salient object detection, and automatic function selection is also supported to find the "right" assumptions about the principal visual properties of the corresponding salient object classes. To address the second issue, we have proposed a novel framework that incorporates the concept ontology to achieve hierarchical probabilistic image concept reasoning for multi-level image annotation. The concept ontology for a large-scale public image database called LabelMe is semi-automatically derived from the available image labels by using WordNet. The image concepts at the first level of the concept ontology are used to characterize the most specific semantics of image contents with the smallest variations, and their correspondences with the semantic building blocks (i.e., salient objects) are well-defined and can be modeled accurately by using Bayesian networks.
In addition, the predictions of the appearances of the higher-level image concepts with large variations are derived either through the underlying concept ontology or by combining the available predictions of the appearances of their child concepts through hierarchical Bayesian networks. Our experiments on a large public dataset have shown that our framework for hierarchical probabilistic image concept reasoning is scalable to diverse image contents (i.e., a large number of salient object classes) with large within-category variations.
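The idea of predicting a higher-level concept from its child concepts can be illustrated with a small sketch. This is not the hierarchical Bayesian network described in the paper, whose structure and parameters are not given in the abstract. It assumes a noisy-OR combination rule as a stand-in, and the ontology, concept names, and probabilities are all hypothetical.

```python
from math import prod

# Hypothetical two-level ontology: parent concept -> child concepts.
# The concept names are illustrative only, not taken from the paper.
ONTOLOGY = {
    "outdoor": ["beach", "garden"],
    "beach": [],
    "garden": [],
}


def concept_probability(concept, leaf_probs, ontology=ONTOLOGY):
    """Estimate the probability that `concept` appears in an image.

    First-level (leaf) concepts use the supplied predictions directly.
    A parent concept combines its children with a noisy-OR, which is an
    assumed rule standing in for the paper's hierarchical Bayesian network.
    """
    children = ontology.get(concept, [])
    if not children:
        return leaf_probs[concept]
    # Noisy-OR: the parent is absent only if every child is absent.
    return 1.0 - prod(
        1.0 - concept_probability(child, leaf_probs, ontology)
        for child in children
    )


# Made-up first-level predictions for a single image.
leaf_probs = {"beach": 0.5, "garden": 0.2}
outdoor = concept_probability("outdoor", leaf_probs)
print(f"P(outdoor) = {outdoor:.2f}")
```

Under this assumed rule, the parent concept is at least as probable as its most probable child, which matches the intuition that higher-level concepts cover broader variations than the concepts beneath them.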