ABSTRACT
Automatic image annotation is a promising way to enable semantic image retrieval via keywords. In this paper, we propose a multi-level approach to annotating the semantics of natural scenes by using both the dominant image components (salient objects) and the relevant semantic concepts. To achieve automatic image annotation at the content level, we use salient objects as the dominant image components for image content representation and feature extraction. To support automatic image annotation at the concept level, we develop a novel image classification technique that maps images to the most relevant semantic image concepts. In addition, Support Vector Machine (SVM) classifiers are used to learn the detection functions for the pre-defined salient objects, and finite mixture models are used for semantic concept interpretation and modeling. An adaptive EM algorithm is proposed to determine the optimal model structure and model parameters simultaneously. We also demonstrate that our algorithms are effective in enabling multi-level annotation of natural scenes in a large-scale image dataset.
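To make concrete the idea of fitting a finite mixture model while also choosing its structure, the following is a minimal sketch and not the adaptive EM algorithm of this paper, whose details the abstract does not give. It assumes one-dimensional features and Gaussian components, starts with more components than needed, and uses a simple weight-pruning rule as a stand-in for structure selection. The function names `gaussian_pdf` and `fit_mixture` and the parameters `max_components` and `min_weight` are hypothetical.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_mixture(data, max_components=6, min_weight=0.05, iters=100, seed=0):
    """EM for a 1-D Gaussian mixture that prunes low-weight components.

    Illustrative assumption: structure selection is approximated by
    dropping any component whose mixing weight falls below min_weight.
    """
    rng = random.Random(seed)
    n = len(data)
    overall_mean = sum(data) / n
    overall_var = sum((x - overall_mean) ** 2 for x in data) / n

    # Initialise means from random data points, with a shared broad variance.
    means = rng.sample(data, max_components)
    variances = [max(overall_var, 1e-6)] * max_components
    weights = [1.0 / max_components] * max_components

    for _ in range(iters):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in data:
            p = [w * gaussian_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
            total = sum(p) or 1e-300  # guard against numerical underflow
            resp.append([pj / total for pj in p])

        # M-step with pruning: re-estimate parameters, skip weak components.
        new_w, new_m, new_v = [], [], []
        for j in range(len(weights)):
            nj = sum(r[j] for r in resp)
            if nj / n < min_weight:
                continue  # structure step: drop this component
            mean = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - mean) ** 2 for r, x in zip(resp, data)) / nj
            new_w.append(nj / n)
            new_m.append(mean)
            new_v.append(max(var, 1e-6))  # keep variances strictly positive

        # Renormalise the weights of the surviving components.
        total_w = sum(new_w)
        weights = [w / total_w for w in new_w]
        means, variances = new_m, new_v

    return weights, means, variances
```

Calling `fit_mixture(data)` on a list of floats returns the surviving mixing weights, means, and variances. The number of components that remain depends on the data and on `min_weight`, so the pruning rule here should be read as one simple heuristic among many rather than as a recommended criterion.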