Multi-level annotation of natural scenes using dominant image components and semantic concepts
Source: International Multimedia Conference
Proceedings of the 12th annual ACM international conference on Multimedia
New York, NY, USA
SESSION: Technical best paper contest session
Pages: 540-547
Year of Publication: 2004
ISBN: 1-58113-893-8
Authors
Jianping Fan, UNC-Charlotte, Charlotte, NC
Yuli Gao, UNC-Charlotte, Charlotte, NC
Hangzai Luo, UNC-Charlotte, Charlotte, NC
Sponsors
SIGMULTIMEDIA: ACM Special Interest Group on Multimedia
ACM: Association for Computing Machinery
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1027527.1027660

ABSTRACT

Automatic image annotation is a promising way to enable semantic image retrieval via keywords. In this paper, we propose a multi-level approach that annotates the semantics of natural scenes using both the dominant image components (salient objects) and the relevant semantic concepts. To achieve automatic image annotation at the content level, we use salient objects as the dominant image components for image content representation and feature extraction. To support automatic image annotation at the concept level, we develop a novel image classification technique that maps images to their most relevant semantic image concepts. Support Vector Machine (SVM) classifiers learn the detection functions for the pre-defined salient objects, and finite mixture models are used for semantic concept interpretation and modeling. An adaptive EM algorithm is proposed to determine the optimal model structure and model parameters simultaneously. We also demonstrate that our algorithms are effective at enabling multi-level annotation of natural scenes in a large-scale image dataset.
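
The abstract names finite mixture models fitted by an adaptive EM algorithm that chooses model structure and parameters together, but it gives no details. As a rough illustration only, the sketch below fits one-dimensional Gaussian mixtures with plain EM and picks the number of components by BIC. This is a stand-in technique, not the authors' adaptive EM. The function names (fit_mixture, bic, select_components), the 1-D Gaussian components, the evenly spaced initialization, and the variance floor are all assumptions made for the example.

```python
import math


def component_logs(x, weights, means, variances):
    """Log of (weight * Gaussian density) for each component at point x."""
    return [
        math.log(w) - 0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        for w, m, v in zip(weights, means, variances)
    ]


def log_likelihood(xs, weights, means, variances):
    """Total log-likelihood of xs under the mixture (log-sum-exp per point)."""
    total = 0.0
    for x in xs:
        logs = component_logs(x, weights, means, variances)
        top = max(logs)
        total += top + math.log(sum(math.exp(v - top) for v in logs))
    return total


def fit_mixture(xs, k, iters=100):
    """Plain EM for a k-component 1-D Gaussian mixture (hypothetical helper)."""
    n = len(xs)
    lo, hi = min(xs), max(xs)
    grand_mean = sum(xs) / n
    grand_var = sum((x - grand_mean) ** 2 for x in xs) / n
    # Assumed safeguard: a variance floor keeps components from collapsing.
    var_floor = max(1e-3 * grand_var, 1e-12)
    weights = [1.0 / k] * k
    # Assumed initialization: means spread evenly over the data range.
    means = [lo + (j + 0.5) * (hi - lo) / k for j in range(k)]
    variances = [max(grand_var, var_floor)] * k
    for _ in range(iters):
        # E-step: posterior responsibility of each component for each point.
        resp = []
        for x in xs:
            logs = component_logs(x, weights, means, variances)
            top = max(logs)
            unnorm = [math.exp(v - top) for v in logs]
            z = sum(unnorm)
            resp.append([u / z for u in unnorm])
        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            spread = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(spread, var_floor)
    return weights, means, variances


def bic(xs, k, iters=100):
    """Bayesian information criterion for a fitted k-component mixture."""
    weights, means, variances = fit_mixture(xs, k, iters)
    n_params = 3 * k - 1  # k means, k variances, k - 1 free weights
    ll = log_likelihood(xs, weights, means, variances)
    return -2.0 * ll + n_params * math.log(len(xs))


def select_components(xs, max_k=4):
    """Return the component count in 1..max_k with the lowest BIC."""
    return min(range(1, max_k + 1), key=lambda k: bic(xs, k))
```

Calling select_components on a list of scalar feature values returns a component count, and fit_mixture with that count returns the weights, means, and variances. Fitting each candidate size separately and comparing BIC is simpler than adjusting the structure during EM, at the cost of running EM once per candidate.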

