To search or to label?: predicting the performance of search-based automatic image classifiers
Source
International Multimedia Conference
Proceedings of the 8th ACM international workshop on Multimedia information retrieval
Santa Barbara, California, USA
SESSION: Special session 1: query systems for data retrieval in large personal image and video databases
Pages: 249 - 258
Year of Publication: 2006
ISBN: 1-59593-495-2
ABSTRACT
In this work we explore the trade-offs in acquiring training data for image classification models through automated web search as opposed to human annotation. Automated web search comes at no cost in human labor but sometimes leads to decreased classification performance, while human annotations come at great expense in human labor but result in better performance. The primary contribution of this work is a system for predicting which visual concepts will show the greatest increase in performance from investing human effort in obtaining annotations. We propose to build this system as an estimator of the absolute gain in average precision (AP) obtained by using human annotations instead of web search. To estimate the AP gain, we rely on statistical classifiers built on top of a number of quality prediction features. We employ a feature selection algorithm to compare the quality of each of the predictors and find that cross-domain image similarity and cross-domain model generalization metrics are strong predictors, while concept frequency and within-domain model quality are weak predictors. In a test application, we find that the prediction scheme can save up to 75% of the annotation effort while incurring only marginal damage (a 10% relative decrease in mean average precision) to the overall performance of the concept models.
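As a rough illustration of the quantities the abstract refers to, the Python sketch below computes non-interpolated average precision for a ranked result list, the absolute AP gain between an annotation-trained and a search-trained model, and a simple budgeted choice of concepts by predicted gain. The function names, concept names, and gain values are hypothetical and do not come from the paper. The sketch also assumes that the ranked list contains every relevant item, and it takes the predicted gains as given rather than reproducing the paper's statistical classifiers over quality prediction features.

```python
from typing import Dict, List, Sequence


def average_precision(ranked_relevance: Sequence[int]) -> float:
    """Non-interpolated AP over a ranked list of 0/1 relevance labels.

    Assumption: the list contains all relevant items, so AP is
    normalized by the number of relevant items found in the list.
    """
    hits = 0
    precision_sum = 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0


def absolute_ap_gain(ap_annotated: float, ap_search: float) -> float:
    """Absolute AP gain from using human annotations instead of web search."""
    return ap_annotated - ap_search


def concepts_to_annotate(predicted_gain: Dict[str, float], budget: int) -> List[str]:
    """Pick the `budget` concepts with the largest predicted AP gain."""
    ranked = sorted(predicted_gain, key=predicted_gain.get, reverse=True)
    return ranked[:budget]


if __name__ == "__main__":
    # Hypothetical predicted gains for four concepts (illustrative values only).
    gains = {"boat": 0.30, "sky": 0.02, "crowd": 0.15, "road": 0.05}
    print(concepts_to_annotate(gains, budget=1))
```

With four hypothetical concepts and a budget of one, only a quarter of the concepts would be sent for human annotation; the remaining concepts would keep their search-trained models.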
CITED BY 3
Lyndon Kennedy, Mor Naaman, Shane Ahern, Rahul Nair, Tye Rattenbury, How flickr helps us make sense of the world: context and content in community-contributed media collections, Proceedings of the 15th international conference on Multimedia, September 25-29, 2007, Augsburg, Germany