ABSTRACT
Annotated collections of images and videos are a necessary basis for the successful development of multimedia retrieval systems. The underlying models of such systems rely heavily on the quality and availability of large training collections. Annotating large collections, however, is a time-consuming and error-prone task because it has to be performed by human annotators. In this paper we present the IBM Efficient Video Annotation (EVA) system, a server-based tool for semantic concept annotation of large video and image collections. It is optimised for collaborative annotation and includes features such as workload sharing and support for conducting inter-annotator analysis. We discuss initial results of an ongoing user evaluation of this system. The results are based on data collected during the 2005 TRECVID Annotation Forum, where more than 100 annotators used the system.