DOI: 10.1145/1459359.1459374
research-article

Exploring multimedia in a keyword space

Published: 26 October 2008

Abstract

We address the problem of searching multimedia by semantic similarity in a keyword space. In contrast to previous research, we represent multimedia content by a vector of keywords instead of a vector of low-level features. This vector of keywords can be obtained from manual user annotations or computed by an automatic annotation algorithm. In this setting, we studied the influence of two aspects of the search-by-semantic-similarity process: (1) the accuracy of user keywords versus automatic keywords and (2) the functions used to compute semantic similarity between the keyword vectors of two multimedia documents. We consider these two aspects crucial to the design of a keyword space that can exploit social-media information and enrich applications such as Flickr and YouTube. Experiments were performed on an image dataset and a video dataset, each with a large number of keywords, using different similarity functions and two annotation methods. Surprisingly, we found that multimedia semantic similarity with automatic keywords performs as well as or better than semantic similarity with user keywords that are 95% accurate.
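
As an illustration of the idea of comparing documents in a keyword space, the following minimal sketch ranks documents by the similarity of their keyword vectors. The abstract does not state which similarity functions were evaluated, so cosine similarity is an assumption chosen for illustration. The function names, the keywords, and the weights are all hypothetical; a weight could stand for a 0/1 manual annotation or for a confidence score from an automatic annotator.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two keyword vectors stored as dicts
    mapping keyword -> weight. Missing keywords count as weight 0."""
    dot = sum(w * b.get(k, 0.0) for k, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, collection):
    """Return document names sorted from most to least similar to query."""
    return sorted(
        collection,
        key=lambda name: cosine_similarity(query, collection[name]),
        reverse=True,
    )


# Hypothetical keyword vectors (illustrative values, not data from the paper).
beach_photo = {"sea": 1.0, "sand": 1.0, "sky": 1.0}  # e.g. manual 0/1 tags
collection = {
    "harbour_video": {"sea": 0.9, "boat": 0.8, "sky": 0.4},  # e.g. automatic scores
    "city_photo": {"building": 1.0, "car": 0.7, "sky": 0.2},
}

ranking = rank_by_similarity(beach_photo, collection)
print(ranking)
```

Because both manual and automatic annotations end up as vectors over the same keyword vocabulary, the same ranking function can be applied to either kind of annotation, which is what makes the comparison described in the abstract possible.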

Published In

MM '08: Proceedings of the 16th ACM international conference on Multimedia
October 2008
1206 pages
ISBN: 9781605583037
DOI: 10.1145/1459359

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. automatic keyword annotations
  2. keyword spaces
  3. multimedia
  4. search
  5. user keyword annotations

Qualifiers

  • Research-article

Conference

MM08: ACM Multimedia Conference 2008
October 26-31, 2008
Vancouver, British Columbia, Canada

Acceptance Rates

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
