DOI: 10.1145/2671188.2749373
research-article

Swap Retrieval: Retrieving Images of Cats When the Query Shows a Dog

Published: 22 June 2015

ABSTRACT

Query-by-example remains popular in image retrieval because it can exploit contextual information encoded in the image that is difficult to express in a traditional textual query. Textual queries, on the other hand, offer more flexibility: a text query is easy to reformulate and refine based on initial results.

In this work we take a first step towards getting the best of both worlds: we use an image to specify the context, but let the user specify a related category as the main search criterion. For instance, starting from an image of a dog in a certain situation/context, the goal is to find images of cats in a similar situation/context.

We present an evaluation scheme for this new and challenging task, which we call swap retrieval, and use it to compare various methods. Results show that standard query-by-example techniques do not adapt well to the new task. Instead, techniques based on semantic knowledge extracted from textual descriptions available at training time perform reasonably well, although they are still far from the performance needed for practical use.


Published in

ICMR '15: Proceedings of the 5th ACM on International Conference on Multimedia Retrieval
June 2015
700 pages
ISBN: 9781450332743
DOI: 10.1145/2671188

Copyright © 2015 ACM


Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance Rates

ICMR '15 paper acceptance rate: 48 of 127 submissions, 38%. Overall acceptance rate: 254 of 830 submissions, 31%.
