ABSTRACT
Query-by-example remains popular in image retrieval because it can exploit contextual information encoded in the image that is difficult to express in a traditional textual query. Textual queries, on the other hand, are more flexible: a text query is easy to reformulate and refine based on initial results.
In this work we take a first step towards getting the best of both worlds: we use an image to specify the context, but let the user specify a related category as the main search criterion. For instance, starting from an image of a dog in a certain situation/context, the goal is to find images of cats in a similar situation/context.
We present an evaluation scheme for this new and challenging task, which we call swap retrieval, and use it to compare various methods. Results show that standard query-by-example techniques do not adapt well to the new task. Instead, techniques based on semantic knowledge extracted from textual descriptions available at training time perform reasonably well, although they are still far from the performance needed for practical use.
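To make the task definition concrete, the following is a minimal toy sketch of swap retrieval, not any of the methods compared in the paper. It assumes each image already carries a category label and a set of hand-written context tags; the `Image` class, the `swap_retrieve` function, the tags, and the use of Jaccard overlap as a context similarity are all hypothetical choices made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Image:
    """Toy image record; the context tags are a hypothetical stand-in for learned features."""
    name: str
    category: str
    context: frozenset


def jaccard(a, b):
    """Jaccard overlap between two tag sets (assumed context similarity)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def swap_retrieve(query, target_category, collection, k=3):
    """Return up to k images of target_category, ranked by context similarity to the query."""
    # The category comes from the user, the context comes from the query image.
    candidates = [img for img in collection if img.category == target_category]
    ranked = sorted(
        candidates,
        key=lambda img: jaccard(query.context, img.context),
        reverse=True,
    )
    return ranked[:k]


collection = [
    Image("cat_sofa", "cat", frozenset({"indoor", "sofa", "sleeping"})),
    Image("cat_garden", "cat", frozenset({"outdoor", "grass", "running"})),
    Image("dog_park", "dog", frozenset({"outdoor", "grass", "running"})),
]
query = Image("dog_beach", "dog", frozenset({"outdoor", "running", "sand"}))

results = swap_retrieve(query, "cat", collection)
print([img.name for img in results])
```

In this toy setting the dog image only supplies the context, while the requested category ("cat") filters the candidates, which is the division of roles the abstract describes.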
Index Terms
- Swap Retrieval: Retrieving Images of Cats When the Query Shows a Dog