ABSTRACT
Community-based question answering systems have become very popular for providing answers to a wide variety of "how-to" questions. However, most such systems present only textual answers. In many cases, users would prefer visual answers such as videos, which are more direct and intuitive.
To date, little research has addressed automatically presenting precise reference videos in response to a user's question. In this paper, we explore how to leverage YouTube video collections as a source of reference to fulfill this task, and we develop a novel multimedia application named Video Reference. Generating a video reference takes two steps. The first is recall-driven video search, which increases question coverage by finding other similar questions. The second is precision-based video ranking: a three-level ranking scheme based on visual analysis, opinion analysis, and video redundancy is adopted to find the most relevant video reference on YouTube. Experiments on questions from the Consumer Electronics domain of the Yahoo! Answers archive show the feasibility and effectiveness of our approach.
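The two steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Everything here is an assumption made for illustration: the `Video` fields, the token-overlap (Jaccard) similarity standing in for whatever question matching the system uses, the caller-supplied `search` function standing in for a YouTube query, and the weighted sum standing in for the three-level ranking scheme, whose actual form the abstract does not specify.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Sequence, Tuple


@dataclass(frozen=True)
class Video:
    """Hypothetical candidate video with three precomputed scores in [0, 1]."""
    video_id: str
    visual: float      # assumed output of visual analysis
    opinion: float     # assumed output of opinion analysis on viewer comments
    redundancy: float  # assumed support from near-duplicate videos


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity; a simple stand-in for real question matching."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def expand_question(question: str, archive: Iterable[str],
                    threshold: float = 0.5) -> List[str]:
    """Step 1a: widen coverage by adding similar archived questions."""
    similar = [q for q in archive
               if q != question and jaccard(question, q) >= threshold]
    return [question] + similar


def recall_search(questions: Iterable[str],
                  search: Callable[[str], Sequence[Video]]) -> List[Video]:
    """Step 1b: pool the results of every query, dropping duplicate ids."""
    pooled = {}
    for q in questions:
        for video in search(q):
            pooled.setdefault(video.video_id, video)
    return list(pooled.values())


def rank(videos: Iterable[Video],
         weights: Tuple[float, float, float] = (0.5, 0.3, 0.2)) -> List[Video]:
    """Step 2: order candidates by an assumed weighted sum of the three scores."""
    wv, wo, wr = weights
    return sorted(
        videos,
        key=lambda v: wv * v.visual + wo * v.opinion + wr * v.redundancy,
        reverse=True,
    )
```

A caller would chain the three functions: `rank(recall_search(expand_question(question, archive), search))`. Separating the recall stage from the ranking stage mirrors the split in the abstract, so either stage can be replaced without touching the other.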
Index Terms
- Video reference: question answering on YouTube