ABSTRACT
Websites vary in reliability. One might expect NASA's website to be highly accurate for astronomy questions, and Wikipedia to be less accurate than NASA but still more accurate than a generic Google search. In this research, we pose a large number of "factoid" questions to several search engines, collect the responses, and determine the correctness of each candidate answer. The answers are then grouped by source website and compared across websites to infer each website's correctness.
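The grouping-and-comparison step described in the abstract can be illustrated with a minimal sketch. It assumes that the "correct" answer to each question is approximated by a simple majority vote across websites, which is an illustrative simplification and not necessarily the scoring used in the paper. The function name `consensus_scores`, the triple layout, and the hostnames in the sample data are hypothetical.

```python
from collections import Counter, defaultdict


def consensus_scores(responses):
    """Score each website by how often it agrees with the majority answer.

    responses: iterable of (question, website, answer) triples.
    Majority voting is an assumption made for this sketch only.
    """
    # Group the normalized candidate answers by question.
    by_question = defaultdict(list)
    for question, website, answer in responses:
        by_question[question].append((website, answer.strip().lower()))

    agree = Counter()  # answers that match the consensus, per website
    total = Counter()  # all answers given, per website
    for answers in by_question.values():
        votes = Counter(answer for _, answer in answers)
        consensus, _ = votes.most_common(1)[0]
        for website, answer in answers:
            total[website] += 1
            if answer == consensus:
                agree[website] += 1

    # Fraction of each website's answers that match the consensus.
    return {site: agree[site] / total[site] for site in total}


# Hypothetical sample data: two questions answered by three websites.
sample = [
    ("planets in the solar system", "nasa.gov", "8"),
    ("planets in the solar system", "wikipedia.org", "8"),
    ("planets in the solar system", "example.com", "9"),
    ("the red planet", "nasa.gov", "Mars"),
    ("the red planet", "wikipedia.org", "mars"),
    ("the red planet", "example.com", "mars"),
]
scores = consensus_scores(sample)
```

In this sample, the two websites that always agree with the majority score 1.0, while the website that disagrees on one of its two answers scores 0.5. A fuller treatment would also need to handle ties and near-duplicate answer strings, which this sketch ignores.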
Index Terms
- Predicting website correctness from consensus analysis