ABSTRACT
Conventional search engines such as Bing and Google provide a user with a short answer to some queries as well as a ranked list of documents, in order to better meet her information needs. In this paper we study a class of such queries that we call math queries. Calculations (e.g., "12% of 24$", "square root of 120"), unit conversions (e.g., "convert 10 meter to feet"), and symbolic computations (e.g., "plot x^2+x+1") are examples of math queries. Among the queries that should be answered directly, math queries are special because they are formed from infinitely many combinations of numbers and symbols and rather few keywords. Math queries must therefore be answered through real-time computation rather than keyword search or database lookup. The lack of a formal definition covering the entire range of math queries makes it hard to identify them all automatically. We propose a novel approach for recognizing and classifying math queries using large-scale search logs, and investigate its accuracy through empirical experiments and statistical analysis. The approach allows us to discover classes of math queries even if we do not know their structures in advance. It also helps to identify queries that are not math queries even though they might look like them.
We also evaluate the usefulness of math answers based on implicit feedback from users. Traditional approaches for evaluating the quality of search results mostly rely on click information and interpret a click on a link as a sign of satisfaction. Answers to math queries do not contain links; therefore, such metrics are not applicable to them. In this paper we describe two evaluation metrics that can be applied to math queries, and present results on a large collection of math queries taken from Bing's search logs.
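As a rough illustration of why the example queries in the abstract call for computation rather than lookup, the sketch below evaluates three of them directly. It is not the recognition approach proposed in the paper, which is derived from search logs; the function name `answer_math_query`, the hand-written regular expressions, and the conversion constant are all assumptions made for this example.

```python
import math
import re

# Assumed conversion factor (approximate feet per meter); not from the paper.
METERS_TO_FEET = 3.28084

# Hypothetical pattern for a non-negative decimal number.
NUM = r"(\d+(?:\.\d+)?)"


def answer_math_query(query):
    """Return a numeric answer for a few hand-written query shapes, else None."""
    q = query.strip().lower()

    # Percentage: "<p>% of <n>", with an optional currency sign around <n>.
    m = re.fullmatch(NUM + r"\s*%\s*of\s*\$?\s*" + NUM + r"\s*\$?", q)
    if m:
        return float(m.group(1)) / 100.0 * float(m.group(2))

    # Square root: "square root of <n>".
    m = re.fullmatch(r"square root of\s*" + NUM, q)
    if m:
        return math.sqrt(float(m.group(1)))

    # Unit conversion: "convert <n> meter(s) to feet".
    m = re.fullmatch(r"convert\s*" + NUM + r"\s*meters?\s*to\s*feet", q)
    if m:
        return float(m.group(1)) * METERS_TO_FEET

    # Anything else (e.g. plotting requests) is outside this sketch.
    return None
```

Each pattern covers only one surface form, so a query such as "plot x^2+x+1" returns `None`. The number of such hand-written patterns grows quickly, which is consistent with the abstract's point that the full range of math queries is hard to enumerate in advance.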
Index Terms
- Answering math queries with search engines