ABSTRACT
Mathematical Information Retrieval concerns retrieving information related to a particular mathematical concept. The NTCIR-11 Math Task develops an evaluation test collection for retrieving document sections from scientific articles based on human-generated topics. These topics combine formula patterns and keywords. In addition, the optional Wikipedia Task provides a test collection for retrieving individual mathematical formulae from Wikipedia based on search topics that each contain exactly one formula pattern. We developed a framework for automatic query generation and immediate evaluation. This paper discusses our dataset preparation, topic generation, and evaluation methods, and summarizes the participants' results, with a special focus on the Wikipedia Task.