Improving Entity Ranking for Keyword Queries

Authors:
John Foley, University of Massachusetts Amherst, Amherst, MA, USA
Brendan O'Connor, University of Massachusetts Amherst, Amherst, MA, USA
James Allan, University of Massachusetts Amherst, Amherst, MA, USA

CIKM '16: Proceedings of the 25th ACM International on Conference on Information and Knowledge Management, October 2016, Pages 2061–2064. https://doi.org/10.1145/2983323.2983909

Published: 24 October 2016

ABSTRACT

Knowledge bases about entities are an important part of modern information retrieval systems. A strong ranking of entities can be used to enhance query understanding and document retrieval or can be presented as another vertical to the user. Given a keyword query, our task is to provide a ranking of the entities present in the collection of interest. We are particularly interested in approaches to this problem that generalize to different knowledge bases and different collections. In the past, this kind of problem has been explored in the enterprise domain through Expert Search. Recently, a dataset was introduced for entity ranking from news and web queries from more general TREC collections.

Approaches from prior work leverage a wide variety of lexical resources, such as natural language processing and relations in the knowledge base. We address the question of whether we can achieve competitive performance with minimal linguistic resources.

We propose a set of features that do not require index-time entity linking, and demonstrate competitive performance on the new dataset. As this paper is the first non-introductory work to leverage this new dataset, we also find and correct certain aspects of the benchmark. To support a fair evaluation, we collect 38% more judgments and contribute annotator agreement information.
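
To make the task concrete, the following is a minimal sketch of ranking entities for a keyword query without index-time entity linking. It is an illustration only and is not the feature set proposed in the paper, which the abstract does not describe. The function names, the toy keyword score, the substring matching of entity names, and the sample data are all assumptions made for this example.

```python
from collections import defaultdict


def score_document(query_terms, doc_text):
    """Toy keyword score: total occurrences of the query terms in the document."""
    tokens = doc_text.lower().split()
    return sum(tokens.count(term) for term in query_terms)


def rank_entities(query, documents, entity_names, top_k=2):
    """Rank entities by summing the scores of top-ranked documents that mention them.

    entity_names maps a hypothetical entity id to a surface string. Mentions are
    found by plain substring matching at query time, so no entity linker is run
    when the collection is indexed.
    """
    query_terms = query.lower().split()
    scored = [(score_document(query_terms, text), text) for text in documents]
    # Keep only the top_k documents that match at least one query term.
    matching = [pair for pair in scored if pair[0] > 0]
    top_docs = sorted(matching, key=lambda pair: pair[0], reverse=True)[:top_k]

    entity_scores = defaultdict(float)
    for doc_score, text in top_docs:
        lowered = text.lower()
        for entity_id, name in entity_names.items():
            if name.lower() in lowered:
                entity_scores[entity_id] += doc_score
    # Highest score first; ties are broken by entity id for determinism.
    return sorted(entity_scores.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    documents = [
        "the eiffel tower is a landmark in paris",
        "paris hosts the louvre museum",
        "berlin is the capital of germany",
    ]
    entity_names = {
        "E1": "Eiffel Tower",
        "E2": "Paris",
        "E3": "Berlin",
        "E4": "Louvre",
    }
    print(rank_entities("paris landmark", documents, entity_names))
```

In this sketch the only entity-specific resource is a dictionary of names, which is one way to read "minimal linguistic resources"; a realistic system would replace the toy score with a proper retrieval model and combine several such signals in a learned ranker.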

Index Terms

Information systems
  Information retrieval
    Retrieval models and ranking
      Learning to rank

Published in
CIKM '16: Proceedings of the 25th ACM International on Conference on Information and Knowledge Management
October 2016
2566 pages
ISBN: 9781450340731
DOI: 10.1145/2983323
General Chairs: Snehasis Mukhopadhyay (Indiana University Purdue University Indianapolis, USA), ChengXiang Zhai (University of Illinois at Urbana-Champaign, USA)
Program Chairs: Elisa Bertino (Purdue University), Fabio Crestani (University of Lugano), Javed Mostafa (University of North Carolina), Jie Tang (Tsinghua University), Luo Si (Alibaba Group Inc & Purdue University), Xiaofang Zhou (University of Queensland), Yi Chang (Yahoo Research), Yunyao Li (IBM Research - Almaden), Parikshit Sondhi (WalmartLabs)
Copyright © 2016 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
information retrieval
knowledge-bases
Qualifiers
- short-paper
Acceptance Rates
CIKM '16 Paper Acceptance Rate: 160 of 701 submissions, 23%
Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%