research-article

Short Text Entity Linking with Fine-grained Topics

Authors:
Lihan Chen, Fudan University & CETC Big Data Research Institute Co., Ltd., Shanghai, China
Jiaqing Liang, Fudan University, Shanghai, China
Chenhao Xie, Fudan University, Shanghai, China
Yanghua Xiao, Fudan University, Alibaba Group, Shanghai, China

CIKM '18: Proceedings of the 27th ACM International Conference on Information and Knowledge Management, October 2018, Pages 457–466. https://doi.org/10.1145/3269206.3271809

Published: 17 October 2018

ABSTRACT

A wide range of web corpora take the form of short text, such as QA queries, search queries, and news titles. Entity linking for these short texts is important. Most supervised approaches are not effective for short text entity linking: their training data are not suitable for short text and are insufficient for low-resource languages. Previous unsupervised methods cannot handle the sparsity and noise of short text. We address the problem by mapping the sparse short text to a topic space. We observe that the concepts of entities carry rich topic information and characterize entities at a very fine granularity. Hence, we use the concepts of entities as topics to explicitly represent the context, which helps improve the performance of entity linking for short text. We also leverage our linking approach to segment the short text semantically, and build a system for short text entity recognition and linking. Our entity linking approach achieves state-of-the-art performance on several datasets for the realistic short text entity linking problem.
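The idea of representing a short context in a topic space built from entity concepts can be illustrated with a small sketch. This is not the authors' model: the toy dictionaries `ENTITY_CONCEPTS` and `WORD_CONCEPTS`, the function names, and the use of cosine similarity are all assumptions made for illustration. The sketch maps the words around a mention to concepts, then picks the candidate entity whose own concepts best match that concept vector.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy data: candidate entity -> its concepts (used as fine-grained topics).
ENTITY_CONCEPTS = {
    "Apple Inc.": ["company", "technology company", "phone maker"],
    "Apple (fruit)": ["fruit", "food", "plant"],
}

# Hypothetical toy data: context word -> concepts that the word suggests.
WORD_CONCEPTS = {
    "iphone": ["phone maker", "technology company"],
    "pie": ["food"],
    "recipe": ["food"],
}


def cosine(a, b):
    """Cosine similarity between two sparse concept-count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def context_topics(text, mention):
    """Map the words around the mention into the concept (topic) space."""
    topics = Counter()
    for word in text.lower().split():
        if word != mention:
            topics.update(WORD_CONCEPTS.get(word, []))
    return topics


def link(mention, text, candidates):
    """Return the best-matching candidate and the score of every candidate."""
    ctx = context_topics(text, mention)
    scores = {c: cosine(ctx, Counter(ENTITY_CONCEPTS[c])) for c in candidates}
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    candidates = ["Apple Inc.", "Apple (fruit)"]
    for query in ["apple iphone price", "apple pie recipe"]:
        best, scores = link("apple", query, candidates)
        print(query, "->", best, scores)
```

In this toy setting the only evidence is concept overlap, so a word with no concept entry (such as "price") contributes nothing. A realistic system would draw the concepts from a knowledge base and would also need candidate generation and a prior over entities, none of which is modeled here.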


Index Terms

1. Information systems
  1. Information retrieval
    1. Retrieval tasks and goals
      1. Information extraction


Published in
CIKM '18: Proceedings of the 27th ACM International Conference on Information and Knowledge Management
October 2018
2362 pages
ISBN:9781450360142
DOI:10.1145/3269206
General Chair: Alfredo Cuzzocrea (University of Trieste, Italy)
Program Chairs: James Allan (University of Massachusetts, USA), Norman Paton (University of Manchester, United Kingdom), Divesh Srivastava (AT&T Labs Research, USA), Rakesh Agrawal (Data Insights Lab, USA), Andrei Broder (Google Research, USA), Mohammed Zaki (Rensselaer Polytechnic Institute, USA), Selcuk Candan (Arizona State University, USA), Alexandros Labrinidis (University of Pittsburgh, USA), Assaf Schuster (Technion, Israel), Haixun Wang (Google Research, USA)
Copyright © 2018 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
concepts
entity linking
fine-grained topics
short text
Acceptance Rates
CIKM '18 Paper Acceptance Rate: 147 of 826 submissions, 18%
Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%