DOI: 10.1145/3269206.3271809
research-article

Short Text Entity Linking with Fine-grained Topics

Published: 17 October 2018

ABSTRACT

A wide range of web corpora take the form of short text, such as QA queries, search queries, and news titles, so entity linking for short text is important. Most supervised approaches are not effective for short text entity linking. Their training data are not suitable for short text and are insufficient for low-resource languages. Previous unsupervised methods cannot handle the sparsity and noise of short text. We address the problem by mapping the sparse short text to a topic space. We observe that the concepts of entities carry rich topic information and characterize entities at a very fine granularity. Hence, we use the concepts of entities as topics to explicitly represent the context, which helps improve the performance of entity linking for short text. We also leverage our linking approach to segment short text semantically, and build a system for short text entity recognition and linking. Our entity linking approach achieves state-of-the-art performance on several datasets for the realistic short text entity linking problem.
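The core idea, representing a short context in a topic space whose dimensions are entity concepts, can be illustrated with a toy sketch. This is not the paper's model: the knowledge base `CANDIDATES`, the word-to-concept weights `WORD_CONCEPTS`, and the additive scoring rule are all hypothetical and chosen only to show how concept overlap could separate two candidate entities for the same mention.

```python
from collections import Counter

# Hypothetical toy knowledge base: candidate entities for a mention,
# each described by a set of fine-grained concepts (used here as topics).
CANDIDATES = {
    "apple": {
        "Apple Inc.": {"company", "technology brand"},
        "Apple (fruit)": {"fruit", "food"},
    },
}

# Hypothetical weights saying how strongly a context word evokes a concept.
WORD_CONCEPTS = {
    "iphone": {"technology brand": 1.0, "company": 0.5},
    "price": {"company": 0.2, "food": 0.2},
    "pie": {"food": 1.0, "fruit": 0.5},
    "recipe": {"food": 1.0},
}


def context_topics(words):
    """Map sparse context words to a weighted bag of concepts."""
    topics = Counter()
    for word in words:
        for concept, weight in WORD_CONCEPTS.get(word, {}).items():
            topics[concept] += weight
    return topics


def link(mention, text):
    """Pick the candidate whose concepts best match the context topics."""
    words = [w for w in text.lower().split() if w != mention]
    topics = context_topics(words)
    # Score = total context weight on the candidate's own concepts.
    scores = {
        entity: sum(topics[c] for c in concepts)
        for entity, concepts in CANDIDATES[mention].items()
    }
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    print(link("apple", "apple iphone price")[0])
    print(link("apple", "apple pie recipe")[0])
```

Even with two or three context words, the concept space gives each candidate a non-zero, comparable score. That is the intuition behind using concepts to counter sparsity, though a real system would need learned weights and a full knowledge base.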


    • Published in

      CIKM '18: Proceedings of the 27th ACM International Conference on Information and Knowledge Management
      October 2018
      2362 pages
      ISBN:9781450360142
      DOI:10.1145/3269206

      Copyright © 2018 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      CIKM '18 paper acceptance rate: 147 of 826 submissions, 18%. Overall acceptance rate: 1,861 of 8,427 submissions, 22%.
