Do We Need Entity-Centric Knowledge Bases for Entity Disambiguation?

Authors:
Stefan Zwicklbauer, University of Passau, Passau, 94032 Germany
Christin Seifert, University of Passau, Passau, 94032 Germany
Michael Granitzer, University of Passau, Passau, 94032 Germany
i-Know '13: Proceedings of the 13th International Conference on Knowledge Management and Knowledge Technologies, September 2013, Article No.: 4, Pages 1–8, https://doi.org/10.1145/2494188.2494198

Published: 04 September 2013


ABSTRACT

Entity disambiguation has been studied extensively over the last 10 years, with authors reporting increasingly well-performing systems. However, most studies have focused on general-purpose knowledge bases such as Wikipedia or DBpedia and have left open the question of how those results generalize to more specialized domains. This is especially important in the context of Linked Open Data, which forms an enormous resource for disambiguation. The influence of domain heterogeneity, size, and quality of the knowledge base remains largely unexplored. In this paper we present an extensive set of experiments on special-purpose knowledge bases from the biomedical domain, in which we evaluate disambiguation performance along four variables: (i) the representation of the knowledge base as either entity-centric or document-centric, (ii) the size of the knowledge base in terms of entities covered, (iii) the semantic heterogeneity of a domain, and (iv) the quality and completeness of a knowledge base. Our results show that for special-purpose knowledge bases (i) document-centric disambiguation significantly outperforms entity-centric disambiguation, (ii) document-centric disambiguation does not depend on the size of the knowledge base, while entity-centric approaches do, and (iii) disambiguation performance varies greatly across domains. These results suggest that domain heterogeneity, size, and knowledge base quality have to be carefully considered in the design of entity disambiguation systems and that, for constructing knowledge bases, user-annotated texts are preferable to carefully constructed knowledge bases.
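To make the distinction between the two knowledge base representations concrete, the following is a minimal sketch and not the authors' system. It assumes that an entity-centric knowledge base stores one description per entity, while a document-centric one stores annotated texts in which surface forms are linked to entity identifiers. The toy data, the Jaccard token overlap, the similarity-weighted vote, and all function names are illustrative assumptions; the abstract does not specify the retrieval or ranking model used in the paper.

```python
from collections import defaultdict


def tokens(text):
    """Lowercased whitespace tokens as a set; a stand-in for real preprocessing."""
    return set(text.lower().split())


def overlap(a, b):
    """Jaccard overlap between two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Entity-centric KB (hypothetical toy data): one textual description per entity.
ENTITY_KB = {
    "cold_illness": "viral infection of the nose and throat causing cough",
    "cold_temperature": "low temperature weather condition",
}

# Document-centric KB (hypothetical toy data): annotated texts, each with a
# mapping from surface form to the entity id an annotator assigned to it.
DOC_KB = [
    ("patient reports cough and sore throat after a cold", {"cold": "cold_illness"}),
    ("cold weather and low temperature slowed the reaction", {"cold": "cold_temperature"}),
    ("the cold caused a runny nose and cough", {"cold": "cold_illness"}),
]


def disambiguate_entity_centric(context, candidates):
    """Pick the candidate whose own description best matches the context."""
    ctx = tokens(context)
    return max(candidates, key=lambda e: overlap(ctx, tokens(ENTITY_KB[e])))


def disambiguate_document_centric(surface, context, k=2):
    """Let the k most similar annotated documents vote, weighted by similarity."""
    ctx = tokens(context)
    scored = [
        (overlap(ctx, tokens(text)), annotations[surface])
        for text, annotations in DOC_KB
        if surface in annotations
    ]
    top = sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]
    votes = defaultdict(float)
    for similarity, entity in top:
        votes[entity] += similarity
    # None signals that no annotated document contains the surface form.
    return max(votes, key=votes.get) if votes else None
```

In this sketch the entity-centric variant needs a description for every candidate entity, whereas the document-centric variant only needs annotated examples of the surface form, which is one way to picture why the two representations can behave differently as a knowledge base grows.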



Published in
i-Know '13: Proceedings of the 13th International Conference on Knowledge Management and Knowledge Technologies
September 2013
271 pages
ISBN: 9781450323000
DOI: 10.1145/2494188
Editors: Stefanie Lindstaedt (Know-Center Graz & Graz University of Technology, Austria), Michael Granitzer (University of Passau, Germany)
Copyright © 2013 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Entity Disambiguation
Linked Data
Text Annotation
Qualifiers
- research-article
Acceptance Rates
i-Know '13 Paper Acceptance Rate: 27 of 87 submissions, 31%
Overall Acceptance Rate: 77 of 238 submissions, 32%
