DOI: 10.1145/2993318.2993334

Don't compare Apples to Oranges: Extending GERBIL for a fine grained NEL evaluation

Published: 12 September 2016

Abstract

In recent years, named entity linking (NEL) tools were primarily developed as general-purpose approaches, whereas today numerous tools focus on specific domains, e.g., the mapping of persons and organizations only, or the annotation of locations or events in microposts. However, the available benchmark datasets used for the evaluation of NEL tools do not reflect this trend toward specialization. We have analyzed the evaluation process applied in the NEL benchmarking framework GERBIL [16] and its benchmark datasets. Based on these insights, we extend the GERBIL framework to enable a more fine-grained evaluation and an in-depth analysis of the benchmark datasets used, according to different emphases. In this paper, we present the implementation of an adaptive filter for arbitrary entities as well as a system that automatically measures benchmark dataset properties, such as the extent of content-related ambiguity and diversity. The implementation as well as a result visualization are integrated into the publicly available GERBIL framework.
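The abstract does not spell out how the dataset properties are computed. As an illustration only, the sketch below assumes the definitions commonly used in NEL benchmark analyses: the ambiguity of a surface form is the number of distinct entities it is linked to in the dataset, and the diversity of an entity is the number of distinct surface forms referring to it. The function name, the toy annotations, and the `dbr:` identifiers are hypothetical, not taken from the paper.

```python
from collections import defaultdict

def ambiguity_and_diversity(annotations):
    """Average ambiguity and diversity over (surface_form, entity) pairs.

    Ambiguity of a surface form: number of distinct entities it links to.
    Diversity of an entity: number of distinct surface forms used for it.
    """
    entities_per_sf = defaultdict(set)   # surface form -> linked entities
    sfs_per_entity = defaultdict(set)    # entity -> observed surface forms
    for surface_form, entity in annotations:
        entities_per_sf[surface_form.lower()].add(entity)
        sfs_per_entity[entity].add(surface_form.lower())
    avg_ambiguity = sum(len(e) for e in entities_per_sf.values()) / len(entities_per_sf)
    avg_diversity = sum(len(s) for s in sfs_per_entity.values()) / len(sfs_per_entity)
    return avg_ambiguity, avg_diversity

# Toy dataset: "Paris" is ambiguous (city vs. person), and dbr:Paris
# is referred to by two different surface forms.
toy = [
    ("Paris", "dbr:Paris"),
    ("Paris", "dbr:Paris_Hilton"),
    ("City of Light", "dbr:Paris"),
    ("Berlin", "dbr:Berlin"),
]
print(ambiguity_and_diversity(toy))  # → (1.3333333333333333, 1.3333333333333333)
```

A dataset dominated by unambiguous mentions (both averages near 1.0) tests linking systems far less than one with high ambiguity, which is exactly the kind of property-aware comparison the extended framework is meant to enable.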

References

[1]
M. Cornolti, P. Ferragina, and M. Ciaramita. A framework for benchmarking entity-annotation systems. In 22nd World Wide Web Conference. ACM, 2013.
[2]
B. Hachey, J. Nothman, and W. Radford. Cheap and easy entity evaluation. In 52nd Annual Meeting of the Association for Computational Linguistics, pages 464--469. ACL, 2014.
[3]
J. Hoffart, S. Seufert, D. B. Nguyen, M. Theobald, and G. Weikum. KORE: Keyphrase overlap relatedness for entity disambiguation. In 21st ACM Int. Conf. on Information and Knowledge Management, pages 545--554, New York, NY, USA, 2012. ACM.
[4]
X. Ling, S. Singh, and D. S. Weld. Design Challenges for Entity Linking. Transactions of the Association for Computational Linguistics, 3:315--328, 2015.
[5]
L. Page, S. Brin, R. Motwani, and T. Winograd. The PageRank Citation Ranking: Bringing Order to the Web. Stanford InfoLab, 1999.
[6]
S. Pradhan, X. Luo, M. Recasens, E. H. Hovy, V. Ng, and M. Strube. Scoring coreference partitions of predicted mentions: A reference implementation. In 52nd Annual Meeting of the Association for Computational Linguistics, pages 30--35. ACL, 2014.
[7]
D. Reddy, M. Knuth, and H. Sack. DBpedia GraphMeasures. Hasso Plattner Institute, Potsdam, July 2014, http://s16a.org/node/6.
[8]
G. Rizzo, A. E. C. Basave, B. Pereira, and A. Varga. Making sense of microposts (#microposts2015) named entity recognition and linking (NEEL) challenge. In 5th Workshop on Making Sense of Microposts at 24th Int. World Wide Web Conference, volume 1395 of CEUR-WS, pages 44--53, 2015.
[9]
G. Rizzo and R. Troncy. NERD: A framework for unifying named entity recognition and disambiguation web extraction tools, Eurecom 3677, Avignon, France, 2012.
[10]
G. Rizzo, M. van Erp, and R. Troncy. Benchmarking the extraction and disambiguation of named entities on the semantic web. In 9th Int. Conf. on Language Resources and Evaluation. ELRA, 2014.
[11]
M. Röder, R. Usbeck, and A.-C. Ngonga Ngomo. Gerbil's new stunts: Semantic annotation benchmarking improved. Technical report, Leipzig University, 2016.
[12]
W. Shen, J. Wang, and J. Han. Entity linking with a knowledge base: Issues, techniques, and solutions. IEEE Transactions on Knowledge and Data Engineering, 27(2):443--460, Feb 2015.
[13]
A. Singhal. Introducing the knowledge graph: things, not strings. Official Google Blog, May, 2012.
[14]
N. Steinmetz, M. Knuth, and H. Sack. Statistical analyses of named entity disambiguation benchmarks. In Proc. of NLP & DBpedia 2013 workshop at 12th Int. Semantic Web Conference. CEUR-WS, 2013.
[15]
T. Tietz, J. Waitelonis, J. Jäger, and H. Sack. Smart Media Navigator: Visualizing recommendations based on Linked Data. In 13th Int. Semantic Web Conference, Industry Track, pages 48--51, 2014.
[16]
R. Usbeck et al. GERBIL -- general entity annotation benchmark framework. In 24th World Wide Web Conference. ACM, 2015.
[17]
M. van Erp, P. Mendes, H. Paulheim, F. Ilievski, J. Plu, G. Rizzo, and J. Waitelonis. Evaluating entity linking: An analysis of current benchmark datasets and a roadmap for doing a better job. In 10th edition of the Language Resources and Evaluation Conference. ELRA, 2016.
[18]
J. Waitelonis, C. Exeler, and H. Sack. Linked Data Enabled Generalized Vector Space Model to Improve Document Retrieval. In NLP & DBpedia 2015 workshop at 14th Int. Semantic Web Conf. CEUR-WS, 2015.



Published In

SEMANTiCS 2016: Proceedings of the 12th International Conference on Semantic Systems
September 2016
207 pages
ISBN:9781450347525
DOI:10.1145/2993318
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

In-Cooperation

  • Ghent University
  • AIT Austrian Institute of Technology
  • Stanford University
  • Wolters Kluwer, Germany
  • Semantic Web Company

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

SEMANTiCS 2016

Acceptance Rates

SEMANTiCS 2016 paper acceptance rate: 18 of 85 submissions (21%);
overall acceptance rate: 40 of 182 submissions (22%)


Cited By

  • (2024) Entity linking for English and other languages: a survey. Knowledge and Information Systems, 66(7):3773--3824. DOI: 10.1007/s10115-023-02059-2
  • (2020) Fine-Grained Entity Linking. Journal of Web Semantics, 100600. DOI: 10.1016/j.websem.2020.100600
  • (2019) Remixing entity linking evaluation datasets for focused benchmarking. Semantic Web, 10(2):385--412. DOI: 10.3233/SW-180334
  • (2018) GERBIL – Benchmarking Named Entity Recognition and Linking consistently. Semantic Web, 9(5):605--625. DOI: 10.3233/SW-170286
  • (2017) MAG. Proceedings of the 9th Knowledge Capture Conference, pages 1--8. DOI: 10.1145/3148011.3148024
  • (2016) Entities as Topic Labels: Combining Entity Linking and Labeled LDA to Improve Topic Interpretability and Evaluability. Italian Journal of Computational Linguistics, 2(2):67--87. DOI: 10.4000/ijcol.392
