short-paper

Extracting text keywords using WordNet

Authors:
Catalin Cerbulescu, University of Craiova, Romania
Georgiana Silvia Leotescu, University of Craiova, Romania

BCI '17: Proceedings of the 8th Balkan Conference in Informatics, September 2017, Article No.: 14, Pages 1–4, https://doi.org/10.1145/3136273.3136280

Published: 20 September 2017

ABSTRACT

Summarizing, extracting keywords from, sorting, and filtering large quantities of text have been the subject of various algorithms based on lexical databases, because these tasks are difficult and time-consuming for a human operator. Our proposed algorithm is based on the WordNet lexical database. It first eliminates the connecting words. For the remaining words (only nouns and verbs), we build a tree with several levels of more generic terms, such as the hypernym or the lexicographer file. Using custom weights for each tree level together with statistical analysis, we extract a restricted set of words that define the keywords of a document. These results can be used to sort and filter documents by relevance.
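The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `TOY_LEXICON` table is a hypothetical stand-in for WordNet lookups, the three-level tree (word, hypernym, lexicographer file) and the `LEVEL_WEIGHTS` values are assumptions chosen for illustration, and the "statistical analysis" is reduced to a weighted frequency count.

```python
from collections import Counter

# Hypothetical stand-in for WordNet lookups (illustration only):
# word -> (part of speech, hypernym, lexicographer file)
TOY_LEXICON = {
    "dog": ("n", "canine", "noun.animal"),
    "wolf": ("n", "canine", "noun.animal"),
    "cat": ("n", "feline", "noun.animal"),
    "run": ("v", "travel", "verb.motion"),
    "house": ("n", "building", "noun.artifact"),
}

# Assumed weights per tree level: the word, its hypernym, its lexicographer file.
LEVEL_WEIGHTS = (1.0, 0.5, 0.25)


def score_terms(text):
    """Accumulate a weighted score for every term on every tree level."""
    scores = Counter()
    for token in text.lower().split():
        word = token.strip(".,;:!?")
        entry = TOY_LEXICON.get(word)
        if entry is None:
            # Connecting words and words missing from the lexicon are dropped.
            continue
        _pos, hypernym, lex_file = entry
        for term, weight in zip((word, hypernym, lex_file), LEVEL_WEIGHTS):
            scores[term] += weight
    return scores


def extract_keywords(text, top_n=3):
    """Return the top_n highest-scoring terms as the document's keywords."""
    return [term for term, _ in score_terms(text).most_common(top_n)]


if __name__ == "__main__":
    sample = "The dog chased the dog and the wolf and the cat."
    print(extract_keywords(sample, top_n=2))
```

In this sketch a generic term such as `canine` can outrank a specific word that appears only once, because several distinct words contribute to it. That is one plausible reading of how weighted tree levels surface a document's topic; the actual weights and selection rule are given in the paper, not here.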

Index Terms

Extracting text keywords using WordNet
1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
      1. Lexical semantics

Published in
BCI '17: Proceedings of the 8th Balkan Conference in Informatics
September 2017
181 pages
ISBN: 9781450352857
DOI: 10.1145/3136273
General Chair: Katerina Zdravkova
Program Chairs: George Eleftherakis, Petros Kefalas
Copyright © 2017 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Hypernymy
Keyword Weight
Lexicographer file
Text Summarization
Word sense analysis
WordNet
Qualifiers
- short-paper
- Research
- Refereed limited
Acceptance Rates
Overall Acceptance Rate: 97 of 250 submissions, 39%