DOI: 10.1145/3136273.3136280
short-paper

Extracting text keywords using WordNet

Published: 20 September 2017

ABSTRACT

Summarizing, extracting keywords from, sorting, and filtering large quantities of text have been the subject of various algorithms based on lexical databases, because these tasks are difficult and time-consuming for a human operator. Our proposed algorithm is based on the WordNet lexical database. It first eliminates connecting words. For the remaining words (only nouns or verbs), we build a tree with several levels of more generic terms, such as the hypernym or the lexicographer file. Using custom weights for each tree level and statistical analysis, we extract a restricted number of words, which are used to define the keywords of a document. These results can be used to sort and filter documents based on relevance.
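The steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives neither the weights nor the statistical analysis, so the tiny `STOP_WORDS`, `HYPERNYM`, and `LEXFILE` tables, the `LEVEL_WEIGHTS` values, and the function names below are all hypothetical stand-ins for real WordNet lookups. The sketch only shows the general idea of crediting each word's more generic terms with a per-level weight and keeping the highest-scoring terms.

```python
from collections import Counter

# Hypothetical toy data standing in for WordNet lookups (not real WordNet output).
STOP_WORDS = {"the", "a", "and", "in", "of", "on", "to", "is"}
HYPERNYM = {"dog": "canine", "wolf": "canine", "cat": "feline"}
LEXFILE = {"dog": "noun.animal", "wolf": "noun.animal", "cat": "noun.animal"}

# Assumed weights per tree level: the word itself, its hypernym, its lexicographer file.
LEVEL_WEIGHTS = [1.0, 0.6, 0.3]


def generalization_levels(word):
    """Return the word followed by its more generic terms, one per tree level."""
    return [word, HYPERNYM[word], LEXFILE[word]]


def extract_keywords(text, top_k=3):
    """Score each term by summed level weights and return the top_k terms."""
    tokens = [t.strip(".,;:!?").lower() for t in text.split()]
    # Drop connecting words and anything the toy lexicon does not know.
    words = [t for t in tokens if t not in STOP_WORDS and t in HYPERNYM]
    scores = Counter()
    for word in words:
        for level, term in enumerate(generalization_levels(word)):
            scores[term] += LEVEL_WEIGHTS[level]
    return [term for term, _ in scores.most_common(top_k)]


if __name__ == "__main__":
    print(extract_keywords("The dog and the wolf chased a cat in the yard."))
```

In this toy input the shared hypernym of "dog" and "wolf" accumulates weight from both words, which is the kind of effect that lets a more generic term surface as a keyword even when it never appears in the text. A real version would replace the tables with WordNet queries and would need a part-of-speech filter for nouns and verbs.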


    Published in

      ACM Other conferences
      BCI '17: Proceedings of the 8th Balkan Conference in Informatics
      September 2017
      181 pages
      ISBN:9781450352857
      DOI:10.1145/3136273

      Copyright © 2017 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Qualifiers

      • short-paper
      • Research
      • Refereed limited

      Acceptance Rates

      Overall Acceptance Rate: 97 of 250 submissions, 39%
