ABSTRACT
Summarizing, extracting keywords from, sorting, and filtering large quantities of text are difficult and time-consuming tasks for a human operator, and they have been the subject of various algorithms based on lexical databases. Our proposed algorithm is based on the WordNet lexical database. It first eliminates connecting words. For the remaining words (nouns and verbs only), it builds a tree with several levels of more generic terms, such as the hypernym or the lexicographer file. Using custom weights for each tree level and statistical analysis, we extract a restricted number of words, which are used to define the keywords of a document. These keywords can then be used to sort and filter documents by relevance.
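The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the small STOPWORDS, HYPERNYM and LEXFILE tables are hypothetical stand-ins for WordNet lookups, the LEVEL_WEIGHTS values are assumed, and a plain weighted frequency sum replaces the statistical analysis, which the abstract does not specify.

```python
from collections import Counter

# Hypothetical stand-ins for WordNet lookups (illustrative data, not real WordNet output).
STOPWORDS = {"the", "a", "and", "of", "in", "on", "is", "with"}
HYPERNYM = {"dog": "canine", "wolf": "canine", "cat": "feline"}
LEXFILE = {"dog": "noun.animal", "wolf": "noun.animal", "cat": "noun.animal"}

# Assumed weights per tree level: the word itself, its hypernym, its lexicographer file.
LEVEL_WEIGHTS = (1.0, 0.5, 0.25)


def extract_keywords(text, top_n=3):
    """Return the top_n terms ranked by weighted frequency over all tree levels."""
    scores = Counter()
    for token in text.lower().split():
        word = token.strip(".,;:!?")
        # Drop connecting words and anything the toy lexicon does not cover.
        if not word or word in STOPWORDS or word not in LEXFILE:
            continue
        # Build the small "tree" for this word: word -> hypernym -> lexicographer file.
        levels = (word, HYPERNYM.get(word), LEXFILE.get(word))
        for term, weight in zip(levels, LEVEL_WEIGHTS):
            if term:
                scores[term] += weight
    return [term for term, _ in scores.most_common(top_n)]


print(extract_keywords("The dog and the dog with a wolf.", top_n=2))
# prints ['dog', 'canine']
```

In this sketch the generic term "canine" outranks "wolf" because it accumulates weight from both "dog" and "wolf", which is the effect the per-level weights are meant to produce. A real implementation would replace the lookup tables with queries against WordNet.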