DOI: 10.1145/3109761.3158401
research-article

Semantic search extension based on Polish WordNet relations in business document exploration

Published: 17 October 2017

ABSTRACT

This paper addresses the problem of building a specialized semantic search engine for documents collected in small or medium-sized enterprises. It presents the results of a project in which computer scientists and entrepreneurs worked together to reach a common perspective on putting into company practice a search engine based on the semantic relations of the Polish version of WordNet. The core functionality of the search engine module is described, along with a discussion of how to arrange semantic similarity structures so that relevant search results are generated efficiently. Patterns and similarity coefficients for the hyperonymy, hyponymy, holonymy and meronymy relations are presented and analyzed as a basis for building these relationship structures. Finally, the architecture of a system that can be implemented in a company is outlined.
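
As a rough illustration of how per-relation similarity coefficients might feed a search extension, the following Python sketch expands query terms through a toy relation table and weights each added term by its relation type. Everything in it is an assumption made for illustration: the coefficient values, the `TOY_RELATIONS` table, and the function names `expand_query` and `score` are hypothetical and are not taken from the paper, whose actual coefficients and data structures are not given in the abstract.

```python
from collections import defaultdict

# Hypothetical per-relation similarity coefficients (illustrative values only;
# the abstract does not state the coefficients used in the paper).
RELATION_WEIGHTS = {
    "hyperonymy": 0.7,
    "hyponymy": 0.8,
    "holonymy": 0.5,
    "meronymy": 0.6,
}

# Toy stand-in for wordnet relations: term -> list of (relation, related term).
# A real system would read these from a wordnet database instead.
TOY_RELATIONS = {
    "invoice": [("hyperonymy", "document"), ("meronymy", "header")],
    "car": [("hyperonymy", "vehicle"), ("meronymy", "engine"), ("hyponymy", "taxi")],
}


def expand_query(terms, relations=TOY_RELATIONS, weights=RELATION_WEIGHTS):
    """Map each query term to weight 1.0 and each related term to the
    coefficient of the relation that links it to a query term."""
    expanded = defaultdict(float)
    for term in terms:
        expanded[term] = max(expanded[term], 1.0)
        for relation, related in relations.get(term, []):
            # Keep the strongest weight if a term is reachable more than once.
            expanded[related] = max(expanded[related], weights[relation])
    return dict(expanded)


def score(document_tokens, weighted_terms):
    """Sum the weights of the expanded query terms that occur in the document."""
    tokens = set(document_tokens)
    return sum(w for term, w in weighted_terms.items() if term in tokens)
```

With this sketch, a query for "invoice" would also match a document that mentions only "document" or "header", but with a lower score than a document containing the query term itself. Taking the maximum weight per term is one simple design choice among several; summing or multiplying coefficients along longer relation paths would be equally plausible.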

Published in

IML '17: Proceedings of the 1st International Conference on Internet of Things and Machine Learning
October 2017
581 pages
ISBN: 9781450352437
DOI: 10.1145/3109761

Copyright © 2017 ACM


Publisher

Association for Computing Machinery, New York, NY, United States
