DOI: 10.1145/2797143.2797167
research-article

Exploring Weights of Hierarchical and Equivalency Relationship in General Persian Texts

Published: 25 September 2015

ABSTRACT

A thesaurus is a reference work that lists words grouped by similarity of meaning (synonyms and sometimes antonyms), in contrast to a dictionary, which gives definitions and pronunciations. A thesaurus uses three kinds of relationships: (1) equivalency, (2) hierarchy, and (3) association. This paper proposes a novel thesaurus-based method for classifying general Persian texts. The thesaurus we use employs two kinds of word relationships: (1) equivalency and (2) hierarchy. Each kind has a weight that can be tuned, and the paper explores the possible weights to find suitable ones. A feature selection mechanism is then applied, and a host of machine learning algorithms are used as classifiers over the frequency-based features. Experimental results indicate that using the best weights for these relationships can lead to good results.
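
The abstract does not give the exact formulation, so the following is only a minimal sketch of the general idea, under stated assumptions: each token contributes a count of 1 to its own feature, plus a tunable weight to its equivalent terms and to its broader (hierarchical) terms, and candidate weight pairs are enumerated on a grid. The toy thesaurus entries, the function names `weighted_features` and `weight_grid`, and the grid step are all hypothetical and are not taken from the paper.

```python
from collections import Counter

# Hypothetical toy thesaurus (English stand-ins for illustration only).
# EQUIVALENT maps a term to its equivalent terms; BROADER maps it to parent terms.
EQUIVALENT = {"car": ["automobile"], "automobile": ["car"]}
BROADER = {"car": ["vehicle"], "automobile": ["vehicle"]}


def weighted_features(tokens, w_eq, w_hier):
    """Build frequency features, spreading weighted counts to related terms.

    Assumption: a token adds 1 to itself, w_eq to each equivalent term,
    and w_hier to each broader term. The paper may use a different scheme.
    """
    features = Counter()
    for tok in tokens:
        features[tok] += 1.0
        for syn in EQUIVALENT.get(tok, []):
            features[syn] += w_eq
        for parent in BROADER.get(tok, []):
            features[parent] += w_hier
    return dict(features)


def weight_grid(step=0.5):
    """Enumerate (w_eq, w_hier) pairs in [0, 1] x [0, 1] at the given step."""
    n = int(round(1 / step))
    values = [i * step for i in range(n + 1)]
    return [(w_eq, w_hier) for w_eq in values for w_hier in values]


if __name__ == "__main__":
    doc = ["car", "car", "bus"]
    for w_eq, w_hier in weight_grid(0.5):
        print(w_eq, w_hier, weighted_features(doc, w_eq, w_hier))
```

In a full pipeline, each weight pair would yield one feature matrix, which would then go through feature selection and a classifier; the pair with the best validation score would be kept.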

Published in

EANN '15: Proceedings of the 16th International Conference on Engineering Applications of Neural Networks (INNS)
ACM Other conferences
September 2015
266 pages
ISBN: 9781450335805
DOI: 10.1145/2797143

Copyright © 2015 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher: Association for Computing Machinery, New York, NY, United States


Qualifiers

• research-article
• Research
• Refereed limited

Acceptance Rates

EANN '15 Paper Acceptance Rate: 36 of 60 submissions, 60%
Overall Acceptance Rate: 36 of 60 submissions, 60%