DOI: 10.1145/1321440.1321450 (CIKM conference proceedings)
research-article

Randomized metric induction and evolutionary conceptual clustering for semantic knowledge bases

Published: 06 November 2007

ABSTRACT

We present an evolutionary clustering method that can be applied to multi-relational knowledge bases storing semantic resource annotations expressed in the standard languages for the Semantic Web. The method exploits an effective, language-independent semi-distance measure defined over the space of individual resources. The measure is based on a finite number of dimensions, each corresponding to one feature in a committee of discriminating features represented by concept descriptions. We show how to obtain a maximally discriminating group of features through a feature construction method based on genetic programming. The algorithm represents the possible clusterings as variable-length strings of central elements (medoids with respect to the given metric). Hence, the number of clusters is not needed as a parameter, since the method can optimize it by means of the mutation operators and a suitable fitness function. We also show how to assign each cluster a newly constructed intensional definition in the employed concept language. Experiments on several ontologies demonstrate the feasibility of the method and its effectiveness in terms of clustering validity indices.
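
As a rough illustration of the two ideas in the abstract (a feature-based semi-distance and a variable-length medoid string), the following Python sketch may help. It is not the authors' implementation. It assumes that each individual has already been projected onto the feature committee, with 1.0 meaning "known instance of the feature concept", 0.0 meaning "known non-instance", and 0.5 meaning "unknown"; this encoding, the Minkowski-style aggregation, and the `fitness` formula are all assumptions made for the example, and no reasoner is called.

```python
from typing import Dict, List, Sequence

# Hypothetical precomputed projections: individual name -> one value per feature.
# Assumed encoding: 1.0 = instance, 0.0 = non-instance, 0.5 = unknown.
Projections = Dict[str, Sequence[float]]


def semi_distance(a: str, b: str, proj: Projections, p: int = 2) -> float:
    """Minkowski-style dissimilarity over the feature committee (assumed form).

    It is only a semi-distance: two distinct individuals with identical
    projections are at distance 0.
    """
    va, vb = proj[a], proj[b]
    total = sum(abs(x - y) ** p for x, y in zip(va, vb))
    return (total ** (1.0 / p)) / len(va)


def assign_to_medoids(
    individuals: Sequence[str],
    medoids: Sequence[str],
    proj: Projections,
    p: int = 2,
) -> Dict[str, List[str]]:
    """Decode a medoid string (of any length) into clusters.

    Each individual joins the cluster of its nearest medoid; ties go to the
    medoid that appears first in the string.
    """
    clusters: Dict[str, List[str]] = {m: [] for m in medoids}
    for ind in individuals:
        nearest = min(medoids, key=lambda m: semi_distance(ind, m, proj, p))
        clusters[nearest].append(ind)
    return clusters


def fitness(
    individuals: Sequence[str],
    medoids: Sequence[str],
    proj: Projections,
    p: int = 2,
) -> float:
    """Hypothetical fitness: rewards compact clusters, penalizes long strings.

    The penalty on the number of medoids is what lets a search over
    variable-length strings settle on a cluster count by itself.
    """
    clusters = assign_to_medoids(individuals, medoids, proj, p)
    cost = sum(
        semi_distance(ind, m, proj, p)
        for m, members in clusters.items()
        for ind in members
    )
    return 1.0 / (1.0 + (len(medoids) ** 0.5) * cost)
```

An evolutionary loop would mutate the medoid list (adding, dropping, or swapping medoids) and keep the strings with the highest `fitness`; that loop and the construction of the feature committee are left out of this sketch.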

