DOI: 10.1145/1321440.1321450

Randomized metric induction and evolutionary conceptual clustering for semantic knowledge bases

Published: 06 November 2007

Abstract

We present an evolutionary clustering method for multi-relational knowledge bases that store semantic resource annotations expressed in the standard languages of the Semantic Web. The method exploits an effective, language-independent semi-distance measure defined over the space of individual resources. The measure is based on a finite number of dimensions, each corresponding to one feature in a committee of discriminating concept descriptions. We show how to obtain a maximally discriminating committee of features through a feature construction method based on genetic programming. The clustering algorithm represents candidate clusterings as variable-length strings of central elements (medoids with respect to the given metric). The number of clusters is therefore not required as a parameter, since the method can optimize it by means of the mutation operators and a suitable fitness function. We also show how to assign each cluster a newly constructed intensional definition in the employed concept language. Experiments on several ontologies demonstrate the feasibility of the method and its effectiveness in terms of clustering validity indices.
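The abstract does not give the formula for the semi-distance, so the following is only a minimal sketch under stated assumptions. It assumes each feature projects an individual to 1.0 (known instance), 0.0 (known non-instance) or 0.5 (membership unknown), and that the distance is a Minkowski-style aggregate of the per-feature differences normalized by the committee size. The names `make_projection`, `semi_distance` and `assign_to_medoids`, the toy membership table, and the exact normalization are all hypothetical and not taken from the paper. In a real setting the projections would be answered by a reasoner over the knowledge base rather than by a lookup table.

```python
from typing import Callable, Dict, List, Sequence

# Assumed encoding of a feature: individual -> 1.0 / 0.0 / 0.5 (unknown).
Projection = Callable[[str], float]


def make_projection(table: Dict[str, float]) -> Projection:
    """Wrap a lookup table as a feature projection (stand-in for a reasoner call)."""
    return lambda individual: table[individual]


def semi_distance(a: str, b: str, features: Sequence[Projection], p: int = 2) -> float:
    """Assumed form: (sum_i |f_i(a) - f_i(b)|^p)^(1/p) / number of features."""
    total = sum(abs(f(a) - f(b)) ** p for f in features)
    return (total ** (1.0 / p)) / len(features)


def assign_to_medoids(
    individuals: Sequence[str],
    medoids: Sequence[str],
    features: Sequence[Projection],
    p: int = 2,
) -> Dict[str, List[str]]:
    """Decode a candidate clustering: each individual joins its nearest medoid.

    The medoid list may have any length, so the number of clusters is
    determined by the candidate itself rather than fixed in advance.
    """
    clusters: Dict[str, List[str]] = {m: [] for m in medoids}
    for x in individuals:
        nearest = min(medoids, key=lambda m: semi_distance(x, m, features, p))
        clusters[nearest].append(x)
    return clusters


# Toy data (hypothetical): two concept features over four individuals.
TABLE = {
    "Person": {"anna": 1.0, "bob": 1.0, "rex": 0.0, "tom": 0.5},
    "Pet": {"anna": 0.0, "bob": 0.0, "rex": 1.0, "tom": 1.0},
}
FEATURES = [make_projection(t) for t in TABLE.values()]
INDIVIDUALS = ["anna", "bob", "rex", "tom"]

if __name__ == "__main__":
    print(semi_distance("anna", "bob", FEATURES, p=1))
    print(assign_to_medoids(INDIVIDUALS, ["anna", "rex"], FEATURES, p=1))
```

In this toy example "anna" and "bob" are distinct individuals at distance zero, because the committee cannot tell them apart. That is what makes the measure a semi-distance rather than a distance, and it suggests why a more discriminating committee of features is worth searching for.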


Published In

CIKM '07: Proceedings of the sixteenth ACM Conference on Information and Knowledge Management
November 2007
1048 pages
ISBN:9781595938039
DOI:10.1145/1321440
Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. conceptual clustering
  2. description logics
  3. evolutionary algorithms
  4. genetic programming
  5. metric learning
  6. randomized optimization
  7. unsupervised learning

Qualifiers

  • Research-article

Conference

CIKM07

Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

Cited By

  • (2020) Creative AI. Semantic Web 11:1 (69-78). DOI: 10.3233/SW-190377. Online publication date: 1-Jan-2020
  • (2013) Representing Uncertain Concepts in Rough Description Logics via Contextual Indiscernibility Relations. Uncertainty Reasoning for the Semantic Web II (300-314). DOI: 10.1007/978-3-642-35975-0_16. Online publication date: 2013
  • (2009) Metric-based stochastic conceptual clustering for ontologies. Information Systems 34:8 (792-806). DOI: 10.1016/j.is.2009.03.008. Online publication date: Dec-2009
  • (2009) Connectionist Models for Formal Knowledge Adaptation. Proceedings of the 19th International Conference on Artificial Neural Networks: Part II (465-474). DOI: 10.1007/978-3-642-04277-5_47. Online publication date: 2-Oct-2009
  • (2008) Conceptual clustering and its application to concept drift and novelty detection. Proceedings of the 5th European semantic web conference on The semantic web: research and applications (318-332). DOI: 10.5555/1789394.1789426. Online publication date: 1-Jun-2008
  • (2008) Evolutionary Clustering in Description Logics. Proceedings of the 19th international conference on Database and Expert Systems Applications (808-821). DOI: 10.1007/978-3-540-85654-2_73. Online publication date: 1-Sep-2008
  • (n.d.) Induction of Robust Classifiers for Web Ontologies Through Kernel Machines. SSRN Electronic Journal. DOI: 10.2139/ssrn.3198934
