ABSTRACT
Clustering is a widely used unsupervised data mining technique. In density-based clustering, a cluster is defined as a connected dense component that grows in whatever direction the density leads. In this paper we present DBStrata, a software system that implements a density-based clustering architecture together with several extensions that improve clustering performance and efficiently identify outliers.
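To illustrate the "connected dense component" definition, the following is a minimal sketch of the classic DBSCAN procedure (Ester et al., 1996). It is not DBStrata's implementation and includes none of its extensions. The names `dbscan`, `region_query`, `eps`, and `min_pts` are assumptions chosen for this example, and the brute-force neighbor search stands in for whatever index a real system would use.

```python
from math import dist

NOISE = -1  # label for points that belong to no dense component (outliers)


def region_query(points, i, eps):
    """Return indices of all points within distance eps of points[i], including i."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already assigned to a cluster or marked as noise
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # not a core point; may become a border point later
            continue
        # points[i] is a core point: start a new cluster and grow it outward.
        cluster += 1
        labels[i] = cluster
        frontier = list(neighbors)
        while frontier:
            j = frontier.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                # points[j] is also a core point, so the cluster keeps growing.
                frontier.extend(j_neighbors)
    return labels


if __name__ == "__main__":
    sample = [(0, 0), (0, 1), (1, 0), (1, 1),
              (10, 10), (10, 11), (11, 10), (11, 11),
              (50, 50)]
    print(dbscan(sample, eps=1.5, min_pts=3))
```

In this sketch a cluster expands only through core points, which is what lets it follow the density in any direction, and any point never reached from a core point keeps the `NOISE` label and can be read as an outlier.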
Index Terms
- DBStrata: a system for density-based clustering and outlier detection based on stratification