|
ABSTRACT
We propose a new clustering algorithm, called SyMP, which is based on synchronization of pulse-coupled oscillators. SyMP represents each data point by an Integrate-and-Fire oscillator and uses the relative similarity between the points to model the interaction between the oscillators. SyMP is robust to noise and outliers, determines the number of clusters in an unsupervised manner, identifies clusters of arbitrary shapes, and can handle very large data sets. The robustness of SyMP is an intrinsic property of the synchronization mechanism. To determine the optimum number of clusters, SyMP uses a dynamic resolution parameter. To identify clusters of various shapes, SyMP models each cluster by multiple Gaussian components. The number of components is automatically determined using a dynamic intra-cluster resolution parameter. Clusters with simple shapes would be modeled by few components while clusters with more complex shapes would require a larger number of components. The scalable version of SyMP uses an efficient incremental approach that requires a simple pass through the data set. The proposed clustering approach is empirically evaluated with several synthetic and real data sets, and its performance is compared with CURE.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
 |
1
|
Rakesh Agrawal , Johannes Gehrke , Dimitrios Gunopulos , Prabhakar Raghavan, Automatic subspace clustering of high dimensional data for data mining applications, Proceedings of the 1998 ACM SIGMOD international conference on Management of data, p.94-105, June 01-04, 1998, Seattle, Washington, United States
|
| |
2
|
P. Bradley, U. Fayyad, and C. Reina. Scaling clustering algorithms to large databases. In Proc. of the ACM SIGKDD, 1998.
|
| |
3
|
P. Bradley, U. Fayyad, and C. Reina. Scaling EM clustering to large databases. Technical Report MSR-TR-98-35, Microsoft Research, 1998.
|
| |
4
|
A. P. Dempster, N. M. Laird, and D. B. Rubin. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society Series B, 39(1):1--38, 1977.
|
| |
5
|
M. Ester, H. P. Kriegel, J. Sander, and X. Xu. A density-based algorithm for discovering clusters in large spatial databases with noise. In Proc. of the ACM SIGKDD, 1996.
|
 |
6
|
|
| |
7
|
|
 |
8
|
Sudipto Guha , Rajeev Rastogi , Kyuseok Shim, CURE: an efficient clustering algorithm for large databases, Proceedings of the 1998 ACM SIGMOD international conference on Management of data, p.73-84, June 01-04, 1998, Seattle, Washington, United States
|
 |
9
|
|
| |
10
|
A. Hinneburg and D. Keim. An efficient approach to clustering in large multimedia databases with noise. In Proc. of the ACM SIGKDD, 1998.
|
| |
11
|
L. Kaufman and P. Rousseeuw. Finding Groups in Data. John Wiley and Sons, 1989.
|
| |
12
|
L. Kaufman and P. J. Rousseeuw. Finding Groups in Data: An Introduction to Cluster Analysis. Addison Wesley, NEW York, 1990.
|
| |
13
|
R. Krishnapuram, H. Frigui, and O. Nasraoui. Fuzzy and possibilistic shell clustering algorithms and their application to boundary detection and surface approximation I. IEEE Trans. FS, 3(1):29--43, 1995.
|
| |
14
|
|
| |
15
|
|
| |
16
|
|
| |
17
|
R.O.Duda and P. E. Hart. Pattern Classification and Scene Analysis. John Wiley and Sons, 1973.
|
| |
18
|
|
| |
19
|
|
| |
20
|
|
 |
21
|
Tian Zhang , Raghu Ramakrishnan , Miron Livny, BIRCH: an efficient data clustering method for very large databases, Proceedings of the 1996 ACM SIGMOD international conference on Management of data, p.103-114, June 04-06, 1996, Montreal, Quebec, Canada
|
Peer to Peer - Readers of this Article have also read:
-
Data structures for quadtree approximation and compression
Communications of the ACM
28, 9
Hanan Samet
-
A hierarchical single-key-lock access control using the Chinese remainder theorem
Proceedings of the 1992 ACM/SIGAPP Symposium on Applied computing
Kim S. Lee
, Huizhu Lu
, D. D. Fisher
-
The GemStone object database management system
Communications of the ACM
34, 10
Paul Butterworth
, Allen Otis
, Jacob Stein
-
Putting innovation to work: adoption strategies for multimedia communication systems
Communications of the ACM
34, 12
Ellen Francik
, Susan Ehrlich Rudman
, Donna Cooper
, Stephen Levine
-
An intelligent component database for behavioral synthesis
Proceedings of the 27th ACM/IEEE conference on Design automation
Gwo-Dong Chen
, Daniel D. Gajski
|