SyMP: an efficient clustering approach to identify clusters of arbitrary shapes in large data sets
Source: International Conference on Knowledge Discovery and Data Mining
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
Edmonton, Alberta, Canada
POSTER SESSION: Poster papers
Pages: 507 - 512  
Year of Publication: 2002
ISBN:1-58113-567-X
Author
Hichem Frigui  University of Memphis
Sponsors
SIGKDD: ACM Special Interest Group on Knowledge Discovery in Data
SIGMOD: ACM Special Interest Group on Management of Data
AAAI
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/775047.775121

ABSTRACT

We propose a new clustering algorithm, called SyMP, which is based on synchronization of pulse-coupled oscillators. SyMP represents each data point by an Integrate-and-Fire oscillator and uses the relative similarity between the points to model the interaction between the oscillators. SyMP is robust to noise and outliers, determines the number of clusters in an unsupervised manner, identifies clusters of arbitrary shapes, and can handle very large data sets. The robustness of SyMP is an intrinsic property of the synchronization mechanism. To determine the optimum number of clusters, SyMP uses a dynamic resolution parameter. To identify clusters of various shapes, SyMP models each cluster by multiple Gaussian components. The number of components is automatically determined using a dynamic intra-cluster resolution parameter. Clusters with simple shapes would be modeled by few components while clusters with more complex shapes would require a larger number of components. The scalable version of SyMP uses an efficient incremental approach that requires a simple pass through the data set. The proposed clustering approach is empirically evaluated with several synthetic and real data sets, and its performance is compared with CURE.
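The synchronization idea in the abstract can be illustrated with a small toy model. The sketch below is not SyMP and is not taken from the paper: it only shows how integrate-and-fire oscillators, one per data point, can fall into step when nearby points excite each other. Everything in it is an assumption made for illustration, including the function name sync_clusters, the parameters radius, eps, b and events, the concave charging curve, and the use of a hard distance cutoff in place of the relative similarity the abstract mentions. It has no resolution parameter, no Gaussian components, and no incremental pass.

```python
import math
import random


def sync_clusters(points, radius, eps=0.3, b=3.0, events=2000, seed=0):
    """Toy sketch (not SyMP itself): group points whose oscillators synchronize.

    All names and parameters here are illustrative assumptions.
    """
    n = len(points)
    rng = random.Random(seed)
    phase = [rng.random() for _ in range(n)]  # one oscillator per data point
    parent = list(range(n))                   # union-find over absorbed groups

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def pulse(phi):
        # Assumed concave charging curve: state x = ln(1 + (e^b - 1) * phi) / b.
        # A pulse raises the state by eps; the result is mapped back to a phase.
        x = math.log(1.0 + (math.exp(b) - 1.0) * phi) / b + eps
        if x >= 1.0:
            return 1.0
        return (math.exp(b * x) - 1.0) / (math.exp(b) - 1.0)

    for _ in range(events):
        # Integrate: advance all phases equally until the leader hits threshold.
        top = max(phase)
        step = 1.0 - top
        phase = [1.0 if p == top else p + step for p in phase]
        firing = [i for i, p in enumerate(phase) if p >= 1.0]

        # Fire: each firing oscillator excites the points within `radius`.
        for j in range(n):
            if phase[j] >= 1.0:
                continue
            for i in firing:
                if math.dist(points[i], points[j]) <= radius:
                    phase[j] = pulse(phase[j])
                    if phase[j] >= 1.0:
                        # Pulled over threshold: j joins i's group.
                        parent[find(j)] = find(i)
                        break

        # Reset every oscillator that fired or was absorbed.
        phase = [0.0 if p >= 1.0 else p for p in phase]

    return [find(i) for i in range(n)]
```

Calling sync_clusters on a list of coordinate tuples returns one group label per point. In this toy, a point with no neighbor inside radius is never pulled over the threshold by another oscillator, so it keeps its own label, which loosely mirrors the robustness to outliers that the abstract attributes to the synchronization mechanism.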




