CLUC: a natural clustering algorithm for categorical datasets based on cohesion
Source: Symposium on Applied Computing
Proceedings of the 2006 ACM Symposium on Applied Computing
Dijon, France
SESSION: Poster papers
Pages: 637 - 638  
Year of Publication: 2006
ISBN: 1-59593-108-2
Authors
Aida Nemalhabib, Concordia University, Montreal, Quebec, Canada
Nematollaah Shiri, Concordia University, Montreal, Quebec, Canada
Sponsor
SIGAPP: ACM Special Interest Group on Applied Computing
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1141277.1141422

ABSTRACT

We propose a clustering algorithm for categorical datasets, called CLUC (CLUstering with Cohesion), which uses a novel similarity measure, called cohesion, to determine the degree to which items/objects stick to clusters. We have implemented CLUC and carried out extensive experiments on real-life and synthetic datasets. The experimental results and their analysis indicate that CLUC generates high-quality clusters, in that they conform to expert opinion. Our experiments on large synthetic data confirm that CLUC scales as the dataset grows in the number of objects and/or dimensions. We also repeated the experiments with different orders of the items in the datasets. The results show that the proposed algorithm is order-insensitive.
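
The order-insensitivity check described in the abstract can be illustrated with a minimal sketch. It is not CLUC itself, since the abstract does not define the cohesion measure. The helper names (`partition_of`, `is_order_insensitive`) and the two stand-in clusterers are hypothetical, introduced only for illustration. The idea is to cluster the same objects under different input orders and compare the resulting partitions while ignoring cluster labels. The sketch assumes objects are distinct and hashable, and it enumerates permutations, which is only practical for tiny inputs; on larger data one would presumably sample random shuffles instead.

```python
from collections import defaultdict
from itertools import islice, permutations


def partition_of(objects, labels):
    """Return a clustering as a set of frozensets, ignoring label names."""
    groups = defaultdict(set)
    for obj, label in zip(objects, labels):
        groups[label].add(obj)
    return {frozenset(group) for group in groups.values()}


def is_order_insensitive(cluster_fn, objects, max_orders=1000):
    """Check that cluster_fn yields the same partition for up to
    max_orders permutations of the input (hypothetical helper)."""
    baseline = partition_of(objects, cluster_fn(objects))
    for order in islice(permutations(objects), max_orders):
        order = list(order)
        if partition_of(order, cluster_fn(order)) != baseline:
            return False
    return True


# Stand-in clusterers for the demonstration only (NOT the CLUC algorithm).
def by_first_attribute(objects):
    # Label depends only on the object itself, so input order cannot matter.
    return [obj[0] for obj in objects]


def by_arrival_position(objects):
    # Deliberately order-sensitive: first half vs. second half of the input.
    half = len(objects) // 2
    return [0 if i < half else 1 for i in range(len(objects))]


data = [("red", "small"), ("red", "large"),
        ("blue", "small"), ("blue", "large")]

print(is_order_insensitive(by_first_attribute, data))   # True
print(is_order_insensitive(by_arrival_position, data))  # False
```

Comparing partitions as sets of frozensets means two runs count as equal even if the clusters come out with different labels or in a different order, which is the property an order-sensitivity experiment needs.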


