CLOPE: a fast and effective clustering algorithm for transactional data
Source
International Conference on Knowledge Discovery and Data Mining
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
Edmonton, Alberta, Canada
POSTER SESSION: Poster papers
Pages: 682 - 687
Year of Publication: 2002
ISBN: 1-58113-567-X
Authors
Yiling Yang, Shanghai Jiao Tong University, Shanghai, P.R. China
Xudong Guan, Shanghai Jiao Tong University, Shanghai, P.R. China
Jinyuan You, Shanghai Jiao Tong University, Shanghai, P.R. China
ABSTRACT
This paper studies the problem of clustering categorical data, especially transactional data characterized by high dimensionality and large volume. Starting from a heuristic method of increasing the height-to-width ratio of the cluster histogram, we develop a novel algorithm, CLOPE, which is very fast and scalable while being quite effective. We demonstrate the performance of our algorithm on two real-world datasets and compare CLOPE with state-of-the-art algorithms.
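The histogram heuristic mentioned in the abstract can be made concrete with a minimal Python sketch. The abstract does not define the histogram, so the sketch assumes that it counts how many transactions in a cluster contain each item, that its width is the number of distinct items, and that its height is the total number of item occurrences divided by the width. The function names and toy transactions are hypothetical, chosen only for illustration; this is not the full CLOPE algorithm.

```python
from collections import Counter


def cluster_histogram(cluster):
    """Count, for each item, how many transactions in the cluster contain it."""
    counts = Counter()
    for transaction in cluster:
        counts.update(set(transaction))  # each item counts once per transaction
    return counts


def height_to_width_ratio(cluster):
    """Height-to-width ratio of the cluster histogram (assumed definitions)."""
    hist = cluster_histogram(cluster)
    if not hist:
        return 0.0  # an empty cluster has no histogram
    size = sum(hist.values())   # total item occurrences (area of the histogram)
    width = len(hist)           # number of distinct items
    height = size / width       # average occurrences per distinct item
    return height / width


# Hypothetical toy clusters: one with heavily overlapping items, one mixed.
tight = [{"a", "b"}, {"a", "b", "c"}, {"a", "c", "d"}]
loose = [{"a", "c", "d"}, {"d", "e"}, {"d", "e", "f"}]

print(height_to_width_ratio(tight))
print(height_to_width_ratio(loose))
```

Under these assumptions, the cluster whose transactions share more items has the taller, narrower histogram and therefore the higher ratio (0.5 versus roughly 0.32 for the toy data), which is the direction the heuristic pushes toward.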