Deriving quantitative models for correlation clusters
Source: Conference on Knowledge Discovery in Data
Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Philadelphia, PA, USA
SESSION: Research track papers
Pages: 4-13
Year of Publication: 2006
ISBN: 1-59593-339-5
Authors
Elke Achtert  Ludwig-Maximilians-Universität München, Munich, Germany
Christian Böhm  Ludwig-Maximilians-Universität München, Munich, Germany
Hans-Peter Kriegel  Ludwig-Maximilians-Universität München, Munich, Germany
Peer Kröger  Ludwig-Maximilians-Universität München, Munich, Germany
Arthur Zimek  Ludwig-Maximilians-Universität München, Munich, Germany
Sponsors
ACM: Association for Computing Machinery
SIGKDD: ACM Special Interest Group on Knowledge Discovery in Data
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1150402.1150408

ABSTRACT

Correlation clustering aims at grouping the data set into correlation clusters such that the objects in the same cluster exhibit a certain density and are all associated with a common, arbitrarily oriented hyperplane of arbitrary dimensionality. Several algorithms for this task have been proposed recently. However, all of these algorithms compute only the partitioning of the data into clusters. This is merely the first step in the pipeline of advanced data analysis and system modelling. The second (post-clustering) step of deriving a quantitative model for each correlation cluster has not been addressed so far. In this paper, we describe an original approach to this second step. We introduce a general method that extracts quantitative information on the linear dependencies within a correlation clustering. Our concepts are independent of the clustering model and can thus be applied as a post-processing step to any correlation clustering algorithm. Furthermore, we show how these quantitative models can be used to predict the probability that an object was created by each of these models. Our broad experimental evaluation demonstrates the beneficial impact of our method on several applications of significant practical importance.
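
To make the idea of a quantitative model for one correlation cluster concrete, the following is a minimal sketch and not the authors' published procedure. It assumes that a cluster's hyperplane can be estimated by an eigendecomposition of the cluster's covariance matrix, treating low-variance directions as linear dependencies. The function names, the `variance_threshold` and `sigma` parameters, and the Gaussian scoring of residuals are all illustrative assumptions that do not come from this record.

```python
import numpy as np


def derive_linear_model(points, variance_threshold=0.05):
    """Hypothetical helper: fit a hyperplane model to one cluster.

    Returns (A, b) with A @ x ~= b for the cluster's points,
    one row of A per approximate linear dependency.
    """
    X = np.asarray(points, dtype=float)
    mean = X.mean(axis=0)
    cov = np.cov(X - mean, rowvar=False)
    # eigh handles symmetric matrices; eigenvalues come back ascending.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Assumption: a direction is a dependency if it carries at most
    # `variance_threshold` of the total variance.
    weak = eigvals <= variance_threshold * eigvals.sum()
    A = eigvecs[:, weak].T   # orthonormal normal vectors of the hyperplane
    b = A @ mean             # the hyperplane passes through the cluster mean
    return A, b


def residual_distance(A, b, x):
    """Euclidean distance from x to the hyperplane {y : A @ y = b}."""
    return float(np.linalg.norm(A @ np.asarray(x, dtype=float) - b))


def membership_probabilities(models, x, sigma=1.0):
    """Assumed scoring: Gaussian kernel on residuals, normalized over models."""
    d = np.array([residual_distance(A, b, x) for A, b in models])
    logits = -0.5 * (d / sigma) ** 2
    logits -= logits.max()   # avoid underflow before exponentiating
    w = np.exp(logits)
    return w / w.sum()


# Synthetic data: two clusters lying near the lines y = 2x + 1 and y = -x + 5.
rng = np.random.default_rng(0)
x1 = rng.uniform(0.0, 10.0, size=200)
cluster1 = np.column_stack([x1, 2.0 * x1 + 1.0 + rng.normal(0.0, 0.01, size=200)])
x2 = rng.uniform(0.0, 10.0, size=200)
cluster2 = np.column_stack([x2, -x2 + 5.0 + rng.normal(0.0, 0.01, size=200)])

A, b = derive_linear_model(cluster1)
# Rescale the single equation a0*x + a1*y = b into y = slope*x + intercept.
slope = -A[0, 0] / A[0, 1]
intercept = b[0] / A[0, 1]
print(f"cluster 1: y = {slope:.2f} * x + {intercept:.2f}")

models = [(A, b), derive_linear_model(cluster2)]
probs = membership_probabilities(models, [3.0, 7.0])  # (3, 7) lies on y = 2x + 1
print("membership:", probs)
```

On this synthetic data the recovered equation should be close to y = 2x + 1, and the query point should be assigned almost entirely to the first model. Because the sketch only needs the points of each cluster, it can be run on the output of any clustering algorithm, which mirrors the post-processing role described in the abstract.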



