ABSTRACT
In supervised kernel methods, it has been observed that the performance of the SVM classifier is poor in cases where the diagonal entries of the Gram matrix are large relative to the off-diagonal entries. This problem, referred to as diagonal dominance, often occurs when certain kernel functions are applied to sparse high-dimensional data, such as text corpora. In this paper we investigate the implications of diagonal dominance for unsupervised kernel methods, specifically in the task of document clustering. We propose a selection of strategies for addressing this issue, and evaluate their effectiveness in producing more accurate and stable clusterings.
- Cancedda, N., Gaussier, E., Goutte, C., & Renders, J. M. (2003). Word sequence kernels. J. Mach. Learn. Res., 3, 1059--1082. Google ScholarDigital Library
- Dhillon, I., Guan, Y., & Kulis, B. (2004). A unified view of kernel k-means, spectral clustering and graph cuts (Technical Report TR-04-25). UTCS.Google Scholar
- Dhillon, I. S., Guan, Y., & Kogan, J. (2002). Iterative clustering of high dimensional text data augmented by local search. Proceedings of The 2002 IEEE International Conference on Data Mining. Maebashi City, Japan. Google ScholarDigital Library
- Greene, D., & Cunningham, P. (2005). Producing accurate interpretable clusters from high-dimensional data (Technical Report TCD-CS-2005-42). Department of Computer Science, Trinity College Dublin.Google Scholar
- Roth, V., Braun, M., Lange, T., & Buhmann, J. (2002). A resampling approach to cluster validation. Proceedings of the 15th Symposium in Computational Statistics.Google ScholarCross Ref
- Saigo, H., Vert, J.-P., Ueda, N., & Akutsu, T. (2004). Protein homology detection using string alignment kernels. Bioinformatics, 20, 1682--1689. Google ScholarDigital Library
- Schölkopf, B., Smola, A., & Müüller, K.-R. (1998). Nonlinear component analysis as a kernel eigenvalue problem. Neural Comput., 10, 1299--1319. Google ScholarDigital Library
- Schölkopf, B., & Smola, A. J. (2001). Learning with kernels: Support vector machines, regularization, optimization, and beyond. Cambridge, MA, USA: MIT Press. Google ScholarDigital Library
- Schölkopf, B., Weston, J., Eskin, E., Leslie, C., & Noble, W. S. (2002). A kernel approach for learning from almost orthogonal patterns. ECML '02: Proceedings of the 13th European Conference on Machine Learning (pp. 511--528). London, UK: Springer-Verlag. Google ScholarDigital Library
- Smola, A. J., & Bartlett, P. J. (Eds.). (2000). Advances in large margin classifiers. Cambridge, MA, USA: MIT Press. Google ScholarDigital Library
- Strehl, A., & Ghosh, J. (2002). Cluster ensembles - a knowledge reuse framework for combining multiple partitions. JMLR, 3, 583--617. Google ScholarDigital Library
- Tao, Q., Scott, S., Vinodchandran, N. V., Osugi, T. T., & Mueller, B. (2004). An extended kernel for generalized multiple-instance learning. 16th IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2004) (pp. 272--277). Google ScholarDigital Library
- Wu, G., Chang, E., & Zhang, Z. (2005). An analysis of transformation on non-positive semidefinite similarity matrix for kernel machines (Technical Report). UCSB.Google Scholar
Index Terms
- Practical solutions to the problem of diagonal dominance in kernel document clustering
Recommendations
Multilevel ILU With Reorderings for Diagonal Dominance
This paper presents a preconditioning method based on combining two-sided permutations with a multilevel approach. The nonsymmetric permutation exploits a greedy strategy to put large entries of the matrix in the diagonal of the upper leading submatrix. ...
Diagonal dominance, Schur complements and some classes of H-matrices and P-matrices
We analyze a class of matrices generalizing strictly diagonally dominant matrices and included in the important class of H-matrices. Adequate pivoting strategies and the corresponding Schur complements are studied. A new class of matrices with all their ...
Comments