skip to main content
10.1145/3097983.3098156acmconferencesArticle/Chapter ViewAbstractPublication PageskddConference Proceedingsconference-collections
research-article

Robust Spectral Clustering for Noisy Data: Modeling Sparse Corruptions Improves Latent Embeddings

Published:13 August 2017Publication History

ABSTRACT

Spectral clustering is one of the most prominent clustering approaches. However, it is highly sensitive to noisy input data. In this work, we propose a robust spectral clustering technique able to handle such scenarios. To achieve this goal, we propose a sparse and latent decomposition of the similarity graph used in spectral clustering. In our model, we jointly learn the spectral embedding as well as the corrupted data - thus, enhancing the clustering performance overall. We propose algorithmic solutions to all three established variants of spectral clustering, each showing linear complexity in the number of edges. Our experimental analysis confirms the significant potential of our approach for robust spectral clustering. Supplementary material is available at www.kdd.in.tum.de/RSC.

References

  1. C. C. Aggarwal and C. K. Reddy. Data clustering: algorithms and applications. CRC Press, 2013.Google ScholarGoogle ScholarCross RefCross Ref
  2. E. J. Candès, X. Li, Y. Ma, and J. Wright. Robust principal component analysis? JACM, 58(3):11, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. H. Chang and D.-Y. Yeung. Robust path-based spectral clustering. Pattern Recognition, 41(1):191--203, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. A. Condon and R. M. Karp. Algorithms for graph partitioning on the planted partition model. Random Structures and Algorithms, 18(2):116--140, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. C. Davis and W. M. Kahan. The rotation of eigenvectors by a perturbation. iii. SIAM SINUM, 7(1):1--46, 1970. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. N. Günnemann, S. Günnemann, and C. Faloutsos. Robust multivariate autoregression for anomaly detection in dynamic product ratings. In WWW, pages 361--372, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. S. Günnemann, I. Farber, S. Raubach, and T. Seidl. Spectral subspace clustering for graphs with feature vectors. In ICDM, pages 231--240, 2013. Google ScholarGoogle ScholarCross RefCross Ref
  8. S. Günnemann, N. Günnemann, and C. Faloutsos. Detecting anomalies in dynamic rating data. In KDD, pages 841--850, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. H. Huang, S. Yoo, H. Qin, and D. Yu. A robust clustering algorithm based on aggregated heat kernel mapping. In ICDM, pages 270--279, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. F. Jordan and F. Bach. Learning spectral clustering. Adv. Neural Inf. Process. Syst, 16:305--312, 2004.Google ScholarGoogle Scholar
  11. W. Kong, S. Hu, J. Zhang, and G. Dai. Robust and smart spectral clustering from normalized cut. Neural Computing and Applications, 23(5):1503--1512, 2013. Google ScholarGoogle ScholarCross RefCross Ref
  12. D. Lehmann, L. I. Oćallaghan, and Y. Shoham. Truth revelation in approximately efficient combinatorial auctions. Journal of the ACM, 49(5):577--602, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. X. Li, W. Hu, C. Shen, A. Dick, and Z. Zhang. Context-aware hypergraph construction for robust spectral clustering. TKDE, 26(10):2588--2597, 2014. Google ScholarGoogle ScholarCross RefCross Ref
  14. Z. Li, J. Liu, S. Chen, and X. Tang. Noise robust spectral clustering. In ICCV, pages 1--8, 2007. Google ScholarGoogle ScholarCross RefCross Ref
  15. B. A. Miller, M. S. Beard, P. J. Wolfe, and N. T. Bliss. A spectral framework for anomalous subgraph detection. Trans. on Signal Proc., 63(16):4191--4206, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. J. Pfeiffer and F. Rothlauf. Analysis of greedy heuristics and weight-coded eas for multidimensional knapsack problems and multi-unit combinatorial auctions. In GECCO, pages 1529--1529, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. P. J. Rousseeuw. Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. Computational and applied mathematics, 20:53--65, 1987. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. P. J. Rousseeuw and A. M. Leroy. Robust regression and outlier detection, volume 589. John Wiley, 2005.Google ScholarGoogle Scholar
  19. G. Stewart and J.-G. Sun. Matrix Perturbation Theory. Academic Press Boston, 1990.Google ScholarGoogle Scholar
  20. U. von Luxburg. A tutorial on spectral clustering. Statistics and Computing, 17(4):395--416, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. X. Wang, F. Nie, and H. Huang. Structured doubly stochastic matrix for graph based clustering. In KDD, pages 1245--1254, 2016. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. L. Wu, X. Ying, X. Wu, and Z. Zhou. Line orthogonality in adjacency eigenspace with application to community partition. In IJCAI, pages 2349--2354, 2011.Google ScholarGoogle Scholar
  23. L. Zelnik-Manor and P. Perona. Self-tuning spectral clustering. In NIPS, pages 1601--1608, 2004.Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. X. Zhu, C. Loy, and S. Gong. Constructing robust affinity graphs for spectral clustering. In CVPR, pages 1450--1457, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Robust Spectral Clustering for Noisy Data: Modeling Sparse Corruptions Improves Latent Embeddings

          Recommendations

          Comments

          Login options

          Check if you have access through your login credentials or your institution to get full access on this article.

          Sign in
          • Published in

            cover image ACM Conferences
            KDD '17: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
            August 2017
            2240 pages
            ISBN:9781450348874
            DOI:10.1145/3097983

            Copyright © 2017 ACM

            Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

            Publisher

            Association for Computing Machinery

            New York, NY, United States

            Publication History

            • Published: 13 August 2017

            Permissions

            Request permissions about this article.

            Request Permissions

            Check for updates

            Qualifiers

            • research-article

            Acceptance Rates

            KDD '17 Paper Acceptance Rate64of748submissions,9%Overall Acceptance Rate1,133of8,635submissions,13%

            Upcoming Conference

            KDD '24

          PDF Format

          View or Download as a PDF file.

          PDF

          eReader

          View online with eReader.

          eReader