skip to main content
10.1145/3038912.3052653acmotherconferencesArticle/Chapter ViewAbstractPublication PagesthewebconfConference Proceedingsconference-collections
research-article
Public Access

Scalable Motif-aware Graph Clustering

Published: 03 April 2017 Publication History

Abstract

We develop new methods based on graph motifs for graph clustering, allowing more efficient detection of communities within networks. We focus on triangles within graphs, but our techniques extend to other clique motifs as well. Our intuition, which has been suggested but not formalized similarly in previous works, is that triangles are a better signature of community than edges. We therefore generalize the notion of conductance for a graph to triangle conductance, where the edges are weighted according to the number of triangles containing the edge. This methodology allows us to develop variations of several existing clustering techniques, including spectral clustering, that minimize triangles split by the cluster instead of edges cut by the cluster. We provide theoretical results in a planted partition model to demonstrate the potential for triangle conductance in clustering problems. We then show experimentally the effectiveness of our methods to multiple applications in machine learning and graph mining.

References

[1]
Mace. http://research.nii.ac.jp/~uno/codes.htm.
[2]
Stanford network analysis project. http://snap.stanford.edu/data/index.html.
[3]
B. Abrahao, S. Soundarajan, J. Hopcroft, and R. Kleinberg. On the separability of structural classes of communities. In Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining, 2012.
[4]
B. Adamcsek, G. Palla, I. J. Farkas, I. Derényi, and T. Vicsek. Cfinder: locating cliques and overlapping modules in biological networks. Bioinformatics, 22(8):1021--1023, 2006.
[5]
N. Alon, Z. Galil, and V. D. Milman. Better expanders and superconcentrators. Journal of Algorithms, 8(3):337--347, 1987.
[6]
N. Alon and V. D. Milman. λ<sub>1</sub>, isoperimetric inequalities for graphs, and superconcentrators. Journal of Combinatorial Theory, Series B, 38(1):73--88, 1985.
[7]
S. Arora, S. Rao, and U. Vazirani. Expander flows, geometric embeddings and graph partitioning. Journal of the ACM (JACM), 56(2):5, 2009.
[8]
L. Backstrom and J. Kleinberg. Network bucket testing. In Proceedings of the 20th international conference on World wide web, pages 615--624. ACM, 2011.
[9]
A. R. Benson, D. F. Gleich, and J. Leskovec. Tensor spectral clustering for partitioning higher-order network structures. In Proceedings of the 2015 SIAM International Conference on Data Mining, Vancouver, BC, 2015.
[10]
A. R. Benson, D. F. Gleich, and J. Leskovec. Higher-order organization of complex networks. Science, 353(6295):163--166, 2016.
[11]
V. D. Blondel, J.-L. Guillaume, R. Lambiotte, and E. Lefebvre. Fast unfolding of communities in large networks. Journal of statistical mechanics: theory and experiment, 2008(10):P10008, 2008.
[12]
A. Clauset, M. E. Newman, and C. Moore. Finding community structure in very large networks. Physical review E, 70(6):066111, 2004. Implementation available at https://www.cs.unm.edu/~aaron/research/fastmodularity.htm.
[13]
I. Derényi, G. Palla, and T. Vicsek. Clique percolation in random networks. Physical review letters, 94(16):160202, 2005.
[14]
S. v. Dongen. Graph clustering by flow simulation. 2000.
[15]
S. Fortunato. Community detection in graphs. Physics reports, 486(3):75--174, 2010.
[16]
D. Gavinsky, S. Lovett, M. Saks, and S. Srinivasan. A tail bound for read-k families of functions. Random Structures & Algorithms, 2014.
[17]
A. Gionis and C. E. Tsourakakis. Dense subgraph discovery: KDD 2015 tutorial. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2015.
[18]
M. Girvan and M. E. J. Newman. Community structure in social and biological networks. Proceedings of the National Academy of Sciences, 99(12):7821--7826, 2002.
[19]
S. Hoory, N. Linial, and A. Wigderson. Expander graphs and their applications. Bulletin of the American Mathematical Society, 43(4):439--561, 2006.
[20]
M. Kolountzakis, G. Miller, R. Peng, C. Tsourakakis. Efficient triangle counting in large graphs via degree-based vertex partitioning. Internet Mathematics 8:161--185, 2012.
[21]
C. Klymko, D. Gleich, and T. G. Kolda. Using triangles to improve community detection in directed networks. arXiv preprint arXiv:1404.5874, 2014.
[22]
T. Leighton and S. Rao. Multicommodity max-flow min-cut theorems and their use in designing approximation algorithms. Journal of the ACM (JACM), 46(6):787--832, 1999.
[23]
J. Leskovec, K. Lang, A. Dasgupta, and M. W. Mahoney. Statistical properties of community structure in large social and information networks. In Proceeding of the 17th international conference on World Wide Web, pages 695--704. ACM, 2008.
[24]
A. Louis. Hypergraph markov operators, eigenvalues and approximation algorithms. In Proceedings of the Forty-Seventh Annual ACM on Symposium on Theory of Computing, STOC '15, pages 713--722, New York, NY, USA, 2015. ACM.
[25]
R. Milo, S. Shen-Orr, S. Itzkovitz, N. Kashtan, D. Chklovskii, and U. Alon. Network motifs: simple building blocks of complex networks. Science, 298(5594):824--827, 2002.
[26]
N. Mishra, R. Schreiber, I. Stanton, and R. E. Tarjan. Finding strongly knit clusters in social networks. Internet Mathematics, 5(1--2):155--174, 2008.
[27]
M. Mitzenmacher, J. Pachocki, R. Peng, C. E. Tsourakakis, and S. C. Xu. Scalable large near-clique detection in large-scale networks via sampling. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 815--824. ACM, 2015.
[28]
M. E. Newman. Modularity and community structure in networks. Proceedings of the national academy of sciences, 103(23):8577--8582, 2006.
[29]
A. Y. Ng, M. I. Jordan, Y. Weiss, et~al. On spectral clustering: Analysis and an algorithm. Advances in neural information processing systems, 2:849--856, 2002.
[30]
R. Pagh and C. E. Tsourakakis. Colorful triangle counting and a mapreduce implementation. Information Processing Letters, 112(7):277--281, 2012.
[31]
M. Rosvall and C. T. Bergstrom. Maps of random walks on complex networks reveal community structure. Proceedings of the National Academy of Sciences, 105(4):1118--1123, 2008.
[32]
V. Satuluri, S. Parthasarathy, and Y. Ruan. Local graph sparsification for scalable clustering. In Proceedings of the 2011 ACM SIGMOD International Conference on Management of data, pages 721--732. ACM, 2011.
[33]
S. E. Schaeffer. Graph clustering. Computer Science Review, 1(1):27--64, 2007.
[34]
O. Sporns and R. Kötter. Motifs in brain networks. PLoS Biol, 2(11):e369, 2004.
[35]
S. L. Tauro, C. Palmer, G. Siganos, and M. Faloutsos. A simple conceptual model for the internet topology. In Global Telecommunications Conference, 2001. GLOBECOM'01. IEEE, volume 3, pages 1667--1671. IEEE, 2001.
[36]
C. Tsourakakis, J. Pachocki, and M. Mitzenmacher. Scalable motif-aware graph clustering. arXiv preprint arXiv:1606.06235, 2016.
[37]
C. Tsourakakis. The k-clique densest subgraph problem. 24th International World Wide Web Conference (WWW), 2015.
[38]
T. Uno. An efficient algorithm for solving pseudo clique enumeration problem. Algorithmica, 56(1), 2010.
[39]
V. V. Vazirani. Approximation algorithms. Springer Science & Business Media, 2013.
[40]
U. Von Luxburg. A tutorial on spectral clustering. Statistics and computing, 17(4):395--416, 2007.
[41]
S. Wasserman and K. Faust. Social network analysis: Methods and applications, volume 8. Cambridge university press, 1994.
[42]
D. J. Watts and S. H. Strogatz. Collective dynamics of small-world networks. Nature, 393:440--442, 1998.
[43]
J. Yang and J. Leskovec. Defining and evaluating network communities based on ground-truth. Knowledge & Information Systems, 42(1):181--213, 2015.

Cited By

View all
  • (2025)Motif-aware curriculum learning for node classificationNeural Networks10.1016/j.neunet.2024.107089184(107089)Online publication date: Apr-2025
  • (2024)On the role of edge dependency in graph generative modelsProceedings of the 41st International Conference on Machine Learning10.5555/3692070.3692314(6325-6345)Online publication date: 21-Jul-2024
  • (2024)Fast algorithms for hypergraph pagerank with applications to semi-supervised learningProceedings of the 41st International Conference on Machine Learning10.5555/3692070.3692125(1306-1330)Online publication date: 21-Jul-2024
  • Show More Cited By

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Other conferences
WWW '17: Proceedings of the 26th International Conference on World Wide Web
April 2017
1678 pages
ISBN:9781450349130

Sponsors

  • IW3C2: International World Wide Web Conference Committee

In-Cooperation

Publisher

International World Wide Web Conferences Steering Committee

Republic and Canton of Geneva, Switzerland

Publication History

Published: 03 April 2017

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. community detection
  2. expanders
  3. graph clustering
  4. large-scale graph mining

Qualifiers

  • Research-article

Funding Sources

Conference

WWW '17
Sponsor:
  • IW3C2

Acceptance Rates

WWW '17 Paper Acceptance Rate 164 of 966 submissions, 17%;
Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)253
  • Downloads (Last 6 weeks)39
Reflects downloads up to 03 Mar 2025

Other Metrics

Citations

Cited By

View all
  • (2025)Motif-aware curriculum learning for node classificationNeural Networks10.1016/j.neunet.2024.107089184(107089)Online publication date: Apr-2025
  • (2024)On the role of edge dependency in graph generative modelsProceedings of the 41st International Conference on Machine Learning10.5555/3692070.3692314(6325-6345)Online publication date: 21-Jul-2024
  • (2024)Fast algorithms for hypergraph pagerank with applications to semi-supervised learningProceedings of the 41st International Conference on Machine Learning10.5555/3692070.3692125(1306-1330)Online publication date: 21-Jul-2024
  • (2024)DeSCo: Towards Generalizable and Scalable Deep Subgraph CountingProceedings of the 17th ACM International Conference on Web Search and Data Mining10.1145/3616855.3635788(218-227)Online publication date: 4-Mar-2024
  • (2024)Motif-Based Contrastive Learning for Community DetectionIEEE Transactions on Neural Networks and Learning Systems10.1109/TNNLS.2024.336787335:9(11706-11719)Online publication date: Sep-2024
  • (2024)MARML: Motif-Aware Deep Representation Learning in Multilayer NetworksIEEE Transactions on Neural Networks and Learning Systems10.1109/TNNLS.2023.334134735:9(11649-11660)Online publication date: Sep-2024
  • (2024)Higher Order Fuzzy Membership in Motif Modularity OptimizationIEEE Transactions on Fuzzy Systems10.1109/TFUZZ.2024.348271732:12(7143-7156)Online publication date: Dec-2024
  • (2024)Higher-Order Directed Community Detection by A Multiobjective Evolutionary FrameworkIEEE Transactions on Artificial Intelligence10.1109/TAI.2024.34366595:12(6536-6550)Online publication date: Dec-2024
  • (2024)An Elemental Decomposition of DNS Name-to-IP GraphsIEEE INFOCOM 2024 - IEEE Conference on Computer Communications10.1109/INFOCOM52122.2024.10621147(1661-1670)Online publication date: 20-May-2024
  • (2024)From Motif to Path: Connectivity and Homophily2024 IEEE 40th International Conference on Data Engineering (ICDE)10.1109/ICDE60146.2024.00227(2751-2764)Online publication date: 13-May-2024
  • Show More Cited By

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Login options

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media