skip to main content
10.1145/1772690.1772755acmotherconferencesArticle/Chapter ViewAbstractPublication PagesthewebconfConference Proceedingsconference-collections
research-article

Empirical comparison of algorithms for network community detection

Published: 26 April 2010 Publication History

Abstract

Detecting clusters or communities in large real-world graphs such as large social or information networks is a problem of considerable interest. In practice, one typically chooses an objective function that captures the intuition of a network cluster as set of nodes with better internal connectivity than external connectivity, and then one applies approximation algorithms or heuristics to extract sets of nodes that are related to the objective function and that "look like" good communities for the application of interest.
In this paper, we explore a range of network community detection methods in order to compare them and to understand their relative performance and the systematic biases in the clusters they identify. We evaluate several common objective functions that are used to formalize the notion of a network community, and we examine several different classes of approximation algorithms that aim to optimize such objective functions. In addition, rather than simply fixing an objective and asking for an approximation to the best cluster of any size, we consider a size-resolved version of the optimization problem. Considering community quality as a function of its size provides a much finer lens with which to examine community detection algorithms, since objective functions and approximation algorithms often have non-obvious size-dependent behavior.

References

[1]
Supporting website. http://snap.stanford.edu/ncp/.
[2]
R. Andersen, F. Chung, and K. Lang. Local graph partitioning using PageRank vectors. In FOCS '06: Proceedings of the 47th Annual IEEE Symposium on Foundations of Computer Science, pages 475--486, 2006.
[3]
R. Andersen and K. Lang. Communities from seed sets. In WWW '06: Proceedings of the 15th International Conference on World Wide Web, pages 223--232, 2006.
[4]
S. Arora, S. Rao, and U. Vazirani. Expander flows, geometric embeddings and graph partitioning. In STOC '04: Proceedings of the 36th annual ACM Symposium on Theory of Computing, pages 222--231, 2004.
[5]
S. Burer and R. Monteiro. A nonlinear programming algorithm for solving semidefinite programs via low-rank factorization. Mathematical Programming (series B), 95(2):329--357, 2003.
[6]
F. Chung. Spectral graph theory, volume 92 of CBMS Regional Conference Series in Mathematics. American Mathematical Society, 1997.
[7]
A. Clauset. Finding local community structure in networks. Physical Review E, 72:026132, 2005.
[8]
A. Clauset, M. Newman, and C. Moore. Finding community structure in very large networks. Physical Review E, 70:066111, 2004.
[9]
I. Dhillon, Y. Guan, and B. Kulis. Weighted graph cuts without eigenvectors: A multilevel approach. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(11):1944--1957, 2007.
[10]
G. Flake, S. Lawrence, and C. Giles. Efficient identification of web communities. In KDD '00: Proceedings of the 6th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 150--160, 2000.
[11]
G. Flake, R. Tarjan, and K.Tsioutsiouliklis. Graph clustering and minimum cut trees. Internet Mathematics, 1(4):385--408, 2003.
[12]
S. Fortunato. Community detection in graphs. arXiv:0906.0612, June 2009.
[13]
S. Fortunato and M. Barthélemy. Resolution limit in community detection. Proceedings of the National Academy of Sciences of the United States of America, 104(1):36--41, 2007.
[14]
M. Gaertler. Clustering. In U. Brandes and T. Erlebach, editors, Network Analysis: Methodological Foundations, pages 178--215. Springer, 2005.
[15]
G. Gallo, M. Grigoriadis, and R. Tarjan. A fast parametric maximum flow algorithm and applications. SIAM Journal on Computing, 18(1):30--55, 1989.
[16]
M. Girvan and M. Newman. Community structure in social and biological networks. Proceedings of the National Academy of Sciences of the United States of America, 99(12):7821--7826, 2002.
[17]
R. Guimerà, M. Sales-Pardo, and L. Amaral. Modularity from fluctuations in random graphs and complex networks. Physical Review E, 70:025101, 2004.
[18]
R. Kannan, S. Vempala, and A. Vetta. On clusterings: Good, bad and spectral. Journal of the ACM, 51(3):497--515, 2004.
[19]
B. Karrer, E. Levina, and M. Newman. Robustness of community structure in networks. Physical Review E, 77:046119, 2008.
[20]
G. Karypis and V. Kumar. A fast and high quality multilevel scheme for partitioning irregular graphs. SIAM Journal on Scientific Computing, 20:359--392, 1998.
[21]
A. Lancichinetti and S. Fortunato. Community detection algorithms: a comparative analysis. arXiv:0908.1062, August 2009.
[22]
K. Lang and S. Rao. Finding near-optimal cuts: an empirical evaluation. In SODA '93: Proceedings of the 4th annual ACM-SIAM Symposium on Discrete algorithms, pages 212--221, 1993.
[23]
K. Lang and S. Rao. A flow-based method for improving the expansion or conductance of graph cuts. In IPCO '04: Proceedings of the 10th International IPCO Conference on Integer Programming and Combinatorial Optimization, pages 325--337, 2004.
[24]
T. Leighton and S. Rao. An approximate max-flow min-cut theorem for uniform multicommodity flow problems with applications to approximation algorithms. In FOCS '88: Proceedings of the 28th Annual Symposium on Foundations of Computer Science, pages 422--431, 1988.
[25]
T. Leighton and S. Rao. Multicommodity max-flow min-cut theorems and their use in designing approximation algorithms. Journal of the ACM, 46(6):787--832, 1999.
[26]
J. Leskovec, K. Lang, A. Dasgupta, and M. Mahoney. Community structure in large networks: Natural cluster sizes and the absence of large well-defined clusters. arXiv:0810.1355, October 2008.
[27]
J. Leskovec, K. Lang, A. Dasgupta, and M. Mahoney. Statistical properties of community structure in large social and information networks. In WWW '08: Proceedings of the 17th International Conference on World Wide Web, pages 695--704, 2008.
[28]
M. Newman. Modularity and community structure in networks. Proceedings of the National Academy of Sciences of the United States of America, 103(23):8577--8582, 2006.
[29]
M. Newman and M. Girvan. Finding and evaluating community structure in networks. Physical Review E, 69:026113, 2004.
[30]
F. Radicchi, C. Castellano, F. Cecconi, V. Loreto, and D. Parisi. Defining and identifying communities in networks. Proceedings of the National Academy of Sciences of the United States of America, 101(9):2658--2663, 2004.
[31]
S. Schaeffer. Graph clustering. Computer Science Review, 1(1):27--64, 2007.
[32]
J. Shi and J. Malik. Normalized cuts and image segmentation. IEEE Transcations of Pattern Analysis and Machine Intelligence, 22(8):888--905, 2000.
[33]
D. Spielman and S.-H. Teng. Spectral partitioning works: Planar graphs and finite element meshes. In FOCS '96: Proceedings of the 37th Annual IEEE Symposium on Foundations of Computer Science, pages 96--107, 1996.
[34]
S. White and P. Smyth. A spectral clustering approach to finding communities in graphs. In SDM '05: Proceedings of the 5th SIAM International Conference on Data Mining, pages 76--84, 2005.

Cited By

View all
  • (2025)Adaptive Neural Message Passing for Inductive Learning on HypergraphsIEEE Transactions on Pattern Analysis and Machine Intelligence10.1109/TPAMI.2024.343448347:1(19-31)Online publication date: Jan-2025
  • (2025)Scalable High-Performance Community Detection Using Label Propagation in Massive NetworksSocial Networks Analysis and Mining10.1007/978-3-031-78541-2_1(3-19)Online publication date: 24-Jan-2025
  • (2025)VLP: A Label Propagation Algorithm for Community Detection in Complex NetworksSocial Networks Analysis and Mining10.1007/978-3-031-78538-2_30(343-353)Online publication date: 25-Jan-2025
  • Show More Cited By

Index Terms

  1. Empirical comparison of algorithms for network community detection

    Recommendations

    Comments

    Information & Contributors

    Information

    Published In

    cover image ACM Other conferences
    WWW '10: Proceedings of the 19th international conference on World wide web
    April 2010
    1407 pages
    ISBN:9781605587998
    DOI:10.1145/1772690

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 26 April 2010

    Permissions

    Request permissions for this article.

    Check for updates

    Author Tags

    1. community structure
    2. conductance
    3. flow-based methods
    4. graph partitioning
    5. spectral methods

    Qualifiers

    • Research-article

    Conference

    WWW '10
    WWW '10: The 19th International World Wide Web Conference
    April 26 - 30, 2010
    North Carolina, Raleigh, USA

    Acceptance Rates

    Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)220
    • Downloads (Last 6 weeks)22
    Reflects downloads up to 15 Feb 2025

    Other Metrics

    Citations

    Cited By

    View all
    • (2025)Adaptive Neural Message Passing for Inductive Learning on HypergraphsIEEE Transactions on Pattern Analysis and Machine Intelligence10.1109/TPAMI.2024.343448347:1(19-31)Online publication date: Jan-2025
    • (2025)Scalable High-Performance Community Detection Using Label Propagation in Massive NetworksSocial Networks Analysis and Mining10.1007/978-3-031-78541-2_1(3-19)Online publication date: 24-Jan-2025
    • (2025)VLP: A Label Propagation Algorithm for Community Detection in Complex NetworksSocial Networks Analysis and Mining10.1007/978-3-031-78538-2_30(343-353)Online publication date: 25-Jan-2025
    • (2024)Community Detection on Social Networks With Sentimental InteractionInternational Journal on Semantic Web & Information Systems10.4018/IJSWIS.34123220:1(1-23)Online publication date: 9-Apr-2024
    • (2024)Correlating Histopathological Microscopic Images of Creutzfeldt–Jakob Disease with Clinical Typology Using Graph Theory and Artificial IntelligenceMachine Learning and Knowledge Extraction10.3390/make60300996:3(2018-2032)Online publication date: 7-Sep-2024
    • (2024)Preliminary Studies to Bridge the Gap: Leveraging Informal Software Architecture Artifacts for Structured Model CreationInformation10.3390/info1510064215:10(642)Online publication date: 15-Oct-2024
    • (2024)Private Settlement in Blockchain SystemsSSRN Electronic Journal10.2139/ssrn.4808685Online publication date: 2024
    • (2024)Remote Work admist the Covid-19 outbreak: Insights from an Ensemble Community-Based Keyword Network AnalysisSSRN Electronic Journal10.2139/ssrn.4795978Online publication date: 2024
    • (2024)Remote Work Admist the Covid-19 outbreak: Insights from an Ensemble Community-Based Keyword Network AnalysisSSRN Electronic Journal10.2139/ssrn.4777889Online publication date: 2024
    • (2024)GetCom: An Efficient and Generalizable Framework for Community DetectionProceedings of the 33rd ACM International Conference on Information and Knowledge Management10.1145/3627673.3679865(2650-2659)Online publication date: 21-Oct-2024
    • Show More Cited By

    View Options

    Login options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    EPUB

    View this article in ePub.

    ePub

    Figures

    Tables

    Media

    Share

    Share

    Share this Publication link

    Share on social media