research-article

Detecting cohesive and 2-mode communities indirected and undirected networks

Authors:
Jaewon Yang

Stanford University, Stanford, CA, USA

Stanford University, Stanford, CA, USA
View Profile

,
Julian McAuley

Stanford University, Stanford, USA

Stanford University, Stanford, USA
View Profile

,
Jure Leskovec

Stanford University, stanford, CA, USA

Stanford University, stanford, CA, USA
View Profile

WSDM '14: Proceedings of the 7th ACM international conference on Web search and data miningFebruary 2014Pages 323–332https://doi.org/10.1145/2556195.2556243

Published:24 February 2014Publication History

WSDM '14: Proceedings of the 7th ACM international conference on Web search and data mining

Pages 323–332

ABSTRACT

Networks are a general language for representing relational information among objects. An effective way to model, reason about, and summarize networks, is to discover sets of nodes with common connectivity patterns. Such sets are commonly referred to as network communities. Research on network community detection has predominantly focused on identifying communities of densely connected nodes in undirected networks.

In this paper we develop a novel overlapping community detection method that scales to networks of millions of nodes and edges and advances research along two dimensions: the connectivity structure of communities, and the use of edge directedness for community detection. First, we extend traditional definitions of network communities by building on the observation that nodes can be densely interlinked in two different ways: In cohesive communities nodes link to each other, while in 2-mode communities nodes link in a bipartite fashion, where links predominate between the two partitions rather than inside them. Our method successfully detects both 2-mode as well as cohesive communities, that may also overlap or be hierarchically nested. Second, while most existing community detection methods treat directed edges as though they were undirected, our method accounts for edge directions and is able to identify novel and meaningful community structures in both directed and undirected networks, using data from social, biological, and ecological domains.

References

L. Adamic and N. Glance. The political blogosphere and the 2004 U.S. election: divided they blog. In LinkKDD '05, 2005. Google ScholarDigital Library
Y.-Y. Ahn, J. Bagrow, and S. Lehmann. Link communities reveal multi-scale complexity in networks. Nature, 2010.Google Scholar
E. Airoldi, D. Blei, S. Fienberg, and E. Xing. Mixed membership stochastic blockmodels. JMLR, 2007. Google ScholarDigital Library
R. Andersen and K. Lang. Communities from seed sets. In WWW '06, 2006. Google ScholarDigital Library
R. Balasubramanyan and W. Cohen. Block-LDA: Jointly modeling entity-annotated text and entity-entity links. In SDM '11, 2011.Google ScholarCross Ref
B. Ball, B. Karrer, and M. Newman. Efficient and principled method for detecting communities in networks. In Phys. Rev. E, 2011.Google Scholar
E. Boyle et al. GO::TermFinder--open source software for accessing gene ontology information and finding significantly enriched gene ontology terms associated with a list of genes. Bioinformatics, 2004. Google ScholarDigital Library
L. Breiger. The duality of persons and groups. Social Forces, 1974.Google ScholarCross Ref
R. Burt. Cohesion versus structural equivalence as a basis for network subgroups. Sociological Methods and Research, 1978.Google ScholarCross Ref
D. Carney, B. Davies, and B. Horazdovsky. Vps9 domain-containing proteins: activators of Rab5 GTPases from yeast to neurons. Trends in Cell Biology, 2006.Google ScholarCross Ref
M. Coscia, G. Rossetti, F. Giannotti, and D. Pedreschi. Demon: a local-first discovery method for overlapping communities. In KDD '12, 2012. Google ScholarDigital Library
L. Danon, J. Duch, A. Diaz-Guilera, and A. Arenas. Comparing community structure identification. J. of Stat. Mech.: Theory and Experiment, 2005.Google Scholar
I. Dhillon, Y. Guan, and B. Kulis. Weighted graph cuts without eigenvectors: A multilevel approach. IEEE PAMI, 2007. Google ScholarDigital Library
S. Fortunato. Community detection in graphs. Physics Reports, 2010.Google ScholarCross Ref
D. Gleich and Seshadhri. Neighborhoods are good communities. In KDD '12, 2012.Google Scholar
P. Gopalan, D. Mimno, S. Gerrish, M. Freedman, and D. Blei. Scalable inference of overlapping communities. In NIPS '12, 2012.Google Scholar
K. Henderson, B. Gallagher, T. Eliassi-Rad, H. Tong, S. Basu, L. Akoglu, D. Koutra, C. Faloutsos, and L. Li. Rolx: structural role extraction & mining in large graphs. In KDD, 2012. Google ScholarDigital Library
P. W. Holland, K. B. Laskey, and S. Leinhardt. Stochastic blockmodels: First steps. Social Networks, 1983.Google Scholar
C.-J. Hsieh and I. S. Dhillon. Fast coordinate descent methods with variable selection for non-negative matrix factorization. In KDD '11, 2011. Google ScholarDigital Library
S. Johnson. Hierarchical clustering schemes. Psychometrika, 1967.Google ScholarCross Ref
G. Karypis and V. Kumar. Multilevel k-way partitioning scheme for irregular graphs. J. of Parallel and Distributed Computing, 1998. Google ScholarDigital Library
B. Klimt and Y. Yang. Introducing the enron corpus. In CEAS '04, 2004.Google Scholar
R. Kumar, P. Raghavan, S. Rajagopalan, and A. Tomkins. Trawling the web for emerging cyber-communities. Computer Networks, 1999. Google ScholarDigital Library
H. Kwak, C. Lee, H. Park, and S. Moon. What is twitter, a social network or a news media? In WWW '10, 2010. Google ScholarDigital Library
S. Lattanzi and D. Sivakumar. Affiliation networks. In STOC '09, 2009. Google ScholarDigital Library
E. Leicht and M. E. Newman. Community structure in directed networks. Phys. Rev. Lett., 2008.Google Scholar
J. Leskovec, J. Kleinberg, and C. Faloutsos. Graphs over time: densification laws, shrinking diameters and possible explanations. In KDD '05, 2005. Google ScholarDigital Library
J. Leskovec, K. J. Lang, A. Dasgupta, and M. W. Mahoney. Community structure in large networks: Natural cluster sizes and the absence of large well-defined clusters. Internet Mathematics, 2009.Google ScholarCross Ref
C-J. Lin. Projected Gradient Methods for Nonnegative Matrix Factorization. Neural Computation, 2007. Google ScholarDigital Library
F. Malliaros and M. Vazirgiannis. Clustering and Community Detection in Directed Networks: A Survey. Physics Reports, 2013.Google ScholarCross Ref
J. McAuley and J. Leskovec. Learning to discover social circles in ego networks. In NIPS '12, 2012.Google Scholar
A. McCallum, X. Wang, and A. Corrada-Emmanuel. Topic and role discovery in social networks with experiments on enron and academic email. JAIR, 2007. Google ScholarDigital Library
A. Mislove, M. Marcon, K. P. Gummadi, P. Druschel, and B. Bhattacharjee. Measurement and analysis of online social networks. In IMC '07, 2007. Google ScholarDigital Library
G. Palla, I. Derényi, I. Farkas, and T. Vicsek. Uncovering the overlapping community structure of complex networks in nature and society. Nature, 2005.Google Scholar
S. Papadopoulos, Y. Kompatsiaris, A. Vakali, and P. Spyridonos. Community detection in social media. DMKD, 2011. Google ScholarDigital Library
S. Pinkert, J. Schultz, and J. Reichardt. Protein interaction networks---more than mere modules. PLoS CompBio, 2010.Google ScholarCross Ref
F. Radicchi, C. Castellano, F. Cecconi, V. Loreto, and D. Parisi. Defining and identifying communities in networks. PNAS, 2004.Google ScholarCross Ref
M. Rosvall and C. Bergstrom. An information-theoretic framework for resolving community structure in complex networks. PNAS, 2007.Google ScholarCross Ref
M. Rosvall and C. T. Bergstrom. Maps of random walks on complex networks reveal community structure. PNAS, 2008.Google ScholarCross Ref
V. Satuluri and S. Parthasarathy. Scalable Graph Clustering using Stochastic Flows: Applications to Community Discovery. KDD '09, 2009. Google ScholarDigital Library
R. Ulanowicz, C. Bondavalli, and M. Egnotovich. Network analysis of trophic dynamics in south florida ecosystem, fy 97: The florida bay ecosystem. Ref. CBL98-123. Chesapeake Biological Laboratory, 1998.Google Scholar
J. Xie, S. Kelley, and B. K. Szymanski. Overlapping community detection in networks: the state of the art and comparative study. ACM Computing Surveys, 2013. Google ScholarDigital Library
J. Yang and J. Leskovec. Community-affiliation network model for overlapping community detection. In ICDM '12, 2012. Google ScholarDigital Library
J. Yang and J. Leskovec. Defining and evaluating network communities based on ground-truth. In ICDM '12, 2012. Google ScholarDigital Library
J. Yang and J. Leskovec. Overlapping community detection at scale: A non-negative factorization approach. In WSDM '13, 2013. Google ScholarDigital Library
H. Yu, P. Braun, M. A. Yildirim, I. Lemmens, K. Venkatesan, J. Sahalie, T. Hirozane-Kishikawa, F. Gebreab, N. Li, N. Simonis, and et al. High-quality binary protein interaction map of the yeast interactome network. Science, 2008.Google Scholar
E. Zheleva, H. Sharara, and L. Getoor. Co-evolution of social and affiliation networks. In KDD '09, 2009. Google ScholarDigital Library

Index Terms

Detecting cohesive and 2-mode communities indirected and undirected networks
1. Information systems
  1. Information systems applications
    1. Data mining

Recommendations

Overlapping community detection at scale: a nonnegative matrix factorization approach
WSDM '13: Proceedings of the sixth ACM international conference on Web search and data mining

Network communities represent basic structures for understanding the organization of real-world networks. A community (also referred to as a module or a cluster) is typically thought of as a group of nodes with more connections amongst its members than ...
Read More
Structure and Overlaps of Ground-Truth Communities in Networks
Special Issue on Linking Social Granularity and Functions

One of the main organizing principles in real-world networks is that of network communities, where sets of nodes organize into densely linked clusters. Even though detection of such communities is of great interest, understanding the structure ...
Read More
Defining and evaluating network communities based on ground-truth

Nodes in real-world networks organize into densely linked communities where edges appear with high concentration among the members of the community. Identifying such communities of nodes has proven to be a challenging task due to a plethora of ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

Published in
WSDM '14: Proceedings of the 7th ACM international conference on Web search and data mining
February 2014
712 pages
ISBN:9781450323512
DOI:10.1145/2556195
General Chairs:
Ben Carterette
University of Delaware, USA
,
Fernando Diaz
Microsoft Research, USA
,
Program Chairs:
Carlos Castillo
Qatar Computing Research Institute, Qatar
,
Donald Metzler
Google, USA
Copyright © 2014 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 24 February 2014
Permissions
Request permissions about this article.
Request Permissions

Check for updates
Author Tags
2-mode communities
network communities
overlapping community detection
Qualifiers
- research-article
Conference

Acceptance Rates
WSDM '14 Paper Acceptance Rate64of355submissions,18%Overall Acceptance Rate498of2,863submissions,17%
More
Upcoming Conference
WSDM '25

Sponsor:

sigir

sigir

sigir

sigir

The Eighteenth ACM International Conference on Web Search and Data Mining

April 7 - 11, 2025

Hannover , Germany
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 51
  Total Citations
  View Citations
- 482
  Total Downloads
- Downloads (Last 12 months)12
- Downloads (Last 6 weeks)3
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Detecting cohesive and 2-mode communities indirected and undirected networks

WSDM '14: Proceedings of the 7th ACM international conference on Web search and data mining

ABSTRACT

References

Cited By

Index Terms

Recommendations

Overlapping community detection at scale: a nonnegative matrix factorization approach

Structure and Overlaps of Ground-Truth Communities in Networks

Defining and evaluating network communities based on ground-truth