ABSTRACT
Finding dense subgraphs in large graphs is a key primitive in a variety of real-world application domains, encompassing social network analytics, event detection, biology, and finance. In most such applications, one typically aims at finding several (possibly overlapping) dense subgraphs which might correspond to communities in social networks or interesting events. While a large amount of work is devoted to finding a single densest subgraph, perhaps surprisingly, the problem of finding several dense subgraphs with limited overlap has not been studied in a principled way, to the best of our knowledge. In this work we define and study a natural generalization of the densest subgraph problem, where the main goal is to find at most $k$ subgraphs with maximum total aggregate density, while satisfying an upper bound on the pairwise Jaccard coefficient between the sets of nodes of the subgraphs. After showing that such a problem is NP-Hard, we devise an efficient algorithm that comes with provable guarantees in some cases of interest, as well as, an efficient practical heuristic. Our extensive evaluation on large real-world graphs confirms the efficiency and effectiveness of our algorithms.
- R. Andersen and K. Chellapilla. Finding dense subgraphs with size bounds. In WAW, 2009. Google ScholarDigital Library
- A. Angel, N. Sarkas, N. Koudas, and D. Srivastava. Dense subgraph maintenance under streaming edge weight updates for real-time story identification. PVLDB, 5(6), 2012. Google ScholarDigital Library
- A. Angel, N. Sarkas, N. Koudas, and D. Srivastava. Dense subgraph maintenance under streaming edge weight updates for real-time story identification. PVLDB, 5(6), 2012. Google ScholarDigital Library
- Y. Asahiro, R. Hassin, and K. Iwama. Complexity of finding dense subgraphs. Discr. Ap. Math., 121(1-3), 2002. Google ScholarDigital Library
- Y. Asahiro, K. Iwama, H. Tamaki, and T. Tokuyama. Greedily finding a dense subgraph. J. Algorithms, 34(2), 2000. Google ScholarDigital Library
- B. Bahmani, R. Kumar, and S. Vassilvitskii. Densest subgraph in streaming and mapreduce. PVLDB, 5(5), 2012. Google ScholarDigital Library
- A. Bhaskara, M. Charikar, E. Chlamtac, U. Feige, and A. Vijayaraghavan. Detecting high log-densities: an O(n 1/4) approximation for densest k-subgraph. In STOC, pages 201{210, 2010. Google ScholarDigital Library
- F. Bonchi, F. Gullo, A. Kaltenbrunner, and Y. Volkovich. Core decomposition of uncertain graphs. In KDD, 2014. Google ScholarDigital Library
- M. Charikar. Greedy approximation algorithms for finding dense components in a graph. In K. Jansen and S. Khuller, editors, APPROX. Springer, 2000. Google ScholarDigital Library
- J. Chen and Y. Saad. Dense subgraph extraction with application to community detection. TKDE, 24(7), 2012. Google ScholarDigital Library
- W. Cui, Y. Xiao, H. Wang, Y. Lu, and W. Wang. Online search of overlapping communities. In SIGMOD, 2013. Google ScholarDigital Library
- X. Du, R. Jin, L. Ding, V. E. Lee, and J. H. Thornton, Jr. Migration motif: a spatial - temporal pattern mining approach for financial markets. In KDD, 2009. Google ScholarDigital Library
- E. Fratkin, B. T. Naughton, D. L. Brutlag, and S. Batzoglou. MotifCut: regulatory motifs finding with maximum density subgraphs. In ISMB, 2006.Google ScholarDigital Library
- D. Gibson, R. Kumar, and A. Tomkins. Discovering large dense subgraphs in massive graphs. In VLDB, 2005. Google ScholarDigital Library
- A. V. Goldberg. Finding a maximum density subgraph. Technical report, University of California at Berkeley, 1984. Google ScholarDigital Library
- S. Khot. Ruling out PTAS for graph min-bisection, dense k-subgraph, and bipartite clique. J. Computing, 36(4), 2006. Google ScholarDigital Library
- S. Khuller and B. Saha. On finding dense subgraphs. In ICALP, 2009. Google ScholarDigital Library
- M. A. Langston and et al. A combinatorial approach to the analysis of differential gene expression data: The use of graph algorithms for disease prediction and screening. In Methods of Microarray Data Analysis IV. 2005.Google ScholarCross Ref
- V. E. Lee, N. Ruan, R. Jin, and C. C. Aggarwal. A survey of algorithms for dense subgraph discovery. In Managing and Mining Graph Data. 2010.Google ScholarCross Ref
- C. H. Papadimitriou and M. Yannakakis. Optimization, approximation, and complexity classes. J. Comput. Syst. Sci., 43(3), 1991.Google Scholar
- M. Sozio and A. Gionis. The community-search problem and how to plan a successful cocktail party. In KDD, pages 939{948, 2010. Google ScholarDigital Library
- N. Tatti and A. Gionis. Discovering nested communities. In ECML/PKDD (2), 2013.Google ScholarCross Ref
- C. Tsourakakis, F. Bonchi, A. Gionis, F. Gullo, and M. Tsiarli. Denser than the densest subgraph: Extracting optimal quasi-cliques with quality guarantees. In KDD, 2013. Google ScholarDigital Library
- E. Valari, M. Kontaki, and A. N. Papadopoulos. Discovery of top-k dense subgraphs in dynamic graph collections. In SSDBM, 2012. Google ScholarDigital Library
- N. Wang, J. Zhang, K.-L. Tan, and A. K. H. Tung. On triangulation-based dense neighborhood graph discovery. PVLDB, 4(2), 2010. Google ScholarDigital Library
Index Terms
Finding Subgraphs with Maximum Total Density and Limited Overlap
Recommendations
Complexity of finding dense subgraphs
The k-f(k) dense subgraph problem ((k, f(k))-DSP) asks whether there is a k-vertex subgraph of a given graph G which has at least f(k) edges. When f(k)=k(k - 1)/2, (k,f(k))-DSP is equivalent to the well-known k-clique problem. The main purpose of this ...
Finding Subgraphs with Maximum Total Density and Limited Overlap in Weighted Hypergraphs
Finding dense subgraphs in large (hyper)graphs is a key primitive in a variety of real-world application domains, encompassing social network analytics, event detection, biology, and finance. In most such applications, one typically aims at finding ...
Exact MIP-based approaches for finding maximum quasi-cliques and dense subgraphs
Given a simple graph and a constant $$\gamma \in (0,1]$$ (0,1], a $$\gamma $$ -quasi-clique is defined as a subset of vertices that induces a subgraph with an edge density of at least $$\gamma $$ . This well-known clique relaxation model arises in a ...
Comments