ABSTRACT
We propose two approximation algorithms for identifying communities in dynamic social networks. Communities are intuitively characterized as "unusually densely knit" subsets of a social network. This notion becomes more problematic if the social interactions change over time. Aggregating social networks over time can radically misrepresent the existing and changing community structure. Recently, we have proposed an optimization-based framework for modeling dynamic community structure. Also, we have proposed an algorithm for finding such structure based on maximum weight bipartite matching. In this paper, we analyze its performance guarantee for a special case where all actors can be observed at all times. In such instances, we show that the algorithm is a small constant factor approximation of the optimum. We use a similar idea to design an approximation algorithm for the general case where some individuals are possibly unobserved at times, and to show that the approximation factor increases twofold but remains a constant regardless of the input size. This is the first algorithm for inferring communities in dynamic networks with a provable approximation guarantee. We demonstrate the general algorithm on real data sets. The results confirm the efficiency and effectiveness of the algorithm in identifying dynamic communities.
Supplemental Material
- J. Aizen, D. Huttenlocher, J. Kleinberg, and A. Novak. Traffic-based feedback on the web. Proc. National Academy of Sciences, 101(Suppl.1):5254--5260, 2004.Google ScholarCross Ref
- L. Backstrom, D. Huttenlocher, J. Kleinberg, and X. Lan. Group formation in large social networks: Membership, growth, and evolution. In Proc. 12th ACM SIGKDD Intl. Conf. on Knowledge Discovery and Data Mining, 2006. Google ScholarDigital Library
- J. Baumes, M. Goldberg, M. Magdon-Ismail, and W. Wallace. Discovering hidden groups in communication networks. In Proc. 2nd NSF/NIJ Symposium on Intelligence and Security Informatics, 2004.Google ScholarCross Ref
- T. Y. Berger-Wolf and J. Saia. A framework for analysis of dynamic social networks. In Proc. 12th ACM SIGKDD Intl. Conf. on Knowledge Discovery and Data Mining, pages 523--528, New York, NY, USA, 2006. ACM Press. Google ScholarDigital Library
- K. Carley and M. Prietula, editors. Computational Organization Theory. Lawrence Erlbaum associates, Hillsdale, NJ, 2001. Google ScholarDigital Library
- T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein. Introduction to Algorithms. MIT Press, Cambridge, MA, USA, 2001. Google ScholarDigital Library
- D. P. Croft, R. James, P. Thomas, C. Hathaway, D. Mawdsley, K. Laland, and J. Krause. Social structure and co-operative interactions in a wild population of guppies (Poecilia reticulata). Behavioural Ecology and Sociobiology, In Press.Google Scholar
- P. C. Cross, J. O. Lloyd-Smith, and W. M. Getz. Disentangling association patterns in fission-fusion societies using African buffalo as an example. Animal Behaviour, 69:499--506, 2005.Google ScholarCross Ref
- A. Davis, B. B. Gardner, and M. R. Gardner. Deep South. The University of Chicago Press, Chicago, IL, 1941.Google Scholar
- R. Diestel. Graph Theory (Graduate Texts in Mathematics). Springer, August 2005.Google Scholar
- N. Eagle and A. Pentland. Reality mining: Sensing complex social systems. J. Personal and Ubiquitous Computing, 2006. Google ScholarDigital Library
- I. R. Fischhoff, S. R. Sundaresan, J. Cordingley, H. M. Larkin, M.-J. Sellier, and D. I. Rubenstein. Social relationships and reproductive state influence leadership roles in movements of plains zebra (Equus burchellii). Animal Behaviour, 73: 825--831, 2007.Google ScholarCross Ref
- I. R. Fischhoff, S. R. Sundaresan, J. Cordingley, and D. I. Rubenstein. Habitat use and movements of plains zebra (Equus burchelli) in response to predation danger from lions. Behavioral Ecology, 18(4): 725--729, 2007.Google ScholarCross Ref
- G. Flake, S. Lawrence, C. L. Giles, and F. Coetzee. Self-organization and identification of web communities. IEEE Computer, 35(3), 2002. Google ScholarDigital Library
- D. Franzblau and A. Raychaudhuri. Optimal Hamiltonian completions and path covers for trees, and a reduction to maximum flow. Anziam journal, 44(2):193--204, 2002.Google ScholarCross Ref
- L. Freeman. Finding social groups: A meta-analysis of the southern women data. In R. Breiger, K. Carley, and P. Pattison, editors, Dynamic Social Network Modeling and Analysis. The National Academies Press, Washington, D.C., 2003.Google Scholar
- L. C. Freeman. On the sociological concept of "group": a empirical test of two models. American Journal of Sociology, 98:152--166, 1993.Google ScholarCross Ref
- L. Getoor and C. Diehl. Link mining: A survey. SIGKDD Explorations Special Issue on Link Mining, 7(2), December 2005. Google ScholarDigital Library
- D. Gibson, J. M. Kleinberg, and P. Raghavan. Inferring web communities from link topology. In UK Conf. on Hypertext, pages 225--234, 1998. Google ScholarDigital Library
- M. Girvan and M. E. J. Newman. Community structure in social and biological networks. Proc. Natl. Acad. Sci., 99:8271--8276, 2002.Google ScholarCross Ref
- D. Gruhl, R. Guha, D. Liben-Nowell, and A. Tomkins. Information diffusion through blogspace. In Proc. 13th Intl. Conf. on World Wide Web, pages 491--501, New York, NY, USA, 2004. ACM Press. Google ScholarDigital Library
- J. Hopcroft, O. Khan, B. Kulis, and B. Selman. Natural communities in large linked networks. In Proc. 9th ACM SIGKDD Intl. Conf. on Knowledge Discovery and Data Mining, pages 541--546, 2003. Google ScholarDigital Library
- D. Kempe, J. Kleinberg, and E. Tardos. Maximizing the spread of influence through a social network. In Proc. 9th ACM SIGKDD Intl. Conf. on Knowledge Discovery and Data Mining, 2003. Google ScholarDigital Library
- J. Kleinberg and E. Tardos. Algorithm Design. Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA, 2005. Google ScholarDigital Library
- M. Kretzschmar and M. Morris. Measures of concurrency in networks and the spread of infectious disease. Math. Biosci., 133:165--195, 1996.Google ScholarCross Ref
- R. Kumar, P. Raghavan, S. Rajagopalan, and A. Tomkins. Trawling the web for emerging cyber-communities. In Proc. 8th Intl. World Wide Web Conf., May 1999. Google ScholarDigital Library
- M. Magdon-Ismail, M. Goldberg, W. Wallace, and D. Siebecker. Locating hidden groups in communication networks using hidden markov models. In Proc. Intl. Conf. on Intelligence and Security Informatics (ISI 2003), Tucson, AZ, 2003. Google ScholarDigital Library
- B. Malin. Data and collocation surveillance through location access patterns. In Proc. North American Association for Computational Social and Organizational Science Conf., Pittsburgh, PA, June 2004.Google Scholar
- L. A. Meyers, M. Newman, and B. Pourbohloul. Predicting epidemics on directed contact networks. Journal of Theoretical Biology, 240:400--418, 2006.Google ScholarCross Ref
- M. Newman, A.-L. Barabasi, and D. J. Watts, editors. The Structure and Dynamics of Networks. Princeton University Press, 2006. Google ScholarDigital Library
- J. M. Read and M. J. Keeling. Disease evolution on networks: the role of contact structure. Proc. R. Soc. Lond. B, 270:699--708, 2003.Google ScholarCross Ref
- E. M. Rogers. Diffusion of Innovations. Simon&Shuster, Inc., 5th edition, 2003.Google Scholar
- D. I. Rubenstein, S. Sundaresan, I. Fischhoff, and D. Saltz. Social networks in wild asses: Comparing patterns and processes among populations. In A. Stubbe, P. Kaczensky, R. Samjaa, K. Wesche, and M. Stubbe, editors, Exploration into the Biological Resources of Mongolia, volume 10. Martin-Luther-University Halle-Wittenberg, 2007. In press.Google Scholar
- J. Scott, R. Gass, J. Crowcroft, P. Hui, C. Diot, and A. Chaintreau. CRAWDAD trace cambridge/ haggle/ imote/ infocom (v. 2006-01-31). Downloaded from http://crawdad.cs.dartmouth.edu/cambridge/haggle/imote/infocom.Google Scholar
- J. Sun, C. Faloutsos, S. Papadimitriou, and P. S. Yu. Graphscope: parameter-free mining of large time-evolving graphs. In Proc. 13th ACM SIGKDD Intl. Conf. on Knowledge Discovery and Data Mining, pages 687--696, New York, NY, USA, 2007. ACM. Google ScholarDigital Library
- S. R. Sundaresan, I. R. Fischhoff, J. Dushoff, and D. I. Rubenstein. Network metrics reveal differences in social organization between two fission-fusion species, Grevy's zebra and onager. Oecologia, 2006.Google Scholar
- C. Tantipathananandh, T. Berger-Wolf, and D. Kempe. A framework for community identification in dynamic social networks. In ope: parameter-free mining of large time-evolving graphs. In Proc. 13th ACM SIGKDD Intl. Conf. on Knowledge Discovery and Data Mining, pages 717--726, New York, NY, USA, 2007. ACM. Google ScholarDigital Library
- S. Wasserman and F. K. Social Network Analysis. Cambridge University Press, Cambridge, MA, 1994.Google ScholarCross Ref
Index Terms
- Constant-factor approximation algorithms for identifying dynamic communities
Recommendations
A framework for community identification in dynamic social networks
KDD '07: Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data miningWe propose frameworks and algorithms for identifying communities in social networks that change over time. Communities are intuitively characterized as "unusually densely knit" subsets of a social network. This notion becomes more problematic if the ...
A Constant Factor Approximation Algorithm for the Storage Allocation Problem
We study the storage allocation problem (SAP) which is a variant of the unsplittable flow problem on paths (UFPP). A SAP instance consists of a path $$P = (V,E)$$P=(V,E) and a set J of tasks. Each edge $$e \in E$$e E has a capacity $$c_e$$ce and each ...
A Constant Factor Approximation Algorithm for Fault-Tolerant k-Median
In this article, we consider the fault-tolerant k-median problem and give the first constant factor approximation algorithm for it. In the fault-tolerant generalization of the classical k-median problem, each client j needs to be assigned to at least rj ...
Comments