ABSTRACT
Influence-driven diffusion of information is a fundamental process in social networks. Learning the latent variables of such process, i.e., the influence strength along each link, is a central question towards understanding the structure and function of complex networks, modeling information cascades, and developing applications such as viral marketing.
Motivated by modern microblogging platforms, such as twitter, in this paper we study the problem of learning influence probabilities in a data-stream scenario, in which the network topology is relatively stable and the challenge of a learning algorithm is to keep up with a continuous stream of tweets using a small amount of time and memory. Our contribution is a number of randomized approximation algorithms, categorized according to the available space (superlinear, linear, and sublinear in the number of nodes n) and according to different models (landmark and sliding window). Among several results, we show that we can learn influence probabilities with one pass over the data, using O(nlog n) space, in both the landmark model and the sliding-window model, and we further show that our algorithm is within a logarithmic factor of optimal.
For truly large graphs, when one needs to operate with sublinear space, we show that we can still learn influence probabilities in one pass, assuming that we restrict our attention to the most active users.
Our thorough experimental evaluation on large social graph demonstrates that the empirical performance of our algorithms agrees with that predicted by the theory.
- Z. Bar-Yossef, T. S. Jayram, R. Kumar, D. Sivakumar, and L. Trevisan. Counting distinct elements in a data stream. In RANDOM'02. Google ScholarDigital Library
- M. Blum, R. W. Floyd, V. R. Pratt, R. L. Rivest, and R. E. Tarjan. Time bounds for selection. J. Comput. Syst. Sci., 7(4):448--461, 1973. Google ScholarDigital Library
- C. Brase and C. Brase. Understandable Statistics: Concepts and Methods. Brooks/Cole, 2011.Google Scholar
- A. Z. Broder, M. Charikar, A. M. Frieze, and M. Mitzenmacher. Min-wise independent permutations. J. Comput. Syst. Sci., 60(3):630--659, 2000. Google ScholarDigital Library
- L. Carter and M. N. Wegman. Universal classes of hash functions. J. Comput. Syst. Sci., 18(2):143--154, 1979.Google ScholarCross Ref
- M. Charikar, K. Chen, and M. Farach-Colton. Finding frequent items in data streams. Theor. Comput. Sci., 312(1):3--15, 2004. Google ScholarDigital Library
- W. Chen, C. Wang, and Y. Wang. Scalable influence maximization for prevalent viral marketing in large-scale social networks. In KDD'10. Google ScholarDigital Library
- E. Cohen, M. Datar, S. Fujiwara, A. Gionis, P. Indyk, R. Motwani, J. D. Ullman, and C. Yang. Finding interesting associations without support pruning. IEEE Trans. Knowl. Data Eng., 13(1):64--78, 2001. Google ScholarDigital Library
- M. Datar, A. Gionis, P. Indyk, and R. Motwani. Maintaining stream statistics over sliding windows. SIAM J. Comput., 31(6):1794--1813, 2002. Google ScholarDigital Library
- P. Domingos and M. Richardson. Mining the network value of customers. In KDD'01. Google ScholarDigital Library
- J. Feigenbaum, S. Kannan, A. McGregor, and J. Zhang. On graph problems in a semi-streaming model. In ICALP, 2004.Google ScholarCross Ref
- G. Feigenblat, E. Porat, and A. Shiftan. Exponential time improvement for min-wise based algorithms. Inf. Comput., 209(4):737--747, 2011. Google ScholarDigital Library
- A. Goyal, F. Bonchi, and L. V. S. Lakshmanan. Learning influence probabilities in social networks. In WSDM'10. Google ScholarDigital Library
- A. Goyal, F. Bonchi, and L. V. S. Lakshmanan. A data-based approach to social influence maximization. PVLDB, 5(1):73--84, 2011. Google ScholarDigital Library
- B. Kalyanasundaram and G. Schnitger. The probabilistic communication complexity of set intersection. SIAM J. Discrete Math., 5(4):545--557, 1992. Google ScholarDigital Library
- D. Kempe, J. M. Kleinberg, and -- E. Tardos. Maximizing the spread of influence through a social network. In KDD'03. Google ScholarDigital Library
- J. Leskovec, J. Kleinberg, and C. Faloutsos. Graphs over time: densification laws, shrinking diameters and possible explanations. In KDD'05. Google ScholarDigital Library
- J. Leskovec, A. Krause, C. Guestrin, C. Faloutsos, J. VanBriesen, and N. S. Glance. Cost-effective outbreak detection in networks. In KDD'07. Google ScholarDigital Library
- R. Motwani and P. Raghavan. Randomized Algorithms. CRC Press, 1997. Google ScholarDigital Library
- M. Pǎtraşcu and M. Thorup. The power of simple tabulation hashing. J. ACM, 59(3):14, 2012. Google ScholarDigital Library
- K. Saito, R. Nakano, and M. Kimura. Prediction of information diffuusion probabilities for independent cascade model. In KES'08. Google ScholarDigital Library
Index Terms
STRIP: stream learning of influence probabilities
Recommendations
Using mixed-mode networks to disentangle multiple sources of social influence
SBP'12: Proceedings of the 5th international conference on Social Computing, Behavioral-Cultural Modeling and PredictionSocial network analysis has been widely used to model various forms of network influence on individual behavior. In network studies into peer influence on adolescent risk-taking health behavior, either friendship influence (using one-mode networks) or ...
Measuring Time-Constrained Influence to Predict Adoption in Online Social Networks
Recently, there has been strong interest in measuring influence in online social networks. Different measures have been proposed to predict when individuals will adopt a new behavior, given the influence produced by their friends. In this article, we ...
Homophily and social influence among online casual game players
We examine homophily and social influence processes among online game players.Players tend not to initiate ties, unless there are other desirable properties.Players tend to have reciprocated ties.No strong homophily and social influence processes were ...
Comments