DOI: 10.1145/2487575.2487657
research-article

STRIP: stream learning of influence probabilities

Published: 11 August 2013

ABSTRACT

Influence-driven diffusion of information is a fundamental process in social networks. Learning the latent variables of such a process, i.e., the influence strength along each link, is a central question towards understanding the structure and function of complex networks, modeling information cascades, and developing applications such as viral marketing.

Motivated by modern microblogging platforms such as Twitter, in this paper we study the problem of learning influence probabilities in a data-stream scenario, in which the network topology is relatively stable and the challenge for a learning algorithm is to keep up with a continuous stream of tweets using a small amount of time and memory. Our contribution is a number of randomized approximation algorithms, categorized according to the available space (superlinear, linear, and sublinear in the number of nodes n) and according to different models (landmark and sliding window). Among several results, we show that we can learn influence probabilities with one pass over the data, using O(n log n) space, in both the landmark model and the sliding-window model, and we further show that our algorithm is within a logarithmic factor of optimal.
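To make the one-pass, small-space setting concrete, the following is a minimal sketch and not the paper's algorithm. It assumes, purely for illustration, that the influence strength between two users is approximated by the Jaccard overlap of the sets of items each has acted on, and it ignores the temporal order of actions. The names `MinHashSketch` and `process_stream` are hypothetical. Each user keeps a fixed-size min-wise signature that is updated as actions arrive, so the stream is read only once and memory grows with the number of users times the signature length.

```python
import hashlib


class MinHashSketch:
    """Fixed-size min-wise signature of a set, updated one item at a time."""

    def __init__(self, num_hashes=128):
        self.num_hashes = num_hashes
        self.mins = [None] * num_hashes  # one running minimum per hash function

    @staticmethod
    def _hash(seed, item):
        # Derive a family of 64-bit hash functions by salting with the seed.
        data = f"{seed}:{item}".encode("utf-8")
        return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")

    def add(self, item):
        for i in range(self.num_hashes):
            h = self._hash(i, item)
            if self.mins[i] is None or h < self.mins[i]:
                self.mins[i] = h

    def jaccard(self, other):
        # The fraction of agreeing minima estimates |A ∩ B| / |A ∪ B|.
        matches = sum(
            1 for a, b in zip(self.mins, other.mins) if a is not None and a == b
        )
        return matches / self.num_hashes


def process_stream(stream, num_hashes=128):
    """Single pass over (user, item) actions; returns one sketch per user."""
    sketches = {}
    for user, item in stream:
        sketches.setdefault(user, MinHashSketch(num_hashes)).add(item)
    return sketches
```

After the pass, `sketches[u].jaccard(sketches[v])` gives an approximate overlap for any pair of users; a larger `num_hashes` lowers the variance of the estimate at the cost of more memory per user.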

For truly large graphs, when one needs to operate with sublinear space, we show that we can still learn influence probabilities in one pass, assuming that we restrict our attention to the most active users.
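As an illustration of restricting attention to the most active users in small space, the sketch below uses the classic Misra-Gries frequent-items summary. This is a swapped-in, well-known technique chosen for illustration; the source does not state which method the paper uses, and the function name `misra_gries` and its parameters are hypothetical. With at most `k - 1` counters, any user responsible for more than a `1/k` fraction of the actions in the stream is guaranteed to remain in the summary.

```python
def misra_gries(stream, k):
    """One-pass frequent-items summary using at most k - 1 counters.

    Any user appearing more than len(stream) / k times is guaranteed to be
    among the returned keys; each returned count underestimates the true
    count by at most len(stream) / k.
    """
    counters = {}
    for user in stream:
        if user in counters:
            counters[user] += 1
        elif len(counters) < k - 1:
            counters[user] = 1
        else:
            # No free counter: decrement every counter and drop the zeros.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters
```

For example, `misra_gries(actions, k=1000)` keeps at most 999 counters regardless of how many distinct users appear in `actions`. A second pass, or a per-user sketch kept only for the surviving users, could then be limited to that small set.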

Our thorough experimental evaluation on a large social graph demonstrates that the empirical performance of our algorithms agrees with that predicted by the theory.


    Published in

      KDD '13: Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining
      August 2013
      1534 pages
      ISBN:9781450321747
      DOI:10.1145/2487575

      Copyright © 2013 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      KDD '13 Paper Acceptance Rate: 125 of 726 submissions, 17%
      Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%
