DOI: 10.1145/2556195.2556229
research-article

Scalable topic-specific influence analysis on microblogs

Published: 24 February 2014

ABSTRACT

Social influence analysis on microblog networks, such as Twitter, plays a crucial role in online advertising and brand management. Most previous influence analysis schemes rely only on the links between users to find key influencers and omit the important text content created by the users. As a result, they cannot differentiate social influence across different aspects of life (topics). Although a few prior works do support topic-specific influence analysis, they either separate the analysis of content from the analysis of network structure, or assume that content is the only cause of links, which is clearly an inappropriate assumption for microblog networks.

To address the limitations of the previous approaches, we propose a novel Followship-LDA (FLDA) model, which integrates both content topic discovery and social influence analysis in the same generative process. This model properly captures the content-related and content-independent reasons why a user follows another in a microblog network. We demonstrate that FLDA produces results with significantly better precision than existing approaches. Furthermore, we propose a distributed Gibbs sampling algorithm for FLDA, and demonstrate that it provides excellent scalability on large clusters. Finally, we incorporate the FLDA model in a general search framework for topic-specific influencers. A user freely expresses an interest by typing a few keywords, and the search framework returns a ranked list of key influencers that satisfy that interest.
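
The abstract does not spell out the generative process, but one plausible reading of "content-related and content-independent reasons" for following is a two-path mixture: each follow link is either drawn through a topic the user cares about, or drawn from a topic-free popularity distribution. The sketch below illustrates that reading only. The names `theta`, `mu`, `sigma`, and `pi`, and the Bernoulli switch between the two paths, are assumptions made for illustration and are not taken from the abstract.

```python
import random


def sample_followees(theta, mu, sigma, pi, n_links, rng):
    """Draw follow links for one user under an assumed two-path process.

    theta: the user's topic distribution (list of probabilities)
    mu:    assumed probability that a link is content-related
    sigma: sigma[z][e] = assumed probability of following e given topic z
    pi:    pi[e] = assumed content-independent probability of following e
    Returns a list of (followee, topic) pairs; topic is None for
    content-independent links.
    """
    followees = list(range(len(pi)))
    topics = list(range(len(theta)))
    links = []
    for _ in range(n_links):
        if rng.random() < mu:
            # Content-related path: pick a topic, then a followee for it.
            z = rng.choices(topics, weights=theta)[0]
            e = rng.choices(followees, weights=sigma[z])[0]
            links.append((e, z))
        else:
            # Content-independent path: pick a followee by global popularity.
            e = rng.choices(followees, weights=pi)[0]
            links.append((e, None))
    return links


# Toy usage: 2 topics, 3 candidate followees (all numbers are made up).
rng = random.Random(0)
toy_links = sample_followees(
    theta=[0.7, 0.3],
    mu=0.8,
    sigma=[[0.6, 0.3, 0.1], [0.1, 0.2, 0.7]],
    pi=[0.2, 0.2, 0.6],
    n_links=5,
    rng=rng,
)
```

Under this reading, the per-topic rows of `sigma` would act as topic-specific influence scores, while `pi` absorbs follows that have nothing to do with content, such as following a celebrity regardless of topic.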


    Published in

      WSDM '14: Proceedings of the 7th ACM international conference on Web search and data mining
      February 2014
      712 pages
      ISBN: 9781450323512
      DOI: 10.1145/2556195

      Copyright © 2014 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Acceptance Rates

      WSDM '14 Paper Acceptance Rate: 64 of 355 submissions, 18%. Overall Acceptance Rate: 498 of 2,863 submissions, 17%.
