research-article

Scalable topic-specific influence analysis on microblogs

Authors:
Bin Bi (University of California, Los Angeles, Los Angeles, CA, USA)
Yuanyuan Tian (IBM Almaden Research Center, San Jose, CA, USA)
Yannis Sismanis (Google, Mountain View, CA, USA)
Andrey Balmin (GraphSQL, Mountain View, CA, USA)
Junghoo Cho (University of California, Los Angeles, Los Angeles, CA, USA)

WSDM '14: Proceedings of the 7th ACM international conference on Web search and data mining
February 2014
Pages 513–522
https://doi.org/10.1145/2556195.2556229

Published: 24 February 2014

ABSTRACT

Social influence analysis on microblog networks such as Twitter plays a crucial role in online advertising and brand management. Most previous influence analysis schemes rely only on the links between users to find key influencers and omit the text content those users create. As a result, they cannot differentiate social influence across different aspects of life (topics). Although a few prior works do support topic-specific influence analysis, they either separate the analysis of content from the analysis of network structure, or assume that content is the only cause of links, which is clearly an inappropriate assumption for microblog networks.

To address the limitations of the previous approaches, we propose a novel Followship-LDA (FLDA) model, which integrates content topic discovery and social influence analysis in the same generative process. The model captures both the content-related and the content-independent reasons why a user follows another in a microblog network. We demonstrate that FLDA produces results with significantly better precision than existing approaches. Furthermore, we propose a distributed Gibbs sampling algorithm for FLDA and demonstrate that it provides excellent scalability on large clusters. Finally, we incorporate the FLDA model into a general search framework for topic-specific influencers: a user expresses an interest by typing a few keywords, and the search framework returns a ranked list of key influencers that match that interest.
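
As a rough illustration of the two ideas in this paragraph, the sketch below shows one way a joint generative process for words and follow links, and a topic-weighted influencer ranking, might look. The abstract does not give the model's details, so everything here is an assumption made for illustration: the parameter names (`theta`, `phi`, `mu`, `sigma`, `pi`), the per-topic switch between content-related and content-independent follows, and the functions `generate_user` and `rank_influencers` are all hypothetical. It is a toy forward sampler, not the FLDA specification and not the distributed Gibbs sampler.

```python
import random


def sample_index(rng, weights):
    """Draw an index with probability proportional to its weight."""
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]


def generate_user(rng, theta, phi, mu, sigma, pi, n_words, n_links):
    """Toy generative process for one user (assumed, not the paper's spec).

    theta    : the user's distribution over topics
    phi[t]   : word distribution of topic t
    mu[t]    : assumed probability that a follow drawn under topic t
               is content-related
    sigma[t] : assumed topic-specific influence distribution over users
    pi       : assumed content-independent popularity distribution over users
    """
    words = []
    for _ in range(n_words):
        topic = sample_index(rng, theta)           # topic for this word
        words.append(sample_index(rng, phi[topic]))

    links = []
    for _ in range(n_links):
        topic = sample_index(rng, theta)           # topic for this follow
        if rng.random() < mu[topic]:
            # Content-related follow: pick a followee influential on the topic.
            links.append((sample_index(rng, sigma[topic]), topic))
        else:
            # Content-independent follow: pick by overall popularity.
            links.append((sample_index(rng, pi), None))
    return words, links


def rank_influencers(query_topic_weights, sigma):
    """Rank users by influence mixed over the topics a query maps to.

    How keywords are mapped to topic weights is left out; the weights are
    simply taken as input here.
    """
    n_users = len(sigma[0])
    scores = [
        sum(w * sigma[t][u] for t, w in enumerate(query_topic_weights))
        for u in range(n_users)
    ]
    return sorted(range(n_users), key=lambda u: scores[u], reverse=True)


if __name__ == "__main__":
    rng = random.Random(0)
    theta = [0.7, 0.3]                              # 2 topics
    phi = [[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]]  # 4 words
    mu = [0.9, 0.5]
    sigma = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]]      # 3 candidate followees
    pi = [0.2, 0.6, 0.2]
    print(generate_user(rng, theta, phi, mu, sigma, pi, n_words=5, n_links=4))
    print(rank_influencers([1.0, 0.0], sigma))      # query purely about topic 0
```

In this sketch a link carries the topic that motivated it only when the follow is content-related, which is one simple way to keep the two reasons for following apart; a real inference procedure would have to estimate these hidden assignments from observed tweets and links.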

Index Terms

Scalable topic-specific influence analysis on microblogs
1. Information systems
  1. Information systems applications

Published in
WSDM '14: Proceedings of the 7th ACM international conference on Web search and data mining
February 2014
712 pages
ISBN: 9781450323512
DOI: 10.1145/2556195
General Chairs: Ben Carterette (University of Delaware, USA), Fernando Diaz (Microsoft Research, USA)
Program Chairs: Carlos Castillo (Qatar Computing Research Institute, Qatar), Donald Metzler (Google, USA)
Copyright © 2014 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
social influence analysis
Qualifiers
- research-article
Acceptance Rates
WSDM '14 Paper Acceptance Rate: 64 of 355 submissions, 18%
Overall Acceptance Rate: 498 of 2,863 submissions, 17%