DOI: 10.1145/1810617.1810642
research-article

Community-based ranking of the social web

Published: 13 June 2010

Abstract

The rise of social interactions on the Web requires new methods of information organization and discovery. To that end, we propose a generative community-based probabilistic tagging model that can automatically uncover communities of users and their associated tags. We experimentally validate the quality of the discovered communities on the social bookmarking system Delicious. In comparison to an alternative generative model, Latent Dirichlet Allocation (LDA), we find that the proposed community-based model improves the empirical likelihood of held-out test data and discovers more coherent interest-based communities. Building on the community-based probabilistic tagging model, we develop a novel community-based ranking model for effective exploration of socially-tagged Web resources. We compare community-based ranking to three state-of-the-art retrieval models: (i) BM25; (ii) cluster-based retrieval using K-means clustering; and (iii) LDA-based retrieval. We find that the proposed ranking model significantly improves the quality of retrieved pages over these alternatives, by 7% to 22%.


Published In

HT '10: Proceedings of the 21st ACM conference on Hypertext and hypermedia
June 2010
328 pages
ISBN:9781450300414
DOI:10.1145/1810617
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. community
  2. ranking
  3. social
  4. tagging

Qualifiers

  • Research-article

Conference

HT '10
HT '10: 21st ACM Conference on Hypertext and Hypermedia
June 13 - 16, 2010
Toronto, Ontario, Canada

Acceptance Rates

Overall Acceptance Rate 378 of 1,158 submissions, 33%

Cited By

  • (2018) Discovery of Web user communities and their role in personalization. User Modeling and User-Adapted Interaction 22:1-2 (151-175). DOI: 10.1007/s11257-011-9111-y. Online publication date: 26-Dec-2018
  • (2014) Online abusive users analytics through visualization. Proceedings of the 23rd International Conference on World Wide Web (155-158). DOI: 10.1145/2567948.2577019. Online publication date: 7-Apr-2014
  • (2014) The social distributional hypothesis: a pragmatic proxy for homophily in online social networks. Social Network Analysis and Mining 4:1. DOI: 10.1007/s13278-014-0216-2. Online publication date: 22-Aug-2014
  • (2012) Early Detection of Policies Violations in a Social Media Site. Proceedings of the 2012 IEEE International Symposium on Policies for Distributed Systems and Networks (45-52). DOI: 10.1109/POLICY.2012.19. Online publication date: 16-Jul-2012
  • (2012) Mining Divergent Opinion Trust Networks through Latent Dirichlet Allocation. Proceedings of the 2012 International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2012) (879-886). DOI: 10.1109/ASONAM.2012.158. Online publication date: 26-Aug-2012
  • (2012) Topical community detection from mining user tagging behavior and interest. Journal of the American Society for Information Science and Technology 64:2 (321-333). DOI: 10.1002/asi.22740. Online publication date: 21-Dec-2012
  • (2010) Community assessment using evidence networks. Proceedings of the 2010 international conference on Analysis of social media and ubiquitous data (79-98). DOI: 10.5555/2035637.2035642. Online publication date: 13-Jun-2010
  • (2010) Community assessment using evidence networks. Proceedings of the 2010th International Conference on Analysis of Social Media and Ubiquitous Data (79-98). DOI: 10.1007/978-3-642-23599-3_5. Online publication date: 13-Jun-2010
