DOI: 10.1145/2631775.2631797
research-article

Exploiting the wisdom of the crowds for characterizing and connecting heterogeneous resources

Published: 01 September 2014

Abstract

Heterogeneous content is an inherent problem for cross-system search, recommendation and personalization. In this paper we investigate differences in topic coverage and the impact of topics in different kinds of Web services. We use entity extraction and categorization to create fingerprints that allow for meaningful comparison. As a base taxonomy, we use the 23 main categories of the Wikipedia Category Graph, which has been assembled over the years by the wisdom of the crowds. Following a proof of concept of our approach, we analyze differences in topic coverage and topic impact. The results show many differences between Web services such as Twitter, Flickr and Delicious, which reflect users' behavior and the usage of each system. The paper concludes with a user study that demonstrates the benefits of fingerprints over traditional textual methods for recommendations of heterogeneous resources.
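
The fingerprint idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not list the 23 categories, say how category weights are computed, or name the similarity measure. The sketch therefore assumes a hypothetical five-category subset, weights equal to normalized category counts, and cosine similarity for comparison. The names `CATEGORIES`, `fingerprint` and `cosine` are illustrative only.

```python
import math
from collections import Counter

# Hypothetical subset of top-level categories (assumption); the paper uses
# the 23 main Wikipedia categories, which the abstract does not enumerate.
CATEGORIES = ["Arts", "Politics", "Science", "Sports", "Technology"]


def fingerprint(entity_categories):
    """Build a normalized category-weight vector for one resource.

    entity_categories: one list of category labels per extracted entity.
    The weighting (plain relative frequency) is an assumption.
    """
    counts = Counter(
        cat for cats in entity_categories for cat in cats if cat in CATEGORIES
    )
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(CATEGORIES)
    return [counts[cat] / total for cat in CATEGORIES]


def cosine(a, b):
    """Cosine similarity between two fingerprints (assumed comparison measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Made-up resources from three kinds of service, described only by the
# categories of the entities found in them.
tweet = fingerprint([["Politics"], ["Politics", "Technology"]])
photo = fingerprint([["Arts"], ["Arts", "Technology"]])
bookmark = fingerprint([["Technology"], ["Technology", "Science"]])

print(cosine(tweet, photo))
print(cosine(tweet, bookmark))
```

Because every resource is reduced to a vector over the same categories, a tweet, a photo and a bookmark become directly comparable even when they share no words, which is the property the abstract contrasts with traditional textual methods.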

    Published In

    HT '14: Proceedings of the 25th ACM conference on Hypertext and social media
    September 2014
    346 pages
    ISBN:9781450329545
    DOI:10.1145/2631775
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. classification
    2. comparison
    3. domain independent
    4. fingerprints
    5. twikime
    6. wikipedia

    Qualifiers

    • Research-article

    Conference

    HT '14

    Acceptance Rates

    HT '14 Paper Acceptance Rate 49 of 86 submissions, 57%;
    Overall Acceptance Rate 378 of 1,158 submissions, 33%

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)5
    • Downloads (Last 6 weeks)0
    Reflects downloads up to 19 Feb 2025

    Cited By

    • (2016) Testing the stability of “wisdom of crowds” judgments of search results over time and their similarity with the search engine rankings. Aslib Journal of Information Management 68:4 (407-427). DOI: 10.1108/AJIM-10-2015-0165. Online publication date: 18-Jul-2016
    • (2016) Automatic Creation and Analysis of a Linked Data Cloud Diagram. Web Information Systems Engineering – WISE 2016 (417-432). DOI: 10.1007/978-3-319-48740-3_31. Online publication date: 2-Nov-2016
    • (2014) Identifying topic-related hyperlinks on twitter. Proceedings of the 2014 International Conference on Posters & Demonstrations Track - Volume 1272 (369-372). DOI: 10.5555/2878453.2878546. Online publication date: 21-Oct-2014
    • (2014) The ARCOMEM Architecture for Social- and Semantic-Driven Web Archiving. Future Internet 6:4 (688-716). DOI: 10.3390/fi6040688. Online publication date: 4-Nov-2014
