research-article
DOI: 10.1145/2911451.2911467

Cobwebs from the Past and Present: Extracting Large Social Networks using Internet Archive Data

Published: 07 July 2016

Abstract

Social graph construction from various sources has long interested researchers because of its application potential and the broad range of technical challenges involved. The World Wide Web provides a huge amount of continuously updated data and information on a wide range of topics, created by a variety of content providers, which makes the study of extracted people networks and their temporal evolution valuable for social scientists and computer scientists alike. In this paper we present SocGraph, an extraction and exploration system for social relations drawn from the content of around 2 billion web pages collected by the Internet Archive over the 17-year period between 1996 and 2013. We describe methods for constructing large social graphs from the extracted relations and introduce an interface for studying their temporal evolution.



Published In

SIGIR '16: Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval
July 2016
1296 pages
ISBN: 9781450340694
DOI: 10.1145/2911451

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tag

  1. social graphs temporal evolution

Qualifiers

  • Research-article

Conference

SIGIR '16

Acceptance Rates

SIGIR '16 paper acceptance rate: 62 of 341 submissions (18%)
Overall acceptance rate: 792 of 3,983 submissions (20%)
