Research article
DOI: 10.1145/1772690.1772722

Inferring relevant social networks from interpersonal communication

Published: 26 April 2010

Abstract

Researchers increasingly use electronic communication data to construct and study large social networks, effectively inferring unobserved ties (e.g., i is connected to j) from observed communication events (e.g., i emails j). Often overlooked, however, is the impact of tie definition on the corresponding network, and in turn the relevance of the inferred network to the research question of interest. Here we study the problem of network inference and relevance for two email data sets of different size and origin. In each case, we generate a family of networks parameterized by a threshold condition on the frequency of emails exchanged between pairs of individuals. After demonstrating that different choices of the threshold correspond to dramatically different network structures, we formulate the relevance of these networks in terms of a series of prediction tasks that depend on various network features. In general, we find: a) that prediction accuracy is maximized over a non-trivial range of thresholds corresponding to 5 to 10 reciprocated emails per year; b) that for any prediction task, choosing the optimal value of the threshold yields a sizable (~30%) boost in accuracy over naive choices; and c) that the optimal threshold value appears to be (somewhat surprisingly) consistent across data sets and prediction tasks. We emphasize the practical utility of defining ties via their relevance to the prediction task(s) at hand and discuss implications of our empirical results.
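To make the idea of a threshold-parameterized family of networks concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes email events arrive as (sender, recipient) pairs, and it assumes that the "reciprocated emails per year" for a pair can be approximated by the smaller of the two directional counts divided by the observation period; the abstract does not specify the exact tie weight. The names `reciprocated_rates`, `threshold_network`, `tau`, and `years` are hypothetical.

```python
from collections import Counter


def reciprocated_rates(events, years=1.0):
    """Map each unordered pair {i, j} to a reciprocated email rate per year.

    Assumption (not from the paper): the reciprocated count for a pair is
    min(count i->j, count j->i); pairs with one-way traffic get no entry.
    """
    # Count directed emails, ignoring self-addressed messages.
    sent = Counter((s, r) for s, r in events if s != r)
    rates = {}
    for (s, r), n in sent.items():
        back = sent.get((r, s), 0)
        if back:
            rates[frozenset((s, r))] = min(n, back) / years
    return rates


def threshold_network(events, tau, years=1.0):
    """Return the set of ties whose reciprocated rate is at least tau."""
    rates = reciprocated_rates(events, years)
    return {pair for pair, rate in rates.items() if rate >= tau}


# Toy data: one year of emails among three hypothetical people.
events = (
    [("a", "b")] * 6 + [("b", "a")] * 5   # a and b correspond regularly
    + [("a", "c")] * 9 + [("c", "a")]     # c rarely replies to a
    + [("b", "c")] * 3                    # one-way traffic only
)

# One network per threshold value: the "family" of inferred networks.
family = {tau: threshold_network(events, tau) for tau in (1, 5, 10)}
```

In this toy data the inferred network shrinks as `tau` grows: two ties at `tau=1`, one at `tau=5`, and none at `tau=10`, which illustrates how the choice of threshold changes the resulting structure.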



    Published In

    WWW '10: Proceedings of the 19th international conference on World wide web
    April 2010
    1407 pages
    ISBN: 9781605587998
    DOI: 10.1145/1772690

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. communication networks
    2. email
    3. learning
    4. network structure
    5. network thresholds
    6. social network analysis
    7. social networks
    8. ties

    Qualifiers

    • Research-article

    Conference

    WWW '10
    WWW '10: The 19th International World Wide Web Conference
    April 26 - 30, 2010
    Raleigh, North Carolina, USA

    Acceptance Rates

    Overall Acceptance Rate 1,899 of 8,196 submissions, 23%


