DOI: 10.1145/2908131.2908172
short-paper
Open access

A manifesto for data sharing in social media research

Published: 22 May 2016

Abstract

More and more researchers want to share research data collected from social media to allow for reproducibility and comparability of results. With this paper we want to encourage them to pursue this aim -- despite initial obstacles that they may face. Sharing can occur in various, more or less formal ways. We provide background information that allows researchers to make a decision about whether, how and where to share depending on their specific situation (data, platform, targeted user group, research topic etc.). Ethical, legal and methodological considerations are important for making this decision. Based on these three dimensions we develop a framework for social media sharing that can act as a first set of guidelines to help social media researchers make practical decisions for their own projects. In the long run, different stakeholders should join forces to enable better practices for data sharing for social media researchers. This paper is intended as our call to action for the broader research community to advance current practices of data sharing in the future.

Published In

WebSci '16: Proceedings of the 8th ACM Conference on Web Science
May 2016
392 pages
ISBN:9781450342087
DOI:10.1145/2908131
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. archiving
  2. data archives
  3. data protection
  4. data sharing
  5. legal issues
  6. methodology
  7. privacy
  8. reproducibility
  9. social media

Qualifiers

  • Short-paper

Conference

WebSci '16: ACM Web Science Conference
May 22 - 25, 2016
Hannover, Germany

Acceptance Rates

WebSci '16 Paper Acceptance Rate 13 of 70 submissions, 19%;
Overall Acceptance Rate 245 of 933 submissions, 26%
