DOI: 10.1145/2851613.2851858
research-article

Who likes me more?: analysing entity-centric language-specific bias in multilingual Wikipedia

Published: 04 April 2016

Abstract

In this paper we take an important step towards better understanding the existence and extent of entity-centric language-specific bias in multilingual Wikipedia, and any deviation from its targeted neutral point of view. We propose a methodology using sentiment analysis techniques to systematically extract the variations in sentiments associated with real-world entities in different language editions of Wikipedia, illustrated with a case study of five Wikipedia language editions and a set of target entities from four categories.


      Published In

      SAC '16: Proceedings of the 31st Annual ACM Symposium on Applied Computing
      April 2016
      2360 pages
      ISBN:9781450337397
      DOI:10.1145/2851613
      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. linguistic point of view
      2. multilingual Wikipedia
      3. sentiment analysis

      Conference

      SAC 2016
      SAC 2016: Symposium on Applied Computing
      April 4 - 8, 2016
      Pisa, Italy

      Acceptance Rates

      SAC '16 Paper Acceptance Rate 252 of 1,047 submissions, 24%;
      Overall Acceptance Rate 1,650 of 6,669 submissions, 25%

      Cited By

      • (2024) Point of view and narrative in Wikipedia translation. Translation Spaces 13:2 (330-353). DOI: 10.1075/ts.23045.shu. Online publication date: 21-Nov-2024
      • (2021) Multilingual Sentiment Analysis: A Systematic Literature Review. Pertanika Journal of Science and Technology 29:1. DOI: 10.47836/pjst.29.1.25. Online publication date: 2021
      • (2021) A Polyvocal and Contextualised Semantic Web. The Semantic Web (506-512). DOI: 10.1007/978-3-030-77385-4_30. Online publication date: 6-Jun-2021
      • (2018) KEYSTONE WG2: Activities and Results Overview on Keyword Search. Semantic Keyword-Based Search on Structured Data Sources (215-223). DOI: 10.1007/978-3-319-74497-1_21. Online publication date: 8-Feb-2018
      • (2017) Bias in Wikipedia. Proceedings of the 26th International Conference on World Wide Web Companion (717-721). DOI: 10.1145/3041021.3053375. Online publication date: 3-Apr-2017
      • (2017) What’s New? Analysing Language-Specific Wikipedia Entity Contexts to Support Entity-Centric News Retrieval. Transactions on Computational Collective Intelligence XXVI (210-231). DOI: 10.1007/978-3-319-59268-8_10. Online publication date: 15-Jun-2017
      • (2016) Towards detection of influential sentences affecting reputation in wikipedia. Proceedings of the 8th ACM Conference on Web Science (244-248). DOI: 10.1145/2908131.2908177. Online publication date: 22-May-2016
