DOI: 10.1145/2808580.2808641
research-article

The implications of Wikipedia for contemporary science education: using social network analysis techniques for automatic organisation of knowledge

Published: 07 October 2015

ABSTRACT

Wikipedia is an open content resource built by a community of users and widely used in educational contexts by both students and teachers. Wikipedia articles are connected by hyperlinks, so Wikipedia can be represented as a network in which the nodes are articles and the edges are hyperlinks. In this paper we analyze a complete copy of the Spanish Wikipedia. We apply social network analysis techniques, and more precisely community detection techniques, to identify clusters of articles with similar content. Because the number of clusters is relatively small, we use manual analysis to detect science articles. In addition, we identify the most representative scientific fields and their main features. We conclude that science articles make up about 11.66% of Spanish Wikipedia articles and that the most important clusters of scientific articles do not always coincide with the classical science disciplines. This kind of analysis contributes to a better understanding of Wikipedia as an educational tool.
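
The pipeline described in the abstract (articles as nodes, hyperlinks as edges, community detection, then manual labelling of clusters) can be illustrated with a minimal sketch. The abstract does not name the community detection algorithm, so the code below swaps in a simple, deterministic label propagation as a stand-in. The article titles, the link list, the function names `build_graph` and `label_propagation`, and the choice of which cluster counts as "science" are all hypothetical and only serve the illustration.

```python
from collections import Counter, defaultdict


def build_graph(links):
    """Turn (source, target) hyperlink pairs into an undirected adjacency map."""
    adj = defaultdict(set)
    for src, dst in links:
        if src != dst:  # ignore self-links
            adj[src].add(dst)
            adj[dst].add(src)
    return adj


def label_propagation(adj, max_rounds=100):
    """Stand-in community detection: each node repeatedly adopts the label
    most common among its neighbours (ties keep the current label if it is
    among the best, otherwise the alphabetically first label is taken)."""
    labels = {node: node for node in adj}
    for _ in range(max_rounds):
        changed = False
        for node in sorted(adj):  # fixed order keeps the run deterministic
            counts = Counter(labels[n] for n in adj[node])
            best = max(counts.values())
            candidates = sorted(l for l, c in counts.items() if c == best)
            new = labels[node] if labels[node] in candidates else candidates[0]
            if new != labels[node]:
                labels[node] = new
                changed = True
        if not changed:
            break
    clusters = defaultdict(set)
    for node, label in labels.items():
        clusters[label].add(node)
    return sorted(sorted(c) for c in clusters.values())


# Hypothetical toy link list: two dense groups joined by a single hyperlink.
links = [
    ("Atom", "Electron"), ("Atom", "Proton"), ("Electron", "Proton"),
    ("Goya", "Picasso"), ("Goya", "Velazquez"), ("Picasso", "Velazquez"),
    ("Proton", "Goya"),
]
clusters = label_propagation(build_graph(links))

# Manual step (assumption): a person inspects the clusters and tags the
# first one as science, then the share of science articles is computed.
science_share = len(clusters[0]) / sum(len(c) for c in clusters)
print(clusters, science_share)
```

On a real Wikipedia dump the graph would have far more nodes and a dedicated community detection method would be needed; the sketch only shows how a small number of clusters makes the manual tagging step and the resulting percentage feasible.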

    • Published in

      ACM Other conferences
      TEEM '15: Proceedings of the 3rd International Conference on Technological Ecosystems for Enhancing Multiculturality
      October 2015
      674 pages
      ISBN: 9781450334426
      DOI: 10.1145/2808580

      Copyright © 2015 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      Overall Acceptance Rate: 496 of 705 submissions, 70%