The implications of Wikipedia for contemporary science education: using social network analysis techniques for automatic organisation of knowledge

Authors:
Carlos G. Figuerola, University of Salamanca, Salamanca, Spain
Tamar Groves, University of Extremadura, Cáceres, Spain
Miguel Angel Quintanilla, University of Salamanca, Salamanca, Spain

TEEM '15: Proceedings of the 3rd International Conference on Technological Ecosystems for Enhancing Multiculturality, October 2015, Pages 403–410
https://doi.org/10.1145/2808580.2808641

Published: 7 October 2015

ABSTRACT

Wikipedia is an open-content resource built by a community of users and widely used in educational contexts by both students and teachers. Because Wikipedia articles are connected by hyperlinks, Wikipedia can be represented as a network in which the nodes are articles and the edges are hyperlinks. In this paper we analyze a complete copy of the Spanish Wikipedia. We apply social network analysis techniques, and more precisely community detection techniques, to identify clusters of articles with similar content. As the number of clusters is relatively small, we use manual analysis to detect science articles. In addition, we identify the most representative scientific fields and their main features. We conclude that science articles make up about 11.66% of Spanish Wikipedia articles and that the most important clusters of scientific articles do not always coincide with the classical scientific disciplines. This kind of analysis contributes to a better understanding of Wikipedia as an educational tool.
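
The network representation and clustering step described in the abstract can be illustrated with a small sketch. The abstract does not say which community detection algorithm the authors used, so the code below substitutes a simple deterministic label propagation as a stand-in. The article names and links are invented toy data, and all function names are hypothetical.

```python
from collections import Counter, defaultdict

# Toy hyperlinks (source article -> target article); invented for illustration.
LINKS = [
    ("atom", "electron"), ("atom", "proton"), ("atom", "neutron"),
    ("electron", "proton"), ("electron", "neutron"), ("proton", "neutron"),
    ("cell", "dna"), ("cell", "gene"), ("cell", "protein"),
    ("dna", "gene"), ("dna", "protein"), ("gene", "protein"),
    ("protein", "atom"),  # a single link bridging the two topic areas
]

def build_undirected(links):
    """Turn (source, target) hyperlinks into an undirected adjacency map."""
    adj = defaultdict(set)
    for src, dst in links:
        if src != dst:  # ignore self-links
            adj[src].add(dst)
            adj[dst].add(src)
    return adj

def label_propagation(adj, max_rounds=100):
    """Assumed stand-in algorithm: each node repeatedly adopts the label
    most common among its neighbours. Nodes are visited in sorted order;
    ties keep the current label if it is among the winners, otherwise
    take the alphabetically smallest one, so the result is deterministic."""
    labels = {node: node for node in adj}
    for _ in range(max_rounds):
        changed = False
        for node in sorted(adj):
            counts = Counter(labels[n] for n in adj[node])
            best = max(counts.values())
            top = sorted(lab for lab, c in counts.items() if c == best)
            new = labels[node] if labels[node] in top else top[0]
            if new != labels[node]:
                labels[node] = new
                changed = True
        if not changed:  # stop once no label changes in a full pass
            break
    return labels

def communities(labels):
    """Group nodes by final label and return sorted lists of article names."""
    groups = defaultdict(list)
    for node, label in labels.items():
        groups[label].append(node)
    return sorted(sorted(group) for group in groups.values())

result = communities(label_propagation(build_undirected(LINKS)))
print(result)
```

On a full Wikipedia link graph, an approach of this kind would yield far fewer clusters than articles, which is what makes the manual labelling of clusters as science or non-science mentioned in the abstract feasible.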

Index Terms

1. Applied computing
  1. Education

Published in
TEEM '15: Proceedings of the 3rd International Conference on Technological Ecosystems for Enhancing Multiculturality
October 2015
674 pages
ISBN: 9781450334426
DOI: 10.1145/2808580
Conference Chair: Gustavo R. Alves, Polytechnic of Porto, Portugal
Copyright © 2015 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
SNA techniques
community detection
science education
social networks
wikipedia
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 496 of 705 submissions, 70%
