ABSTRACT
Citations play a crucial role in scientific discourse, information retrieval, and bibliometrics. Many initiatives currently promote free and open citation data. The creation of citation data, however, is currently not part of the cataloging workflow in libraries.
In this paper, we present our project Linked Open Citation Database, in which we design distributed processes and a system infrastructure based on linked data technology. The goal is to show that libraries can catalog citations efficiently using a semi-automatic approach. We describe the current state of the workflow and its implementation. We show that we significantly improved the automatic reference extraction that is crucial for the subsequent data curation. We also give insights into the curation and linking process and provide evaluation results that not only direct the further development of the project but also allow us to discuss its overall feasibility.
Index Terms
- Linked Open Citation Database: Enabling Libraries to Contribute to an Open and Interconnected Citation Graph