DOI: 10.1145/3197026.3197050
research-article

Linked Open Citation Database: Enabling Libraries to Contribute to an Open and Interconnected Citation Graph

Published: 23 May 2018

ABSTRACT

Citations play a crucial role in scientific discourse, in information retrieval, and in bibliometrics. Many initiatives are currently promoting the idea of free and open citation data. The creation of citation data, however, is not currently part of the cataloging workflow in libraries.

In this paper, we present our project Linked Open Citation Database, in which we design distributed processes and a system infrastructure based on linked data technology. The goal is to show that libraries can catalog citations efficiently using a semi-automatic approach. We describe the current state of the workflow and its implementation, and we show that we significantly improved the automatic reference extraction that is crucial for the subsequent data curation. We further give insights into the curation and linking process and provide evaluation results that not only direct the further development of the project but also allow us to discuss its overall feasibility.


                Published in

                JCDL '18: Proceedings of the 18th ACM/IEEE on Joint Conference on Digital Libraries
                May 2018, 453 pages
                ISBN: 9781450351782
                DOI: 10.1145/3197026

                Copyright © 2018 ACM


                Publisher: Association for Computing Machinery, New York, NY, United States


                Acceptance Rates

                JCDL '18 paper acceptance rate: 26 of 71 submissions, 37%. Overall acceptance rate: 415 of 1,482 submissions, 28%.
