DOI: 10.1145/3197026.3197037
research-article

Evaluation of Conformance Checkers for Long-Term Preservation of Multimedia Documents

Published: 23 May 2018

ABSTRACT

We develop an evaluation framework for validating conformance checkers for long-term preservation. The framework assesses the correctness, usability, and usefulness of the tools for three media types: PDF/A (text), TIFF (image), and Matroska (audio/video). We then report the results of validating these conformance checkers with the proposed framework. The framework is a high-level tool that can readily be applied to other preservation-related tasks.
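
The abstract does not say how correctness is scored. As an illustrative sketch only, assume each test file carries a ground-truth label (conformant or not) and each checker returns one verdict per file; correctness could then be summarized with standard classification measures. The names `Scores` and `score_checker` and the file names are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scores:
    accuracy: float
    precision: float
    recall: float


def score_checker(expected: dict[str, bool], reported: dict[str, bool]) -> Scores:
    """Compare a checker's verdicts with ground-truth labels.

    Both dicts map a file name to True (conformant) or False (non-conformant).
    "Conformant" is treated as the positive class (an assumption of this sketch).
    """
    tp = fp = fn = tn = 0
    for name, truth in expected.items():
        verdict = reported[name]  # raises KeyError if the checker skipped a file
        if truth and verdict:
            tp += 1
        elif not truth and verdict:
            fp += 1
        elif truth and not verdict:
            fn += 1
        else:
            tn += 1

    total = tp + fp + fn + tn
    # Fall back to 0.0 when a denominator is zero.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return Scores(accuracy, precision, recall)


# Hypothetical ground truth and checker output for four test files.
expected = {"a.pdf": True, "b.pdf": False, "c.tif": True, "d.mkv": False}
reported = {"a.pdf": True, "b.pdf": True, "c.tif": False, "d.mkv": False}
print(score_checker(expected, reported))
```

Usability and usefulness, the other two dimensions named in the abstract, are not covered by this sketch.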

      Published in

      JCDL '18: Proceedings of the 18th ACM/IEEE on Joint Conference on Digital Libraries
      May 2018
      453 pages
      ISBN: 9781450351782
      DOI: 10.1145/3197026

      Copyright © 2018 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      JCDL '18 Paper Acceptance Rate: 26 of 71 submissions, 37%
      Overall Acceptance Rate: 415 of 1,482 submissions, 28%
