
System evaluation of archival description and access

Published: 09 January 2012
Abstract

How do archives provide access to their records and let users search? The answer is archival description. Encoded Archival Description (EAD) in Extensible Markup Language (XML) is the de facto technical standard for "electronic" archival descriptions. It is now used to bridge the gulf between tangible records in archives and digital objects on the World Wide Web. These descriptions are finding aids: tools to search and find information about, or references to, archival records. The archival finding aids in EAD are left to searchers (out of sight and contact) to explore in unknown ways: how do searchers interact with these finding aids, and what type of retrieval system is needed to support them?

The approach is to apply XML retrieval techniques to the EAD finding aids, develop system evaluation of EAD retrieval, and study information seeking behavior of archival search. The main information retrieval (IR) contributions are the system evaluation of an important "real" and domain-specific search task, a study on the usage of transaction logs for deriving domain-specific test collections, an analysis of search behavior in yet unexplored structured documents, and tailoring IR evaluation to specific searcher stereotypes.

The first study involves the design and implementation of the archival search engine README. The README system attempts to combine current technologies, such as XML retrieval, with the archival structure in finding aids, while upholding the archival principles on which this structure is based. The system is the proof of concept. Having established this baseline, the next study explores and tests the construction of an IR test collection, a key component in IR evaluation. There is no readily available test collection for evaluating how accurately an archival search engine retrieves archival descriptions of records, and manually creating such a collection is expensive. The basis of this test collection is therefore the set of queries and clicks on archival descriptions found in the search log files of the National Archives of the Netherlands. The study shows that automatically creating a test collection seems viable.
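As an illustration of the idea of deriving a test collection from search logs, the following is a minimal sketch, not the method used in the dissertation. It assumes a hypothetical log that has already been reduced to (query, clicked description) pairs, treats every click as a relevance judgment, and scores a ranked run with mean reciprocal rank. The function names, the `min_clicks` threshold, and the identifiers such as `ead1` are all assumptions made for the example.

```python
from collections import defaultdict


def build_qrels(log_rows, min_clicks=1):
    """Turn (query, clicked_doc) pairs into click-based relevance judgments.

    Queries are lowercased and whitespace-normalized so that trivially
    different spellings of the same query are pooled (an assumption).
    """
    counts = defaultdict(lambda: defaultdict(int))
    for query, clicked_doc in log_rows:
        normalized = " ".join(query.lower().split())
        counts[normalized][clicked_doc] += 1
    # Keep only documents clicked at least `min_clicks` times for a query.
    return {
        q: {d for d, n in docs.items() if n >= min_clicks}
        for q, docs in counts.items()
    }


def reciprocal_rank(ranking, relevant):
    """Return 1/rank of the first relevant document, or 0.0 if none is found."""
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(run, qrels):
    """Average the reciprocal rank over all queries that have judgments."""
    scores = [reciprocal_rank(run.get(q, []), rel) for q, rel in qrels.items()]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical log rows and a hypothetical system run.
log = [("Ship  Logs", "ead1"), ("ship logs", "ead2"), ("tax records", "ead3")]
qrels = build_qrels(log)
run = {"ship logs": ["ead9", "ead2"], "tax records": ["ead3"]}
print(mean_reciprocal_rank(run, qrels))
```

Clicks are a noisy proxy for relevance, so a real pipeline would likely filter or weight them; the sketch only shows the overall shape of the data flow from log to judgments to a score.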

Archival principles, such as provenance and original order, are deeply rooted in the arrangement and subsequent description of archival records. These principles have been carried over to EAD finding aids as well. The investigation continues by examining them in a system evaluation: how effective are certain archival principles for archival access in this digital age? Additionally, the experiments probe XML retrieval-specific issues, such as the retrieval of certain elements. The study concludes by reflecting on the README archival search engine, which is the baseline for the experiments in this dissertation.

Using the archival search log files, the research focus shifts to the arrangement of records in EAD and to how users search using this arrangement. The sub-document clicks within the finding aids reveal how users interact online with "electronic" archival descriptions of records. The analysis quantifies this search behavior and results in a state diagram that captures the different information search behaviors of different people. Analyzing real-world interaction makes the discussion on the use of the finding aid as an access tool in this digital age more complete. The result is a better understanding of online archival search behavior within EAD finding aids, which can be used to improve a search system adapted to existing "electronic" archival descriptions.
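One common way to quantify logged interaction as a state diagram is to estimate transition probabilities between interaction states. The sketch below is an assumption about how this could be done, not a description of the dissertation's analysis. The state names (`query`, `click`) and the added `START` and `END` markers are hypothetical.

```python
from collections import Counter, defaultdict


def transition_probabilities(sessions):
    """Estimate P(next state | current state) from per-session action lists."""
    counts = defaultdict(Counter)
    for actions in sessions:
        # Wrap each session in artificial START/END states (an assumption).
        path = ["START", *actions, "END"]
        for current, following in zip(path, path[1:]):
            counts[current][following] += 1
    # Normalize the outgoing counts of each state into probabilities.
    return {
        state: {nxt: n / sum(nexts.values()) for nxt, n in nexts.items()}
        for state, nexts in counts.items()
    }


# Two hypothetical sessions: one direct click, one with a reformulated query.
sessions = [["query", "click"], ["query", "query", "click"]]
probs = transition_probabilities(sessions)
for state, nexts in probs.items():
    print(state, nexts)
```

Each state's outgoing probabilities sum to one, so the resulting dictionary can be read directly as the edge weights of a state diagram.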

Finally, the system evaluation deals with tailoring a search engine to different user stereotypes, namely "expert" and "novice" groups, distinguished by the number of times a user re-uses the system. The results show that although the search behavior of these groups differs significantly, this does not necessarily mean that the system must be adapted to each group in order to retrieve archival descriptions more effectively.

The doctoral dissertation is available online at http://dare.uva.nl/record/395154.

          • Published in

            ACM SIGIR Forum  Volume 45, Issue 2
            December 2011
            94 pages
            ISSN:0163-5840
            DOI:10.1145/2093346
            Copyright © 2012 Author

            Publisher

            Association for Computing Machinery

            New York, NY, United States

            Qualifiers

            • abstract
