This webpage has been archived. Its content will not be updated.
Start date: 1 February 2003
End date: 24 July 2006
Funding programme: Focus on Access to Institutional Resources (FAIR) programme
Project website: http://ahds.ac.uk/hybrid_archives.htm
JISC theme(s): Information environment
Introduction
The project aims to address two key challenges facing the JISC Information environment as it strives to provide access to and preservation of important institutional collections:
- First, it will address some of the institutional barriers to depositing collections with data services such as the Arts & Humanities Data Service (AHDS)
- Second, it will address the shortcomings of the OAI harvesting methodology in disclosing metadata at a level of granularity rich enough for detailed searching, and its lack of provision for the long-term preservation of disclosed assets
Traditionally, both analogue and digital collections have been lodged with a host archive or data service that then takes responsibility for preserving the collection and for providing access to users. However, the experience of the AHDS over the last six years suggests that the requirement for full deposit is not always an attractive option for institutions:
- They may not wish to hand over control (as they see it) of their collection to a third party (and there is no legal requirement for them to do so)
- Institutions may wish to act as the primary disseminator for the collection in order to attract students and/or researchers to that institution and to profile the work of the research team responsible for creating the collection
- There may be outstanding copyright/IPR problems that create significant barriers to traditional deposit
- The collection may be dynamic – that is, still being extended and added to – where traditional archiving generally assumes a static, complete collection
Of course, with digital collections it is possible to acquire a copy for archiving and dissemination whilst the owning institution continues to disseminate its own copy, but in an age of limited resources this is not a cost-effective solution. The usual practice of the AHDS has therefore been to concentrate its efforts on acquiring collections whose owners have no wish to disseminate them, and to leave collections that institutions do wish to disseminate in the hands of those host institutions. While this practice makes sense in terms of cost-effectiveness, it has other drawbacks.
Significantly, such a practice fails to provide for the proper preservation of the collection. Whilst many institutions provide back-up facilities for the collection, back-up takes no account of the need to migrate systems and content over time, and is no substitute for a formal preservation strategy. Moreover, should the host institution no longer be able or willing to support the collection, there is a real danger that the collection will no longer be available to the wider community and may be lost forever. The practice also makes cross-searching with other AHDS collections impossible and, by implication, cross-searching with other collections in the JISC information environment.
Of course, the OAI protocol offers services such as the AHDS the opportunity to harvest metadata that conforms to the Bath Profile into its web-based catalogues and from there to offer cross-searching with its other collections. The drawback is that the protocol harvests only limited Dublin Core metadata, which comprises only a small part of the complex, rich metadata that would be made available were the collection to be fully accessioned into the AHDS.
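To illustrate how little a baseline harvest carries, the following is a minimal sketch of an OAI-PMH `ListRecords` request and of reading the unqualified Dublin Core (`oai_dc`) fields from a response. The endpoint URL, the abbreviated sample response and the function names are assumptions made for illustration; they are not taken from the project or from any AHDS system.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Hypothetical endpoint, used for illustration only.
BASE_URL = "https://repository.example.org/oai"

# Namespaces defined by OAI-PMH 2.0 and Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Invented, abbreviated response (a real one also carries responseDate and request).
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:repository.example.org:item-1</identifier>
        <datestamp>2003-02-01</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example stage design</dc:title>
          <dc:creator>Example, A.</dc:creator>
          <dc:type>Image</dc:type>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build a ListRecords request URL for the given metadata format."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return base_url + "?" + query


def parse_records(xml_text):
    """Return each record's identifier and its flat Dublin Core fields."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=NS)
        fields = {}
        for element in record.iterfind("oai:metadata/oai_dc:dc/*", NS):
            name = element.tag.split("}", 1)[1]  # strip the "{namespace}" prefix
            fields.setdefault(name, []).append(element.text)
        records.append({"identifier": identifier, "fields": fields})
    return records


if __name__ == "__main__":
    print(list_records_url(BASE_URL))
    for rec in parse_records(SAMPLE):
        print(rec["identifier"], rec["fields"])
```

Each record reduces to a flat set of repeatable elements such as title, creator and type, with no room for the collection-specific structure that a full accession would capture.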
However, recent technological developments, in particular creative use of the full capabilities of the Z39.50 and OAI protocols, have the potential to improve the existing discovery and access capabilities of the Arts and Humanities Data Service and to harvest data from institutional collections at a more detailed level of description.
The AHDS therefore seeks to develop an extensible and sustainable ‘partial deposit’ model that would allow host institutions to retain control and dissemination of their collections, while simultaneously employing OAI and Z39.50 technologies to integrate rich, complex metadata into AHDS web-based search systems. Additionally, such a model would provide for the long-term preservation of the collection.
The model is underpinned by a formal licence agreement that regulates the above process and provides an exit strategy should the host institution no longer be able or willing to support the collection. The term ‘partial deposit’ is used throughout this project plan to describe the project’s aim of providing a viable alternative to ‘full deposit’. This usage should not, however, be confused with, or seen to detract from, the preferred project title: the Hybrid Archive Model.
Aims and Objectives
The project aims to develop a model for the partial deposit of collections where full deposit is either not an option or would not prove cost-effective. The development of this model would extend the deposit options that data archives and services can offer to institutions and owners and would, it is anticipated, increase the number of collections preserved and made available for use within the JISC Information environment. It would address a number of the key problems that currently act as barriers to depositing important collections and making them available. The model is expected to help build a critical mass of resources within the arts and humanities and, as such, would make a significant contribution to the JISC Information and Collection strategies.
Under this model Depositors (owners of institutional collections) would:
- Disclose agreed metadata at a detailed level of granularity for capture by the AHDS using either Z39.50 or OAI as appropriate, and an agreed preservation metadata set
- Make available, in a form and format to be agreed, content for preservation. This is likely to include data (e.g. texts, images, databases, sound and moving images) plus explanatory documentation to enable informed use of the collection
- Sign a formal licence agreement that regulates the process, defines the rights and responsibilities of each party, and provides for a move to full deposit should the institution no longer be willing or able to continue to support the collection
- Move to full deposit should the institution no longer be willing or able to continue to support the collection
- Take responsibility for disseminating and supporting use of the collection
Under this model the AHDS would:
- Capture rich, complex metadata and integrate it within AHDS search and retrieve systems for cross-searching with other AHDS collections, using Z39.50 and OAI technologies as appropriate
- Expose this metadata (along with metadata from full deposit collections) to appropriate Portals in the JISC information environment
- Capture content for preservation and integrate it into the AHDS preservation system
- Sign the formal licence agreement that regulates the process, defines the rights and responsibilities of each party, and provides for a move to full deposit should the institution no longer be willing or able to continue to support the collection
- Move to full accession of the collection should the institution no longer be willing or able to continue to support the collection
- Publicise the collection and promote awareness and use of the collection
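The division of responsibilities above, and in particular the exit strategy, can be sketched as a small data model. This is an illustrative reading of the two lists, not a design from the project: the class, field and method names are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class DepositMode(Enum):
    # Host disseminates; the AHDS holds metadata and a preservation copy.
    PARTIAL = "partial"
    # The AHDS accessions the collection and takes over dissemination.
    FULL = "full"


@dataclass
class HybridDeposit:
    """Hypothetical record of one collection under the partial deposit model."""
    collection: str
    licence_signed: bool = False
    metadata_disclosed: bool = False
    preservation_copy_held: bool = False
    host_supports_collection: bool = True

    def mode(self):
        """Return the deposit mode implied by the current state."""
        complete = (self.licence_signed
                    and self.metadata_disclosed
                    and self.preservation_copy_held)
        if not complete:
            raise ValueError("partial deposit is incomplete")
        # Exit strategy: move to full deposit once the host withdraws support.
        if self.host_supports_collection:
            return DepositMode.PARTIAL
        return DepositMode.FULL


if __name__ == "__main__":
    deposit = HybridDeposit("Example collection", True, True, True)
    print(deposit.mode())
    deposit.host_supports_collection = False
    print(deposit.mode())
```

The point of the sketch is that the move to full deposit needs nothing new from the depositor at the moment of exit: the licence, the metadata and the preservation copy are already in place.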
Overall Approach
The project will develop and test a hybrid model for depositing institutional assets that provides a bridge between the complexity (and burdens) of full ‘traditional’ deposit of institutional collections and the more simplified approach embodied in harvesting methodology. The Hybrid Archives model would take elements of both traditional deposit methodology and harvesting methodology and re-work them to produce an integrated and cohesive deposit model that ensures:
- Institutional assets are disclosed to the AHDS at a level of granularity that enables detailed cross-searching with AHDS full deposit collections, and that they are exposed for searching within the appropriate Portals of the JISC information environment
- Long-term preservation of institutional assets is provided for, together with a formal exit strategy should the institution no longer be able or willing to support the collection
The overall approach is that:
- The Project Research Officer, in consultation with the Steering Committee, will provide a detailed model and method of delivery;
- An evaluation of this model will be delivered after a test-bed phase which involves working with AHDS centres for the Visual and Performing Arts, the Courtauld Institute, Royal Holloway and the Theatre Museum;
- There will be a revision process based on the evaluative test-bed phase undertaken by the participants;
- Dissemination will occur through the project website, in addition to the publication of at least two journal articles and a paper offered to a major conference.
The model will include process architecture, metadata requirements including preservation metadata, recommendations for technologies to be used for harvesting metadata and for managing content for preservation, and licensing requirements.
Agreement on a set of metadata standards is a necessary prerequisite for interoperable digital collections, but implementing interoperability also requires a set of architectures and a common approach to making that metadata available to other collections, middleware and end users. Interoperability will be addressed through use of the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) and the ANSI/NISO Z39.50 protocol, together with the RELAX NG specification, which defines a simple schema language for XML. Scalability will be monitored during the test-bed phase to ensure that the model can be scaled beyond the constraints of the collections selected for evaluation.
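One protocol feature relevant to scalability is OAI-PMH flow control: a repository may return a large list in pages, each ending in a `resumptionToken` that the harvester sends back to fetch the next page. The sketch below shows that step only. The endpoint, token value, sample responses and function name are assumptions for illustration, and nothing here is taken from the project's own harvester.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Hypothetical endpoint, used for illustration only.
BASE_URL = "https://repository.example.org/oai"

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Invented, abbreviated responses: an incomplete list and a final page.
PAGE_1 = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

LAST_PAGE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <resumptionToken/>
  </ListRecords>
</OAI-PMH>"""


def next_request_url(base_url, verb, response_xml):
    """Return the URL for the next page, or None when the list is complete."""
    root = ET.fromstring(response_xml)
    # A missing or empty resumptionToken element means there are no more pages.
    token = root.findtext(".//oai:resumptionToken", default="",
                          namespaces=OAI_NS).strip()
    if not token:
        return None
    # resumptionToken is an exclusive argument: it is sent with the verb alone.
    return base_url + "?" + urlencode({"verb": verb, "resumptionToken": token})


if __name__ == "__main__":
    print(next_request_url(BASE_URL, "ListRecords", PAGE_1))
    print(next_request_url(BASE_URL, "ListRecords", LAST_PAGE))
```

A harvester would call this after each response and stop when it returns `None`, so the same loop serves a collection of a few hundred records or of many thousands.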
Sustainability is addressed through the commitment of the AHDS to developing a model for the partial deposit of collections where full deposit is either not an option or would not prove cost-effective.
Targeted user communities would include:
- Institutions currently engaged or planning future deposits of institutional assets
- Harvesting bodies and repositories with an interest in the proposed model of metadata harvesting
- The academic research community.
Feedback from user communities on deposit and harvesting would initially be confined to those actively involved in the test-bed phase of the project. It is hoped, however, that networking and other links established through the steering committee would provide further feedback on the applicability of the developed model to a wide range of collections. Feedback to the academic community will consist in large part of the draft reports and recommendations, which will be posted on the website with comments invited. Further feedback will be elicited by disseminating project findings through articles in relevant journals and conference papers.
Project Consortium
The Arts and Humanities Data Service is a distributed service with five Service Providers and a managing Executive at King’s College London. The principal function of the AHDS is to provide a research archive for digital resources in the arts and humanities created by and for the Higher Education community, in order to (i) ensure that they meet the highest technical standards, (ii) guarantee their preservation, and (iii) provide appropriate means of access to them. It therefore plays an indispensable role in helping to ensure value for money for public funding spent on data creation projects.
The AHDS has successfully set up a common policy framework to manage and regulate the deposit of digital collections. Much of this work has been published in the highly acclaimed and widely quoted ‘Managing Digital Collections’ series, published on the AHDS website. The AHDS was also a leader in the field of distributed search environments, establishing a distributed search environment based upon Z39.50 technology in 1997. It has the necessary management and organisational framework in which to carry out a project, such as the one proposed, to a successful conclusion.
During the test-bed phase of the project the AHDS will be working with the Courtauld Institute, the Theatre Museum and Royal Holloway. The AHDS Centres for the Visual and Performing Arts will manage this part of the project.