In order to analyse and scope future developments to support preservation in institutional repositories, JISC is funding this study.

Requirements and Feasibility Study on Preservation of e-prints


End date: 1 February 2004

Funding programme: Digital Preservation and Records Management Programme

JISC theme(s): Information environment, e-Administration

JISC has funded a number of projects to support access to and sharing of institutional content within Higher Education (HE) and Further Education (FE) and to allow intelligence to be gathered about the technical, organisational and cultural challenges of these processes. The Focus on Access to Information Resources (FAIR) programme will contribute to developing the mechanisms and supporting services to allow the submission and sharing of content generated by the HE/FE community. This programme is part of a broader area of development to build an Information Environment for the UK's Distributed National Electronic Resource. 

The programme is inspired by the vision of the Open Archives Initiative (OAI) that digital resources can be shared between organisations based on a simple mechanism allowing metadata about those resources to be harvested into services. In the e-prints community this is realised through data providers who mount the e-prints (and who could be based in institutions, in subject groupings, or in some other way), and who disclose their metadata to a service provider, which again could be based in institutions, or could be subject-based, regional, national or international. End users can either search the particular data provider of interest, if they know it, or can search the service provider, which will have gathered together the metadata from many data providers. The OAI protocol is one mechanism that can support this model, but there are others. The model can clearly be extended to include other kinds of objects, for example learning objects, images, video clips, finding aids, etc. The vision here is of a complex web of resources built by groups with a long-term stake in the future of those resources, but made available through service providers to the whole community of learning.
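As a rough illustration of the harvesting model described above, the following Python sketch shows how a service provider might form an OAI protocol request and read identifiers and titles out of a data provider's response. It is a minimal sketch only: the repository address, the sample response and the function names are assumptions made for illustration, and no network request is made.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespaces used by the OAI protocol and by unqualified Dublin Core.
OAI_NS = "http://www.openarchives.org/OAI/2.0/"
DC_NS = "http://purl.org/dc/elements/1.1/"


def build_list_records_url(base_url, metadata_prefix="oai_dc"):
    """Form the request a service provider would send to a data provider."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def extract_records(xml_text):
    """Return (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iter(f"{{{OAI_NS}}}record"):
        identifier = record.findtext(f"{{{OAI_NS}}}header/{{{OAI_NS}}}identifier")
        title = record.findtext(f".//{{{DC_NS}}}title")
        records.append((identifier, title))
    return records


# Hypothetical response from a data provider, trimmed to a single record.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:repository.example.org:1</identifier>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An example e-print</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""

if __name__ == "__main__":
    # The base URL is a placeholder, not a real repository.
    print(build_list_records_url("https://repository.example.org/oai"))
    print(extract_records(SAMPLE_RESPONSE))
```

A service provider would repeat this for each data provider it harvests and index the gathered metadata, so that end users can search across repositories in one place.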

Currently the best known and most heavily used e-print archives have been established in specific discipline-based repositories, but individual institutional repositories are also now being established, focussing on e-prints and other institutional assets. The FAIR programme is one example of this.

Although referred to as the "Open Archives Initiative", the ‘Archive’ of the OAI refers to the process of depositing articles, discovery, and promoting access (particularly pre-publication), rather than to archival custody as such and the process of preserving those articles in a long-term repository.

E-prints and institutional repositories are a new and high-profile area for JISC and institutions. The initial focus has inevitably been on encouraging the deposit of e-prints and developing the OAI schema and tools both in the UK and internationally. However, longer-term requirements will inevitably have some bearing on these emerging institutional repositories as they progress beyond the proof of concept and development stage. In order to analyse and scope future developments to support preservation in institutional repositories, JISC is now funding this study.

In the first instance there are a number of practical issues to address in terms of guidance on collection development and policy, which can be built into institutional collections policies and procedures. 

One issue is the criteria for which material should or can be retained and for how long (in part this could be dependent on other issues such as eventual publication and IPR policy). 

Another will be the data formats accepted or held (if different) by the repository as these affect longer-term costs and planning. There are a number of existing repositories and emerging guidelines on formats that could be relevant to this work. These include existing e-prints and e-theses repositories or preservation services handling similar file formats (for example the current preservation review in the Arts and Humanities Data Service which is providing guidance on a range of data formats for its services). 

OAI metadata provides a metadata set suitable for supporting discovery, but preservation metadata will need to be added if long-term retention is to be supported. Preservation metadata has been considered by Cedars, NEDLIB, and the National Library of Australia, amongst others, and a report considering and mapping the schema has recently been produced by a Research Libraries Group/OCLC working group.
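To make the distinction concrete, the Python sketch below pairs a discovery-level description (identifier and title) with a few preservation-oriented fields. The field names are hypothetical and are not drawn from the Cedars, NEDLIB or National Library of Australia schemas. The example assumes only that a repository would want to record the file format, a fixity checksum and a retention review date alongside the discovery metadata.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class PreservationRecord:
    # Discovery metadata of the kind exposed for harvesting.
    identifier: str
    title: str
    # Hypothetical preservation fields added for long-term retention.
    file_format: str
    checksum_algorithm: str
    checksum: str
    retention_review: str


def describe_for_preservation(identifier, title, content, file_format, retention_review):
    """Build a record that carries a SHA-256 fixity value for the deposited file."""
    digest = hashlib.sha256(content).hexdigest()
    return PreservationRecord(
        identifier=identifier,
        title=title,
        file_format=file_format,
        checksum_algorithm="SHA-256",
        checksum=digest,
        retention_review=retention_review,
    )


def is_intact(record, content):
    """Check that the stored bytes still match the recorded checksum."""
    return hashlib.sha256(content).hexdigest() == record.checksum


if __name__ == "__main__":
    # Placeholder values for illustration only.
    deposited = b"%PDF-1.4 example e-print content"
    record = describe_for_preservation(
        "oai:repository.example.org:1",
        "An example e-print",
        deposited,
        "application/pdf",
        "2009-01-01",
    )
    print(record.checksum_algorithm, is_intact(record, deposited))
```

Recording the format and a checksum at deposit is one simple way to support later format migration and integrity checks. A production schema would carry considerably more detail.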

In the medium to longer-term a number of possible scenarios could exist for developing preservation infrastructure for institutional repositories. Previous digital preservation research funded by JISC has explored models for distributed preservation of digital materials and the broad thrust of this work is supported by other preservation research internationally. 

A major part of the JISC Strategy therefore is the establishment of a Digital Curation Centre to move recommendations from preservation research from proof of concept to production services and provide appropriate services and collaboration to support distributed preservation in institutions and national services. The study will be expected to take account of the establishment of the Centre and its development path in its analysis and recommendations for future development of institutional repositories.

In developing preservation at an institutional level it is likely that the ‘Open Archival Information System’ (OAIS) Reference Model will guide future developments. This provides a set of high-level principles, functions, and common terminology for digital archives. Although it does not provide an implementation, it is already a starting point for many leading initiatives. In the medium to long term e-print archive managers may well want to apply its principles in developing archive policy and procedures.

Another possible infrastructure scenario is that long-term preservation could be supported by transfer to, or replication in, one or more central national services. In the study this will need careful consideration of the requirements and capacity of institutional repositories themselves, the proposed support services available from the Digital Curation Centre, transfer issues such as IPR, formats, and selection (what should/could be placed in a central repository and when), and the acquisition and transfer process from the perspective of central national services.

In any scenario, consideration will need to be given to sustainability, as preservation will require long-term commitments. Proposals will need to be scalable over time to accommodate anticipated acquisition and development. The study will be expected to be informed by cost modelling in institutional repository projects (e.g. Roquade or DSpace) and in existing preservation services (e.g. papers from the Digital Preservation Coalition October Forum). It is recognised that the costs of long-term preservation remain difficult to quantify, but there is a growing body of practical experience in services which can help isolate the principal cost elements over the "lifecycle" of preservation.

  • Last updated on 07/01/09 by Lisa Clifford