The ultimate aim of this project is to investigate and advise on some of the technical, cultural, and organisational requirements associated with the deposit, disclosure, and discovery of institutional resources in the Information Environment.

HaIRST: Harvesting Institutional Resources in Scotland Testbed

This webpage has been archived. Its content will not be updated.

Start date: 1 August 2002

End date: 31 July 2005

Funding programme: Focus on Access to Institutional Resources (FAIR) programme

Project website: http://hairst.cdlr.strath.ac.uk/

JISC theme(s): Information environment

Introduction

The HaIRST consortium will conduct research into the design, implementation and deployment of a pilot service for UK-wide access to autonomously created institutional resources in Scotland, the aim being to investigate and advise on some of the technical, cultural, and organisational requirements associated with the deposit, disclosure, and discovery of institutional resources in the JISC Information Environment (IE).

HaIRST will take a ‘whole environment’ approach to the issues addressed, encompassing the general areas mentioned above, together with associated specifics such as policies on IPR, preservation mechanisms, and similar.  

Technically, the HaIRST approach to interoperability will rely primarily on harvesting for remote interaction, with the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) as its standard interaction protocol. Metadata on research and learning resources available at partner institutions will be created or mapped from pre-existing forms and then regularly harvested into a common repository for local querying and further disclosure.
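As an illustration of the harvesting interaction described above, the following is a minimal sketch of an OAI-PMH ListRecords request and of reading unqualified Dublin Core titles from a response. The base URL, the sample response, and the function names are assumptions made for the example and are not taken from the project; no network call is made.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces defined by OAI-PMH 2.0 and unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build a ListRecords request URL for an OAI-PMH data provider."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date  # selective harvesting by datestamp
    return base_url + "?" + urlencode(params)


def parse_records(xml_text):
    """Return (identifier, list of titles) pairs from a ListRecords response."""
    root = ET.fromstring(xml_text)
    results = []
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=NS)
        titles = [t.text for t in record.iterfind(".//dc:title", NS)]
        results.append((identifier, titles))
    return results


# Hypothetical response, shaped like a real ListRecords reply.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:1</identifier>
        <datestamp>2003-01-01</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample e-print</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

request_url = list_records_url("http://example.org/oai")
harvested = parse_records(SAMPLE_RESPONSE)
```

In a working harvester the request URL would be fetched on a schedule and the `from` parameter set to the date of the previous harvest, so that only new or changed records are transferred.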

Examples of materials available at partner institutions include e-prints, electronic teaching materials, digitised collections of Victorian-era parliamentary papers, learning support materials, digitised ephemera from the first Scottish Parliament elections, and digitised historical photograph collections.

A local discovery service will be implemented over the repository of harvested metadata. Exposed to the user through a Web-based interface, the service will be capable of querying the underlying metadata at different levels of granularity, each level corresponding to a different layer of agreement among the metadata providers. Under the user's control, an initial keyword-based, Web-like search may be progressively refined into a structured DC-like query at the cost of excluding part of the original metadata from retrieval. Similarly, a further step may produce a more refined query against an even smaller subset of metadata which adheres to some stronger inter-institutional agreement, and so on, including any field-by-field combinations of the above.
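The refinement described above can be sketched roughly as follows. This is an illustration only: the record layout, field names, and sample values are hypothetical, and the real service was to query an XML back-end rather than in-memory dictionaries. The point shown is that a structured query can only match records that carry the structured layer, so refinement narrows the searchable set, and that the narrowing can be quantified.

```python
# Hypothetical harvested records: some carry only free text, others also
# carry Dublin Core fields. Field names and values are illustrative only.
RECORDS = [
    {"id": "r1", "text": "Glasgow election leaflets 1999"},
    {"id": "r2", "text": "Glasgow photograph archive",
     "dc": {"title": "Glasgow photograph archive", "type": "Image"}},
    {"id": "r3", "text": "Lecture notes on Glasgow history",
     "dc": {"title": "Lecture notes on Glasgow history", "type": "Text"}},
]


def keyword_search(records, keyword):
    """Web-like search: match the keyword anywhere in the free text."""
    kw = keyword.lower()
    return [r for r in records if kw in r["text"].lower()]


def structured_search(records, field, value):
    """DC-like search: only records that expose DC fields can match."""
    return [r for r in records
            if r.get("dc", {}).get(field, "").lower() == value.lower()]


def coverage(records, layer_key):
    """Fraction of records that carry metadata for the given layer."""
    if not records:
        return 0.0
    return sum(1 for r in records if layer_key in r) / len(records)


# Start broad, then refine the same result set with a structured query.
broad = keyword_search(RECORDS, "glasgow")
refined = structured_search(broad, "type", "Image")
dc_share = coverage(RECORDS, "dc")
```

Here the keyword search sees all three records, while the structured step can only consider the two that expose a `dc` layer; `dc_share` reports that proportion so a user could judge what a refinement would exclude.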

To ensure the integration of the service with the current IE landscape, HaIRST will then develop two-way mapping services to support further discovery and disclosure of metadata through a number of different routes, including local institutional interfaces, other national and international OAI harvesters, Z39.50-based services, and collection-level discovery services. 
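A two-way mapping of the kind mentioned above might, in its simplest form, look like the sketch below. The local field names are invented for the example and do not come from any partner institution's schema; real mappings to Z39.50 or collection-level services would be considerably richer.

```python
# Hypothetical local field names mapped to unqualified Dublin Core elements.
LOCAL_TO_DC = {
    "heading": "title",
    "author_name": "creator",
    "issued": "date",
}
# The reverse direction is derived from the same table so both stay in step.
DC_TO_LOCAL = {dc: local for local, dc in LOCAL_TO_DC.items()}


def to_dc(local_record):
    """Map a local record to DC, also reporting fields with no DC equivalent."""
    mapped = {LOCAL_TO_DC[k]: v for k, v in local_record.items()
              if k in LOCAL_TO_DC}
    dropped = sorted(k for k in local_record if k not in LOCAL_TO_DC)
    return mapped, dropped


def from_dc(dc_record):
    """Map a DC record back to the local field names."""
    return {DC_TO_LOCAL[k]: v for k, v in dc_record.items()
            if k in DC_TO_LOCAL}


local = {"heading": "Parliamentary papers", "author_name": "Anon.",
         "shelfmark": "X1"}
dc_record, unmapped = to_dc(local)
round_trip = from_dc(dc_record)
```

The round trip recovers only the fields covered by the mapping table; anything institution-specific (here the hypothetical `shelfmark`) is reported as unmapped rather than silently lost.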

Beyond technical considerations, HaIRST will coordinate inter-institutional activity, stimulate institutional activity, and offer advice and support to those running institutional or inter-institutional services on issues such as standards, IPR, security, and preservation. It will develop draft institutional collection development policy documents covering the entire institutional e-resource activity – from institutionally created learning or research materials, through digitisation priorities, to the purchase of commercial research and learning materials – and it will explore areas such as collaborative collecting and development of e-learning materials and digitisation programmes, the aim being to develop a model that can be expanded to encompass other Scottish FE and HE institutions and perhaps offer an approach applicable to other areas of the UK. 

In its final year, the project will offer institutions outside the consortium direct assistance in depositing into the IE in exchange for an agreement to adhere to the identified interoperability standards and agreements. It will also encourage Scottish FE and HE institutions with existing e-archives to adopt HaIRST standards and to be involved in their development, and offer space on HaIRST servers to institutions that prefer this approach.

The project will be co-ordinated with other relevant initiatives such as the Resource Discovery Network (RDN), the High Level Thesaurus Project (HILT), the CAIRNS distributed catalogue, the SCONE Scottish Collections Database, the Glasgow Digital Library (GDL), and NGfL (Scotland). The project will also take into account work on MLEs and VLEs locally and nationally, drawing on work already carried out in the INSPIRAL project’s examination of issues associated with the links between digital libraries and digital learning and work at participant institutions.  Every effort will be made to work with JISC, JISC agencies, and other JISC FAIR projects, particularly regarding metadata and other standards. 

Aims and Objectives

The project will:

  • Build access to a central repository of harvested institutional metadata which fully reflects locally-defined data and processing requirements;
  • Provide access to the repository through a Web-based search interface capable of supporting views of increasing granularity over the varied richness of the underlying metadata;
  • Ensure integration of the pilot service into the JISC IE by further disclosing the harvested metadata within and beyond the IE via a number of different routes, including other OAI harvesters and Z39.50-based remote discovery services;
  • Provide a model infrastructure for the design, implementation, and deployment of institutional e-archives and associated disclosure services in Scotland;
  • Encourage the creation and deposit of institutional resources at institutions within the model infrastructure.

In particular, key deliverables are:

  1. A proof-of-concept suite of layered metadata agreements, defined as XML applications and containing at least a top, universal layer based on unqualified Dublin Core, a middle layer which reifies agreements across HaIRST partners, and a bottom, institution-specific layer which supports the locally-defined data and processing requirements;
  2. Development or re-structuring of institutional and inter-institutional e-archives (from relational databases to file-based HTML repositories) capable of serving metadata adhering to some of the layers of the stack of agreements defined in 1) through server-side OAI-PMH functionality;
  3. A pilot discovery service capable of regularly harvesting remote metadata from partner institutions through client-side OAI-PMH functionality, and storing it into an XML-based back-end (initially defined directly on a file-system for prototyping and testing purposes and later on a dedicated XML database system of choice);
  4. A Web-based query interface to the service in 3) capable of executing user-defined structured and partially structured queries against the metadata back-end, quantifying the portion of harvested metadata corresponding to given query structures, and including novel structure-based discovery of collection-level metadata;
  5. A number of two-way metadata mappings to support further discovery and disclosure to and from a number of existing services and archives. At this stage the list includes other national and international OAI servers – first of all the RDN repository developed under the FAIR-funded project EPRINTS-UK – the CAIRNS Z39.50-based remote discovery service, the RDN national subject-based access service, the CORC shared cataloguing service run by OCLC, and collection-level metadata databases such as SCONE. Further linking may be added depending on the availability of time and resources;
  6. Associated changes in institutional cultures, policies, strategies, and organisational structures, as appropriate to, and agreed by, participants;
  7. A pilot for a regional institutional e-archives advisory service, together with an associated web-site offering advice and guidance;
  8. The involvement of Scottish FE and HE beyond the consortium institutions;
  9. Draft institutional and inter-institutional collection development policy documents, covering all institutional e-resource activity, from institutionally created learning or research materials, through digitisation priorities, to the purchase of commercial research and learning materials;
  10. An associated exploration of inter-institutional activity through the SCAMP [8] gateway to facilitate collaborative collection development work in areas such as e-learning materials and digitisation programmes;
  11. Report and recommendations on requirements in respect of changes in institutional culture, policy, strategy, and organisational structures, as well as on the communication protocols, metadata standards, and software implications of a service based on the various types of partner institutions;
  12. A model that can be expanded to encompass other Scottish FE and HE institutions and perhaps offer an approach applicable to other areas of the UK;
  13. Increased community understanding of issues through appropriate and sustained dissemination activities;
  14. A full report on all project activities, together with recommendations covering further development guidelines on all areas of project activity.
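Deliverable 3 above describes harvesting through client-side OAI-PMH functionality into a back-end that initially sits directly on a file system. A minimal sketch of that idea follows, assuming one XML file per harvested record and a `fetch` callable that stands in for the HTTP request; the function names, file-naming scheme, and sample pages are all hypothetical. The only protocol behaviour relied on is that a ListRecords response may carry a resumptionToken, and that an absent or empty token marks the last page.

```python
import tempfile
from pathlib import Path
from urllib.parse import quote
import xml.etree.ElementTree as ET

OAI = "{http://www.openarchives.org/OAI/2.0/}"


def harvest(fetch, store_dir):
    """Follow resumption tokens, writing one XML file per record.

    `fetch` is a hypothetical callable taking a resumption token (or None
    for the first request) and returning the response XML as a string.
    """
    store = Path(store_dir)
    token, saved = None, []
    while True:
        root = ET.fromstring(fetch(token))
        for record in root.iter(OAI + "record"):
            identifier = record.findtext(f"{OAI}header/{OAI}identifier")
            # Percent-encode the identifier so it is a safe file name.
            path = store / (quote(identifier, safe="") + ".xml")
            path.write_bytes(ET.tostring(record))
            saved.append(path.name)
        token_el = root.find(f".//{OAI}resumptionToken")
        token = token_el.text if token_el is not None else None
        if not token:  # absent or empty token marks the final page
            return saved


def page(identifier, next_token=""):
    """Build a tiny fake ListRecords page for demonstration purposes."""
    return (
        '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"><ListRecords>'
        f"<record><header><identifier>{identifier}</identifier></header></record>"
        f"<resumptionToken>{next_token}</resumptionToken>"
        "</ListRecords></OAI-PMH>"
    )


# Two fake pages: the first points at the second via the token "t2".
PAGES = {None: page("oai:example.org:1", "t2"), "t2": page("oai:example.org:2")}

with tempfile.TemporaryDirectory() as tmp:
    saved_files = harvest(PAGES.get, tmp)
```

Keeping the fetch step separate from the storage step means the same loop could later write to a dedicated XML database instead of files, which is the migration the deliverable anticipates.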

Overall Approach

The aim will be to stagger the start dates of the various key processes (see below) but to then allow them to proceed in parallel over most of the timescale of the project so that interim results coming out of any one process can inform developments in the others. Within this overall context, the methodology will be as follows.

Technically, the project can be partitioned into two distinct phases: interoperable deposit at partner institutions, and interoperable discovery and disclosure based at the lead institution. A first pass will experiment with sample metadata sets and ‘thin’ metadata agreements (most likely at the DC level of granularity), while a second pass will extend harvesting and discovery to full-sized metadata resources and ‘thicker’ agreement layers. This should allow the investigators to test most of the technical assumptions in a simplified scenario whilst minimising the risk of delays associated with the establishment of inter-institutional metadata agreements.

The latter process will draw on experience gained in HILT and CAIRNS, and on NOF-related work at the CDLR, Glasgow University, and the National Library of Scotland to use CORC LC authority files to ensure interoperability between participating projects. The aim would be to involve JISC and the UKOLN Interoperability Focus fully in this process.

An advisory service will become active early in the project and will co-ordinate and inform the other areas of activity. It will be based initially on the CDLR’s Digital Information Office (DIO) and will be charged with co-ordinating the initiative, stimulating activity, and offering advice and support to those running institutional or inter-institutional archiving services on issues such as standards, IPR, managing security, and preservation.

Evaluation will also play a key role in guiding activities, with the results of an early formative evaluation (months 3 and 4), interim evaluations in months 13 and 26, and a summative evaluation in months 33-35 all feeding back into and affecting other project processes.

The institutional cultures, policies, strategies and structures process will begin with a focus group involving a mix of senior staff from the participating types of institutions, project staff, and others. This will inform an examination and analysis of the various institutional environments represented within the project aimed at identifying key organisational structures in each, possible roles for these, and any changes that might be needed if the HaIRST initiative is to succeed. Small groups will then be set up at each type of institution to interface between project processes and the institution in question and will work to facilitate change in both institutional and project processes as necessary. This work will gradually feed into other processes, the creation of the model for a regional service, and the final report and recommendations of HaIRST. 

The collection development and collaboration process will build on the work of SCONE and involve the use of the SCAMP portal between project staff at the various institutions to develop mechanisms for collaboration in this area. In addition, a draft DIO collection development policy document will be adapted for the various institutions, discussed and developed further by small groups, then (if possible) adopted by the institutions concerned. 

Project Consortium

The participants are the Universities of Strathclyde, Napier, and St. Andrews, and the ten Glasgow FE colleges (the Glasgow Colleges Group (GCG) plus the John Wheatley College).

The bid has support from the Scottish Library and Information Council (SLIC), the Resource Discovery Network (RDN), and the JISC Regional Support Centre for South and West Scotland (FE sector). An effort would be made through organisations such as the Scottish Confederation of University and Research Libraries (SCURL), the Confederation of Scottish Mini-Clumps (CoSMiC) and SLIC to bring other organisations into the pilot service as described above.

Contact

Fabio Simeoni
Centre for Digital Library Research (CDLR)
c/o Andersonian Library
University of Strathclyde
101 St James’ Road
Glasgow G4 0NS

Telephone: 0141 548 2102
Email: fabio.simeoni@cis.strath.ac.uk

  • Last updated on 07/01/09 by Kerry Ann Down