The SOAPI project aims to create a toolkit for building workflows that support ingest and preservation in digital repositories.

Service Oriented Architecture for Preservation and Ingest of Digital Objects (SOAPI)


Start date: 1 March 2007

End date: 30 September 2008

Funding programme: Digital Preservation and Records Management Programme

Project website: http://www.ahds.ac.uk/about/projects/soapi/index.htm

JISC theme(s): Information environment

Committees: JISC Integrated Information Environment committee

Overview

The Arts and Humanities Data Service (AHDS) is responsible for preserving a variety of digital resources arising from arts and humanities research. Broadly speaking, the AHDS’ approach to digital preservation involves the normalisation of data to suitable preservation formats and capture of preservation metadata on ingest, supported by ongoing monitoring for format obsolescence and performance of remedial action as required. These activities are currently performed manually by appropriate staff. The purpose of the SOAPI project is to identify new ways to automate many of the tasks associated with ingest and long-term preservation, through the creation of software tools and corresponding workflows.

Aims and objectives

The primary aim of the project is to produce a software toolkit that allows repository managers to perform activities associated with ingest and preservation in a scalable manner. The toolkit must be capable of supporting workflows composed of automated and manual stages; easily configurable to the needs of each digital repository; and flexible enough to allow the integration of third-party tools that are currently in use or will be produced in the future.

Project methodology

The approach of the project will be:

  • Produce use cases representing the functionality that the toolkit should support. These will largely be derived from the AHDS’ documented ingest and preservation procedures, although they will also cover configuring and extending the toolkit.
  • Investigate the technologies in detail, develop prototypes for internal evaluation, and produce an architecture document. The approach will be to develop modular services that can be combined to implement workflows meeting the requirements of particular repositories.
  • Develop and test the toolkit software
  • Evaluate the toolkit in a number of environments
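
The idea of combining modular services into a workflow with automated and manual stages can be illustrated with a minimal Java sketch. It is not taken from the SOAPI toolkit: the class, stage names and metadata keys are hypothetical, the services are plain in-process functions rather than web services, and the manual stage is stubbed where a real workflow would pause for a person to complete a web form.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical illustration only; none of these names come from the SOAPI toolkit.
class IngestWorkflowSketch {

    /** A named workflow stage, flagged as automated or manual. */
    static final class Stage {
        final String name;
        final boolean manual;
        final UnaryOperator<Map<String, String>> action;

        Stage(String name, boolean manual, UnaryOperator<Map<String, String>> action) {
            this.name = name;
            this.manual = manual;
            this.action = action;
        }
    }

    /** Runs the stages in order, passing each stage's output to the next. */
    static Map<String, String> run(List<Stage> stages, Map<String, String> input, List<String> log) {
        Map<String, String> current = new LinkedHashMap<>(input);
        for (Stage stage : stages) {
            log.add((stage.manual ? "manual: " : "automated: ") + stage.name);
            current = stage.action.apply(current);
        }
        return current;
    }

    /** Returns a copy of the metadata with one extra field added. */
    static Map<String, String> with(Map<String, String> metadata, String key, String value) {
        Map<String, String> copy = new LinkedHashMap<>(metadata);
        copy.put(key, value);
        return copy;
    }

    /** An example workflow: two automated stages around one manual stage. */
    static List<Stage> exampleWorkflow() {
        List<Stage> stages = new ArrayList<>();
        // Automated: a toy format check based on the file extension.
        stages.add(new Stage("identify-format", false,
                m -> with(m, "format", m.get("file").endsWith(".tif") ? "image/tiff" : "unknown")));
        // Manual: stubbed here; a real workflow would wait for staff input.
        stages.add(new Stage("enter-description", true,
                m -> with(m, "description", "entered by repository staff")));
        // Automated: marks the object as packaged for deposit.
        stages.add(new Stage("package", false, m -> with(m, "packaged", "true")));
        return stages;
    }

    public static void main(String[] args) {
        Map<String, String> input = new LinkedHashMap<>();
        input.put("file", "scan-001.tif");
        List<String> log = new ArrayList<>();
        Map<String, String> result = run(exampleWorkflow(), input, log);
        System.out.println(log);
        System.out.println(result);
    }
}
```

Because each stage has the same shape, a repository could reorder, remove or add stages without changing the code that runs the workflow, which is the kind of configurability the approach is aiming for.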

Anticipated outputs and outcomes

The main deliverable will be a toolkit composed of: web services tailored to perform automated ingest and preservation functions; web-based forms to allow for manual entry of information that cannot be automatically produced; and a workflow tool that combines each functional activity into a single automated or semi-automated workflow.

The toolkit will be repository-independent and configurable to allow the definition and implementation of workflows appropriate to the unique requirements of each repository. In the current project, the toolkit has been integrated with Fedora; however, it may subsequently be tailored by third parties to other repository software.
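
One common way to keep workflow code independent of the repository is to have it depend only on a small interface, with one implementation per repository system. The Java sketch below shows that pattern under stated assumptions: the interface, its methods and the in-memory implementation are all hypothetical, and it does not reflect how the SOAPI toolkit actually connects to Fedora.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; this is not the SOAPI toolkit's interface to Fedora.
class RepositoryAdapterSketch {

    /** Assumed boundary between a workflow and a concrete repository system. */
    interface RepositoryAdapter {
        /** Stores a packaged object and returns the identifier assigned to it. */
        String deposit(String packageXml);

        /** Returns the stored package, or null if the identifier is unknown. */
        String retrieve(String id);
    }

    /** In-memory stand-in used in place of a real repository such as Fedora. */
    static final class InMemoryRepository implements RepositoryAdapter {
        private final Map<String, String> store = new HashMap<>();
        private int nextId = 1;

        @Override
        public String deposit(String packageXml) {
            String id = "demo:" + nextId++;
            store.put(id, packageXml);
            return id;
        }

        @Override
        public String retrieve(String id) {
            return store.get(id);
        }
    }

    /** A final workflow step that knows only the interface, not the repository. */
    static String ingest(RepositoryAdapter repository, String packageXml) {
        return repository.deposit(packageXml);
    }

    public static void main(String[] args) {
        RepositoryAdapter repository = new InMemoryRepository();
        String id = ingest(repository, "<package/>");
        System.out.println(id + " -> " + repository.retrieve(id));
    }
}
```

Under this pattern, supporting another repository would mean writing one more implementation of the interface while leaving the workflow steps unchanged.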

Technology / Standards used

Standard or specification | Version | Notes
METS                      | 1.5     | Use as packaging format for digital objects and their metadata.
MPEG-DIDL                 | 2       | Will be considered as an alternative to METS, although it may not be supported completely within the project timescale.
SOAP                      | 1.2     | Web service standard.
WSDL                      | 1.1     | Web service standard.
UDDI                      | 3       | Web service discovery standard.
PREMIS                    | 1       | Data dictionary and XML schemas for preservation metadata.
RDF Specifications        | Latest  | W3C Recommendations
OWL                       | 1       | W3C Recommendation

Technologies: web services, Java, jBPM, JBoss, Axis, Castor, Fedora, Shibboleth.

Project staff

Project Manager

Mark Hedges, King’s College London, Centre for e-Research / Arts and Humanities Data Service, Telephone: 020-7848-1970, Fax: 020-7848-1989, Email: firstname.lastname@kcl.ac.uk

Project Team

Andreas Mavrides – Technical Officer (software development)

Malcolm Polfreman – Information Officer (metadata issues)

Gareth Knight – Preservation Officer (digital preservation issues)

  • Last updated on 07/01/09 by Kerry Ann Down