The project will develop an interoperability demonstrator to explore the technical aspects of providing machine services to facilitate metadata discovery and aggregation in different presentational contexts.

metadata+: Machine Services for Metadata Discovery and Aggregation


Start date: 1 October 2005

End date: 31 July 2006

Funding programme: PALS Metadata and Interoperability programme (phase 2)

Project website: http://baillie.lib.ed.ac.uk/metadataplus/

JISC theme(s): Information environment

Background  

Useful metadata often has multiple authors, each contributing to a specific resource description. It is also likely to be aggregated, originating from harvested, referenced and de-referenced data held in disparate sources and online locations. Applying metadata in aggregated form requires the ability to assemble distributed and autonomously managed metadata, often dynamically, for various presentation contexts in digital libraries and e-learning.

The prerequisites for metadata aggregation correspond to two types of interoperability services for metadata discovery: access and data mapping services. The former provides unified machine interfaces for searching and linking metadata held in different types of repositories through the use of standardised access protocols. The latter repurposes metadata of various schemas from disparate sources into formats which are coherent and of consistent quality for specific use contexts. Both services are referred to in this project collectively as 'machine services'. 

This project, metadata+, develops a metadata test bed based on these machine services. The test bed aggregates heterogeneous metadata and presents it in coherent formats relevant to the intended presentation contexts (portals, VLEs, digital libraries, etc.). The metadata sources include publishers and digital libraries providing both bibliographic and user-generated (enrichment) metadata such as annotations and reviews. The test bed focuses on metadata from providers covering informatics subjects.

Aims and Objectives  

The main aim of the project is to develop an interoperability demonstrator exploring the technical aspects of providing a service-oriented infrastructure to facilitate metadata discovery and aggregation in different presentation contexts. Objectives are to:

  • Provide a critical mass of metadata sufficient to demonstrate the test bed in real-life portal and VLE use scenarios
  • Install a service-oriented repository system to host the test bed metadata
  • Build on the project repository and provide machine services for metadata discovery:
    • Enabling distributed search of both the test bed and external metadata using the SRW/U protocol
    • Facilitating context-sensitive and persistent linking via OpenURL
  • Provide dynamic mapping of metadata from various sources for test bed storage and interoperability purposes, i.e. transforming metadata output into formats appropriate for specific demonstration scenarios, e.g. e-learning specifications for VLEs
  • Work with the project partners to facilitate interoperability demonstrations in different presentation contexts, i.e. EGEE Digital Library, NeSC MSc e-Science course (WebCT), Informatics Portal within EUL.
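
As a rough illustration of the two discovery services listed above, the sketch below builds an SRU (the URL form of SRW/U) searchRetrieve request and an OpenURL 1.0 link in key/encoded-value form. The base URLs, function names and sample values are assumptions made for illustration; they are not taken from the metadata+ implementation.

```python
from urllib.parse import urlencode

# Hypothetical endpoints, used only for illustration.
SRU_BASE = "http://example.org/sru"
RESOLVER_BASE = "http://example.org/resolver"


def sru_search_url(cql_query, record_schema="dc", start=1, maximum=10):
    """Build an SRU 1.1 searchRetrieve URL for a CQL query."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": cql_query,
        "startRecord": start,
        "maximumRecords": maximum,
        # Asks the server to return records in this schema.
        "recordSchema": record_schema,
    }
    return SRU_BASE + "?" + urlencode(params)


def openurl_for_book(title, isbn=None, author=None):
    """Build an OpenURL 1.0 (Z39.88-2004) link describing a book."""
    params = {
        "url_ver": "Z39.88-2004",
        "ctx_ver": "Z39.88-2004",
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:book",
        "rft.btitle": title,
    }
    if isbn:
        params["rft.isbn"] = isbn
    if author:
        params["rft.au"] = author
    return RESOLVER_BASE + "?" + urlencode(params)


if __name__ == "__main__":
    print(sru_search_url('dc.title = "grid computing"', record_schema="dc"))
    print(openurl_for_book("Example e-book", isbn="0000000000"))
```

Because both services are addressed through plain URLs, a portal or VLE could in principle consume them without any client library beyond an HTTP client.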

Overall Approach  

Developing a service-oriented repository infrastructure demands extensive resources and time. Because both are limited, this project adopts a strategy of reuse for metadata procurement and infrastructure development alike, to ensure the timely delivery of the project outputs. Off-the-shelf products from JISC-funded and open source projects are considered before any in-house development.

Metadata  

The project aims to create a test bed of 15,000 records, and EUL cataloguing staff will work to ensure successful population of the test bed. The test bed targets metadata of the following scope:

  • Bibliographic metadata (local): from NetLibrary, the NeSC EGEE Digital Library and Edinburgh University Library’s (EUL) catalogue. The metadata is sourced from data harvesting and batch processing e.g. from publisher-provided metadata and copy cataloguing. The metadata will be stored locally in the test bed and made accessible via metadata+ machine services.
  • Bibliographic metadata (brokered): from Safari O’Reilly and other related heterogeneous cross-searchable gateways such as EEVL and ZETOC. Metadata in this category is not stored locally but retrieved remotely via the metadata+ machine (SRW/U and mapping) services, primarily to demonstrate access/data interoperability and metadata aggregation, e.g. aggregating the test bed metadata for an e-book with table of contents (TOC) data retrieved dynamically from ZETOC.
  • Enrichment metadata (local and brokered): exploratory and “value-added” metadata that enhances the usefulness of resources in specific contexts. It is used in conjunction with the bibliographic metadata to provide additional information such as reviews, table of contents (TOC) data and reading lists. The metadata is sourced from other digital library initiatives currently being undertaken by the project partners, as well as from brokered contexts.
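
The aggregation of bibliographic and enrichment metadata described above can be sketched as follows. This is a minimal illustration only: the record layout (plain dictionaries), the field names "about", "type" and "value", and the sample records are all assumptions, not the project's data model.

```python
def aggregate(bibliographic, enrichments):
    """Return a copy of a bibliographic record with matching enrichment attached.

    Enrichment items are matched on the record identifier and grouped by type
    (e.g. "toc", "review"). The input record is left unmodified.
    """
    combined = dict(bibliographic)
    combined["enrichment"] = {}
    for item in enrichments:
        if item["about"] == bibliographic["identifier"]:
            combined["enrichment"].setdefault(item["type"], []).append(item["value"])
    return combined


if __name__ == "__main__":
    # Invented sample data: one local record, two brokered enrichment items.
    book = {"identifier": "urn:isbn:0000000000", "title": ["Example e-book"]}
    extras = [
        {"about": "urn:isbn:0000000000", "type": "toc", "value": "1. Introduction"},
        {"about": "urn:isbn:1111111111", "type": "review", "value": "Unrelated"},
    ]
    print(aggregate(book, extras))
```

Keeping enrichment in its own sub-structure, rather than merging it into the bibliographic fields, leaves the provenance of each statement visible to the consuming portal or VLE.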

Metadata from various sources will be mapped into the Dublin Core (DC) format for test bed storage. In addition, the following mapping scenarios will be developed as part of the machine services, so that clients can specify which of the multiple metadata output formats they require (i.e. using the SRW/U recordSchema parameter):

  • MARC/MODS to DC, GILS/GRS to DC, proprietary metadata to DC
  • DC to MARC/MODS, DC to IEEE Learning Object Metadata (LOM), DC to IMS Resource List Interoperability (RLI) Metadata.
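
A field-level crosswalk of the kind listed above might be sketched as below. The two mapping tables are small illustrative subsets, records are simplified to flat dictionaries of lists, and the registry keyed by recordSchema value is a hypothetical design; none of this is taken from the project's mapping services.

```python
# Illustrative subsets only; real crosswalks cover many more elements.
MODS_TO_DC = {
    "titleInfo.title": "title",
    "name.namePart": "creator",
    "originInfo.publisher": "publisher",
    "originInfo.dateIssued": "date",
    "subject.topic": "subject",
    "abstract": "description",
    "identifier": "identifier",
}

DC_TO_LOM = {
    "title": "general.title",
    "description": "general.description",
    "language": "general.language",
    "subject": "general.keyword",
}


def crosswalk(record, mapping):
    """Rename fields according to mapping; unmapped fields are dropped."""
    out = {}
    for source_field, values in record.items():
        target = mapping.get(source_field)
        if target is not None:
            out.setdefault(target, []).extend(values)
    return out


# Hypothetical registry: recordSchema value -> transformation of a stored DC record.
OUTPUT_SCHEMAS = {
    "dc": lambda dc: dc,
    "lom": lambda dc: crosswalk(dc, DC_TO_LOM),
}


def render(dc_record, record_schema):
    """Return the stored DC record in the schema requested by the client."""
    transform = OUTPUT_SCHEMAS.get(record_schema)
    if transform is None:
        raise ValueError(f"unsupported recordSchema: {record_schema}")
    return transform(dc_record)


if __name__ == "__main__":
    mods = {"titleInfo.title": ["Example title"], "subject.topic": ["grid computing"]}
    stored = crosswalk(mods, MODS_TO_DC)   # inbound mapping, for test bed storage
    print(render(stored, "lom"))           # outbound mapping, chosen per request
```

Storing one DC record and deriving other formats on request keeps the inbound and outbound mappings independent, at the cost of losing any source detail that DC cannot express.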

In addition to metadata mapping, there is a need to consider metadata containers as a means of grouping composite metadata records for digital library and e-learning use scenarios. Mapping among the following metadata containers will be considered:

  • SRW/U - default search results container
  • MARC/MODS, METS
  • IMS Content Packaging, Resource List Interoperability Specification, ADL SCORM.
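
To make the idea of a container concrete, the sketch below unpacks Dublin Core records from an SRW/U search results container, the first container in the list above. The sample response is invented for illustration, and the function name is an assumption; only the SRW 1.1 and Dublin Core namespace URIs are standard.

```python
import xml.etree.ElementTree as ET

SRW_NS = "http://www.loc.gov/zing/srw/"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Invented sample response, shaped like an SRW/U 1.1 searchRetrieveResponse.
SAMPLE = """
<srw:searchRetrieveResponse xmlns:srw="http://www.loc.gov/zing/srw/"
                            xmlns:dc="http://purl.org/dc/elements/1.1/">
  <srw:version>1.1</srw:version>
  <srw:numberOfRecords>1</srw:numberOfRecords>
  <srw:records>
    <srw:record>
      <srw:recordSchema>info:srw/schema/1/dc-v1.1</srw:recordSchema>
      <srw:recordPacking>xml</srw:recordPacking>
      <srw:recordData>
        <srw_dc:dc xmlns:srw_dc="info:srw/schema/1/dc-schema">
          <dc:title>Example title</dc:title>
          <dc:creator>Example, Author</dc:creator>
        </srw_dc:dc>
      </srw:recordData>
      <srw:recordPosition>1</srw:recordPosition>
    </srw:record>
  </srw:records>
</srw:searchRetrieveResponse>
"""


def records_from_srw(xml_text):
    """Extract each record's Dublin Core elements as a dict of lists."""
    root = ET.fromstring(xml_text.strip())
    records = []
    for data in root.iter(f"{{{SRW_NS}}}recordData"):
        record = {}
        for element in data.iter():
            # Keep only elements in the Dublin Core namespace.
            if element.tag.startswith(f"{{{DC_NS}}}"):
                name = element.tag.split("}", 1)[1]
                record.setdefault(name, []).append((element.text or "").strip())
        records.append(record)
    return records


if __name__ == "__main__":
    print(records_from_srw(SAMPLE))
```

Mapping to another container, such as an IMS Content Package, would then amount to re-wrapping these extracted records in that container's own structure.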

Cataloguing staff will perform quality control on the test bed to ensure that records have mapped successfully and that metadata containers achieve the desired integration. 

Test Bed Infrastructure  

The main purpose of metadata+ is to provide an infrastructure of web services for external demonstrator projects to build upon. It addresses both practical and wider project outcomes: providing a shared infrastructure for several existing digital library projects, advancing the metadata interoperability research area, and promoting collaboration and metadata sharing among publishers and libraries. Hence metadata+ is not a fully fledged digital library project. The project website provides technical know-how and exemplars of how the web services can be consumed for metadata discovery and aggregation. It does not provide the user-facing portal and content management functionality typical of a digital library.

The project aims to establish the metadata test bed infrastructure early. The test bed will be available once its metadata can be accessed, initially via the proprietary web services (Fedora) and later through standardised machine services (SRW/U and OpenURL).

The test bed infrastructure will subsequently enable the project team and collaborators from wider communities to explore the issues around presenting content from a range of sources. These issues will be explored during project evaluation and when the project stakeholders start developing demonstration scenarios and consuming the metadata+ machine services. Two areas related to metadata interoperability will be explored:

  • Access and linkage: what types of enrichment metadata can be aggregated with the publisher metadata? How would the metadata be accessed from disparate sources and referred to, both statically and dynamically?
  • Quality and mapping: what is the quality of publisher-supplied metadata? How would the metadata be stored and in what schemas or profiles? What are the demands of metadata mapping for this project demonstration and wider communities?

Key Standards

  • FOXML 1.0 / METS 1.4
  • Dublin Core (DCMI Recommendation), 2005-06-13
  • Learning Object Metadata (LOM), IEEE
  • Metadata Object Description Schema (MODS) 3.1
  • IMS Content Packaging 1.1.4
  • Resource List Interoperability Specifications 1.0
  • Search & Retrieve Web Services (SRW/U) 1.1
  • OpenURL 1.0

Project Outputs

  • Cross-searchable metadata content
  • Test bed infrastructure providing machine services for metadata discovery and aggregation:
    • Service-oriented repository (Fedora)
    • Machine services: SRW/U, OpenURL
    • Metadata mapping services
  • Reports detailing technical implementation and experience gained
  • Dissemination and demonstrations of metadata interoperability in various presentation contexts, i.e. exposing the test bed metadata in different portals and VLEs.

Project Outcomes  

Advancement of the metadata interoperability research area, including evaluation of issues arising from the mapping (e.g. compatibility) of different metadata schemas.

The use of the metadata+ services will increase awareness of the potential implications of service-oriented system architecture in terms of:

  • Demonstrating the benefits and thereby fostering a culture of metadata and service sharing within institutions and among the JISC H/FE communities
  • Advancing the common service agenda of the JISC e-Framework Initiative which incorporates the Information Environment, e-Learning Framework and e-Research.

Increased collaboration between the JISC and publisher communities, as metadata+ aims to provide a proof-of-concept infrastructure for:

  • The academic and research user communities to add value to existing bibliographic metadata from publishers
  • Sharing metadata between JISC communities and publishers
  • Enabling publishers to create new use scenarios and hence markets for their metadata services

The metadata+ test bed provides a practical solution to underpin several current initiatives in education and research at NeSC and the University of Edinburgh:

  • NeSC EGEE and ICEAGE e-Learning pilots, which seek to provide, respectively, training for grid computing and e-Science education within European universities
  • NeSC and the University of Edinburgh MSc e-Science Course
  • Edinburgh University Library Informatics Portal pilot system.

Project Partners

project staff

Contact  

Boon Low
(Project Manager)
National e-Science Centre
B18, Old College, South Bridge
University of Edinburgh
Edinburgh EH8 9YK
Tel: 0131 6514290
Email: boon.low@ed.ac.uk

  • Last updated on 07/01/09 by Kerry Ann Down