Start date: 1 October 2005
End date: 31 July 2006
Funding programme: PALS Metadata and Interoperability programme (phase 2)
Project website: http://baillie.lib.ed.ac.uk/metadataplus/
JISC theme(s): Information environment
Background
Useful metadata often has multiple authors, each contributing to the
description of a specific resource. It is also likely to exist in aggregated
form, originating from data that is harvested, referenced and de-referenced
from disparate sources and online locations. Applying metadata in aggregated
form requires that distributed and autonomously managed metadata can be
assembled, often dynamically, for various presentation contexts in digital
libraries and e-learning.
The prerequisites for metadata aggregation correspond to two types of
interoperability services for metadata discovery: access and
data mapping services. The former provides unified machine
interfaces for searching and linking metadata held in different types of
repositories through the use of standardised access protocols. The latter
repurposes metadata of various schemas from disparate sources into formats
which are coherent and of consistent quality for specific use contexts.
Both services are referred to in this project collectively as 'machine
services'.
This project, metadata+, develops a metadata test bed based on these
machine services. The test bed aggregates heterogeneous metadata and
presents it in coherent formats relevant to the intended presentation
contexts (portals, VLEs, digital libraries, etc.). The metadata sources
include publishers and digital libraries providing both bibliographic and
user-generated (enrichment) metadata such as annotations and reviews. The
test bed focuses on metadata from providers in informatics subjects.
Aims and Objectives
The main aim of the project is to develop an interoperability demonstrator
exploring the technical aspects of providing a service-oriented
infrastructure to facilitate metadata discovery and aggregation in
different presentation contexts. Objectives are to:
- Provide a critical mass of metadata sufficient to demonstrate the test
  bed in real-life portal and VLE use scenarios
- Install a service-oriented repository system to host the test bed
  metadata
- Build on the project repository and provide machine services for metadata
  discovery:
  - Enabling distributed search of both the test bed and external
    metadata using the SRW/U protocol
  - Facilitating context-sensitive and persistent linking via OpenURL
- Provide dynamic mapping of metadata from various sources for test bed
  storage and interoperability purposes, i.e. transforming metadata output
  into formats appropriate for specific demonstration scenarios, e.g.
  e-learning specifications for VLEs
- Work with the project partners to facilitate interoperability
  demonstrations in different presentation contexts, i.e. EGEE Digital
  Library, NeSC MSc e-Science course (WebCT), Informatics Portal within
  EUL.
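As an illustration of the distributed search objective, the sketch below
builds an SRU 1.1 searchRetrieve request URL. It is a minimal example under
stated assumptions, not the project's implementation: the endpoint
`SRU_BASE_URL`, the function name `build_sru_search_url` and the short schema
name `dc` are hypothetical, since this document does not give the actual
service address or the schema names the server advertises.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: the real metadata+ SRU address is not given here.
SRU_BASE_URL = "http://example.org/metadataplus/sru"


def build_sru_search_url(cql_query, record_schema="dc",
                         maximum_records=10, start_record=1):
    """Build an SRU 1.1 searchRetrieve URL for a CQL query.

    record_schema is passed as the SRU recordSchema parameter; "dc" is an
    assumed short name that a server would normally advertise in its
    explain record.
    """
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": cql_query,
        "startRecord": start_record,
        "maximumRecords": maximum_records,
        "recordSchema": record_schema,
    }
    return SRU_BASE_URL + "?" + urlencode(params)


url = build_sru_search_url('dc.title = "grid computing"')
print(url)
```

The same function could be pointed at an external SRU target by changing the
base URL, which is the sense in which a single client can search both the
test bed and external metadata.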
Overall Approach
Developing a service-oriented repository infrastructure demands extensive
resources and time. Because this project has limited resources and a short
duration, it adopts a strategy of reuse for both metadata procurement and
infrastructure development to ensure the timely delivery of the project
outputs. Off-the-shelf products from JISC-funded and open source projects
are considered before any in-house development.
Metadata
The project aims to create a test bed of 15,000 records, and EUL
cataloguing staff will work to ensure successful population of the test
bed. The test bed targets metadata of the following scope:
- Bibliographic metadata (local): from NetLibrary, the
  NeSC EGEE Digital Library and Edinburgh University Library’s (EUL)
  catalogue. The metadata is sourced from data harvesting and batch
  processing, e.g. from publisher-provided metadata and copy cataloguing.
  The metadata will be stored locally in the test bed and made accessible
  via metadata+ machine services.
- Bibliographic metadata (brokered): from Safari O’Reilly
  and other related heterogeneous cross-searchable gateways such as EEVL
  and ZETOC. Metadata in this category is not stored locally, but
  retrieved remotely via metadata+ machine (SRW/U and mapping) services,
  primarily to demonstrate access/data interoperability and metadata
  aggregation, e.g. aggregating e-book metadata from the test bed with
  table of contents (TOC) data retrieved dynamically from ZETOC.
- Enrichment metadata (local and brokered): exploratory
  and “value-added” metadata that enhances the usefulness of resources in
  specific contexts. This would be used in conjunction with the
  bibliographic metadata to provide additional information such as reviews,
  table of contents (TOC) data and reading lists. The metadata is sourced
  from other digital library initiatives currently being undertaken by the
  project partners, as well as from brokered contexts.
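The aggregation described above, where a locally stored bibliographic record
is combined with enrichment metadata from other sources, can be sketched as
follows. This is an illustrative example only: the record layout, the field
names and the function `aggregate_record` are assumptions made for the
sketch, and the sample values are placeholders rather than real test bed
data.

```python
def aggregate_record(bibliographic, enrichments):
    """Return a new record that attaches matching enrichment metadata.

    Enrichment entries are matched to the bibliographic record by a shared
    identifier. The input records are left unmodified.
    """
    combined = dict(bibliographic)
    combined["enrichment"] = [
        {"type": e["type"], "source": e["source"], "content": e["content"]}
        for e in enrichments
        if e["identifier"] == bibliographic["identifier"]
    ]
    return combined


# Placeholder records (hypothetical identifiers and values).
local_record = {
    "identifier": "urn:isbn:0000000000",
    "title": "An Example E-book",
    "creator": "A. Author",
}
enrichment_records = [
    {"identifier": "urn:isbn:0000000000", "type": "toc",
     "source": "brokered", "content": ["Chapter 1", "Chapter 2"]},
    {"identifier": "urn:isbn:1111111111", "type": "review",
     "source": "local", "content": "Review of a different book"},
]

result = aggregate_record(local_record, enrichment_records)
print(result["enrichment"])
```

Matching on a shared identifier is the simplest possible join; a real
aggregation service would also need to handle records that identify the same
resource in different ways.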
Metadata from various sources will be mapped into the Dublin Core (DC)
format for test bed storage. Additionally, the following mapping scenarios
will be developed as part of the machine services, so that clients can
request one of several metadata output formats (i.e. by using the SRW/U
recordSchema parameter):
- MARC/MODS to DC, GILS/GRS to DC, proprietary metadata to DC
- DC to MARC/MODS, DC to IEEE Learning Object Metadata (LOM), DC to IMS
  Resource List Interoperability (RLI) Metadata.
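To make the first mapping scenario concrete, the sketch below maps a few
common MARC fields to unqualified Dublin Core elements using a lookup table.
It is a simplified illustration, not the project's mapping service: the
tuple-based record representation and the names `MARC_TO_DC` and
`marc_to_dc` are assumptions for this example, and only a small subset of
fields is covered.

```python
# (MARC tag, subfield code) -> Dublin Core element name.
# A small illustrative subset of a MARC to DC crosswalk.
MARC_TO_DC = {
    ("245", "a"): "title",
    ("100", "a"): "creator",
    ("650", "a"): "subject",
    ("260", "b"): "publisher",
    ("260", "c"): "date",
    ("020", "a"): "identifier",
}


def marc_to_dc(marc_fields):
    """Map simplified MARC data to a dict of repeatable DC elements.

    marc_fields is a list of (tag, subfield_code, value) tuples, a
    deliberately simplified stand-in for a parsed MARC record.
    Unmapped fields are ignored.
    """
    dc = {}
    for tag, code, value in marc_fields:
        element = MARC_TO_DC.get((tag, code))
        if element is not None:
            dc.setdefault(element, []).append(value.strip())
    return dc


# Placeholder record (hypothetical values).
sample_marc = [
    ("245", "a", "An Example E-book"),
    ("100", "a", "Author, A."),
    ("650", "a", "Computational grids"),
    ("650", "a", "Metadata"),
    ("300", "a", "250 p."),  # not in the table, so it is dropped
]

dc_record = marc_to_dc(sample_marc)
print(dc_record)
```

Because DC elements are repeatable, each element holds a list of values. The
dropped physical description field shows that a mapping into a simpler schema
such as DC generally loses information.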
In addition to metadata mapping, there is a need to consider metadata
containers as a means of grouping composite metadata records for digital
library and e-learning use scenarios. Mapping among the following list of
metadata containers will be considered:
- SRW/U - default search results container
- MARC/MODS, METS
- IMS Content Packaging, Resource List Interoperability Specification, ADL
  SCORM.
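As an example of working with the default search results container, the
sketch below extracts Dublin Core fields from an SRW/U searchRetrieveResponse
using Python's standard library. The sample response is hand-written for
illustration and is not actual test bed output; the function name
`extract_dc_records` is likewise an assumption for this example.

```python
import xml.etree.ElementTree as ET

NS = {"srw": "http://www.loc.gov/zing/srw/"}
DC_PREFIX = "{http://purl.org/dc/elements/1.1/}"

# Hand-written sample response (placeholder values).
SAMPLE_RESPONSE = """\
<srw:searchRetrieveResponse xmlns:srw="http://www.loc.gov/zing/srw/"
    xmlns:srw_dc="info:srw/schema/1/dc-schema"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <srw:version>1.1</srw:version>
  <srw:numberOfRecords>1</srw:numberOfRecords>
  <srw:records>
    <srw:record>
      <srw:recordSchema>info:srw/schema/1/dc-v1.1</srw:recordSchema>
      <srw:recordPacking>xml</srw:recordPacking>
      <srw:recordData>
        <srw_dc:dc>
          <dc:title>An Example E-book</dc:title>
          <dc:creator>A. Author</dc:creator>
        </srw_dc:dc>
      </srw:recordData>
      <srw:recordPosition>1</srw:recordPosition>
    </srw:record>
  </srw:records>
</srw:searchRetrieveResponse>
"""


def extract_dc_records(response_xml):
    """Return one dict of DC element lists per record in the response."""
    root = ET.fromstring(response_xml)
    records = []
    for record in root.findall("./srw:records/srw:record", NS):
        fields = {}
        data = record.find("srw:recordData", NS)
        if data is not None:
            # Collect every element in the DC namespace, whatever wraps it.
            for element in data.iter():
                if element.tag.startswith(DC_PREFIX):
                    name = element.tag[len(DC_PREFIX):]
                    text = (element.text or "").strip()
                    fields.setdefault(name, []).append(text)
        records.append(fields)
    return records


records = extract_dc_records(SAMPLE_RESPONSE)
print(records)
```

The extracted dictionaries could then be repackaged into another container,
which is the kind of container-to-container mapping listed above.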
Cataloguing staff will perform quality control on the test bed to ensure
that records have mapped successfully and that metadata containers achieve
the desired integration.
Test Bed Infrastructure
The main scope of metadata+ is to provide an infrastructure of web services
for external demonstrator projects to build upon. It addresses the
practical and wider project outcomes: providing a shared infrastructure
for several existing digital library projects; advancing the metadata
interoperability research area; and promoting collaboration and metadata
sharing among publishers and libraries. Hence metadata+ is not a
fully fledged digital library project. The project website provides
technical know-how and exemplars of how the web services can be consumed for
metadata discovery and aggregation purposes. It does not provide the
user-facing portal or content management functionality typical of a digital
library.
The project aims to establish the metadata test bed infrastructure early.
The test bed will be considered available once its metadata can be accessed,
initially via the repository's proprietary web services (Fedora) and later
through standardised machine services (SRW/U and OpenURL).
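For the OpenURL side of the standardised machine services, the sketch below
builds an OpenURL 1.0 link for a book in key/encoded-value (KEV) format. It
is a minimal illustration under stated assumptions: the resolver address
`RESOLVER_BASE_URL` and the function name `build_book_openurl` are
hypothetical, and the sample values are placeholders.

```python
from urllib.parse import urlencode

# Hypothetical resolver: the project's actual resolver address is not
# given in this document.
RESOLVER_BASE_URL = "http://example.org/resolver"


def build_book_openurl(title, isbn=None, author=None):
    """Build an OpenURL 1.0 (Z39.88-2004) KEV link describing a book."""
    params = [
        ("url_ver", "Z39.88-2004"),
        ("ctx_ver", "Z39.88-2004"),
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:book"),
        ("rft.genre", "book"),
        ("rft.btitle", title),
    ]
    # Optional descriptors are added only when supplied.
    if isbn:
        params.append(("rft.isbn", isbn))
    if author:
        params.append(("rft.au", author))
    return RESOLVER_BASE_URL + "?" + urlencode(params)


link = build_book_openurl("An Example E-book", isbn="0000000000",
                          author="A. Author")
print(link)
```

Because the link carries a description of the resource rather than a fixed
location, the resolver that receives it can decide where to send each user,
which is what makes the linking context-sensitive.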
The test bed infrastructure will subsequently enable the project team and
collaborators from wider communities to explore the issues around
presenting content from a range of sources. This will be explored during
project evaluation and when the project stakeholders start developing
demonstration scenarios and consuming the metadata+ machine services. Two
areas related to metadata interoperability will be explored:
- Access and linkage: what types of enrichment metadata can be aggregated
  with the publisher metadata? How would the metadata be accessed from
  disparate sources and referred to, both statically and dynamically?
- Quality and mapping: what is the quality of publisher-supplied metadata?
  How would the metadata be stored and in what schemas or profiles? What
  are the demands of metadata mapping for this project demonstration and
  wider communities?
Key Standards
- FOXML 1.0 / METS 1.4
- Dublin Core (DCMI Recommendation), 2005-06-13
- Learning Object Metadata (LOM), IEEE
- Metadata Object Description Schema (MODS) 3.1
- IMS Content Packaging 1.1.4
- Resource List Interoperability Specifications 1.0
- Search & Retrieve Web Services (SRW/U) 1.1
- OpenURL 1.0
Project Outputs
- Cross-searchable metadata content
- Test bed infrastructure providing machine services for metadata discovery
  and aggregation:
  - Service-oriented repository (Fedora)
  - Machine services: SRW/U, OpenURL
  - Metadata mapping services
- Reports detailing technical implementation and experience gained
- Dissemination and demonstrations of metadata interoperability in various
  presentation contexts, i.e. exposing the test bed metadata in different
  portals and VLEs.
Project Outcomes
The project will advance the metadata interoperability research area and
evaluate issues (e.g. compatibility) arising from the mapping of different
metadata schemas.
The use of the metadata+ services will increase awareness of the potential
implications of service-oriented system architecture in terms of:
- Demonstrating the benefits and thereby fostering a culture of metadata
  and service sharing within institutions and among the JISC H/FE
  communities
- Advancing the common service agenda of the JISC e-Framework Initiative
  which incorporates the Information Environment, e-Learning Framework and
  e-Research.
The project will also increase collaboration between the JISC and publisher
communities, as metadata+ aims to provide a proof-of-concept infrastructure
for:
- The academic and research user communities to add value to existing
  bibliographic metadata from publishers
- Sharing metadata between JISC communities and publishers
- Enabling publishers to create new use scenarios and hence markets for
  their metadata services
The metadata+ test bed provides a practical solution to underpin several
current initiatives in education and research at NeSC and the University of
Edinburgh:
- NeSC EGEE and ICEAGE e-Learning pilots that seek to provide training for
  grid computing and e-Science education within European universities
  respectively
- NeSC and the University of Edinburgh MSc e-Science Course
- Edinburgh University Library Informatics Portal pilot system.
Project Partners