Start date: 1 May 2003
End date: 31 August 2003
Funding programme: PALS Metadata and Interoperability programme (phase 1)
JISC theme(s): Information environment
ROSA is carried out by the Nature Publishing Group, the scientific publishing arm of Macmillan Publishers Ltd.
RSS (RDF Site Summary) news feeds are an increasingly popular means of receiving news and other time-sensitive information. RSS feed readers enable users to scan a large number of information sources without having to visit each website in turn. They also enable webmasters to embed automatically updated links to content of interest at other websites. In this way they greatly enhance data interoperability and hence the information dissemination capabilities of the web.
There are, however, barriers to the adoption of RSS:
- It is not usually possible for information providers to set up their own RSS feeds without creating custom code.
- It is hard for non-programmers to merge and filter RSS feeds.
- It is harder still to set up your own RSS aggregation service for a particular area of interest.
ROSA attempts to overcome these barriers by creating an open source, customisable RSS aggregator and filter. Its source code will be released under the GNU General Public License so that others can make use of it and build upon it.
Examples of potential applications for ROSA include:
- Automated Announcements. An academic institution could use ROSA to scrape announcements from its web page (or read them directly from an underlying database) and expose them in an RSS format. This would enable members to be automatically alerted to new items via their desktop RSS readers.
- Automated Citation Alerts. Someone who would like to monitor the results of a citation search at CiteSeer could apply ROSA's web-scraping features to extract the results list and package it as RSS. This would enable them to be alerted to any new entries via their RSS reader. The same principle could be applied to other search engines, such as Google.
- Automated News Feed Aggregation. A research group could use the system to aggregate any number of RSS feeds and other information sources, then filter them using arbitrary criteria to create a custom feed that reflects their area of interest. They could then expose the results as RSS for members of the group to use with their RSS readers. If they choose, they could also allow anyone at all with web access to share their uniquely configured feed.
- Personalised News Feeds. A publisher could provide its website users with the ability to register RSS sources of interest, then merge and filter them in arbitrary ways to create personalised RSS feeds.
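The merge-and-filter step common to the applications above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code produced by the project: it assumes the inputs are RSS 1.0 (RDF Site Summary) documents already fetched as strings, treats two items with the same link as duplicates, and filters on a single keyword. The function names and the two sample feeds (`FEED_A`, `FEED_B`) are hypothetical.

```python
import xml.etree.ElementTree as ET

RSS_NS = "http://purl.org/rss/1.0/"  # namespace used by RSS 1.0 elements

# Hypothetical sample feeds, used only to demonstrate the functions below.
FEED_A = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/">
  <item rdf:about="http://example.org/a/1">
    <title>Genome sequencing update</title>
    <link>http://example.org/a/1</link>
  </item>
  <item rdf:about="http://example.org/a/2">
    <title>Conference dates announced</title>
    <link>http://example.org/a/2</link>
  </item>
</rdf:RDF>"""

FEED_B = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/">
  <item rdf:about="http://example.org/a/1">
    <title>Genome sequencing update</title>
    <link>http://example.org/a/1</link>
  </item>
  <item rdf:about="http://example.org/b/1">
    <title>New browser released</title>
    <link>http://example.org/b/1</link>
    <description>A browser for genome annotations.</description>
  </item>
</rdf:RDF>"""


def parse_items(feed_xml):
    """Return the items of one RSS 1.0 document as a list of dicts."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter(f"{{{RSS_NS}}}item"):
        items.append({
            "title": item.findtext(f"{{{RSS_NS}}}title", default="").strip(),
            "link": item.findtext(f"{{{RSS_NS}}}link", default="").strip(),
            "description": item.findtext(
                f"{{{RSS_NS}}}description", default="").strip(),
        })
    return items


def merge_and_filter(feeds, keyword):
    """Merge several feeds, skip repeated links, keep items matching keyword."""
    seen_links = set()
    merged = []
    needle = keyword.lower()
    for feed_xml in feeds:
        for item in parse_items(feed_xml):
            if item["link"] in seen_links:
                continue  # same link already seen in an earlier feed
            seen_links.add(item["link"])
            text = (item["title"] + " " + item["description"]).lower()
            if needle in text:
                merged.append(item)
    return merged


if __name__ == "__main__":
    for entry in merge_and_filter([FEED_A, FEED_B], "genome"):
        print(entry["link"], entry["title"])
```

A real aggregator would also need to fetch feeds on a schedule, handle other RSS versions and support richer filter criteria than a single keyword.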
Using ROSA in the ways described above would require a modest amount of system administration skill - enough to install and configure the software - but would require essentially no knowledge of RSS. It would therefore allow a much broader range of people and organisations to publish information as RSS.
In addition to RSS, ROSA would also be able to accept information from relational databases, XML and structured or semi-structured text formats (including HTML and SGML). Output formats other than RSS would include HTML, XML, JavaScript and other standard formats.
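As an illustration of non-RSS output, the following sketch renders a list of feed items as an HTML fragment that a webmaster could embed in a page. It is a hypothetical example rather than the project's own output code: it assumes each item is a dictionary with `title` and `link` keys, and the function name `items_to_html` is invented for this example.

```python
from html import escape


def items_to_html(items):
    """Render items (dicts with 'title' and 'link') as an HTML bullet list."""
    lines = ["<ul>"]
    for item in items:
        link = escape(item["link"], quote=True)  # safe inside an attribute
        title = escape(item["title"])            # safe as element text
        lines.append(f'  <li><a href="{link}">{title}</a></li>')
    lines.append("</ul>")
    return "\n".join(lines)


if __name__ == "__main__":
    sample = [{"title": "Plants & fungi", "link": "http://example.org/?a=1&b=2"}]
    print(items_to_html(sample))
```

Escaping titles and links matters here because feed content comes from third-party sources and may contain characters that are significant in HTML.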
The RSS aggregator produced by the project, named "Urchin", can now be downloaded from SourceForge, together with installation and usage instructions: http://urchin.sourceforge.net/ Urchin is released under the GNU General Public License and the GNU Lesser General Public License.