SIMILE: Extending DSpace for metadata and service interoperability using RDF and the Semantic Web
Abstract:
Simile - A word or phrase by which anything is likened,
in one or more of its aspects, to something else
(source: Webster's)
SIMILE - Semantic Interoperability of Metadata and
Information in unLike Environments
SIMILE is a joint project conducted by the W3C, HP, MIT Libraries, and MIT's Lab for Computer Science. It is a research-oriented successor to the earlier work that created the DSpace open-source software (http://www.dspace.org, http://www.sourceforge.net/projects/dspace) and the production service at MIT Libraries (http://libraries.mit.edu/dspace). SIMILE is sponsored by HP through the HP-MIT Research Alliance (http://www.hpl.hp.com/mit/) and is expected to run through 2005. Like DSpace, the software produced by SIMILE will be open source.
In this presentation, we will describe the goals of the SIMILE project, review targeted use cases, and discuss the architecture and implementation approaches that we are exploring.
SIMILE seeks to enhance interoperability among digital assets, schemas, metadata, and services. A key challenge is that the collections that must interoperate are often distributed across individual, community, and institutional stores. We aim to provide end-user services that draw upon the assets, schemas, and metadata held in such stores.
SIMILE will leverage and extend DSpace, enhancing its support for arbitrary schemas and metadata, primarily through the application of RDF and Semantic Web techniques. We aim to support multiple domain-specific schemas and vocabularies, including modelling the relationships among them and searching across multiple domains. The project also aims to implement a digital asset dissemination architecture based upon web standards. The dissemination architecture will provide a mechanism to add useful "views" to a particular digital artifact (i.e. asset, schema, or metadata instance), and to bind those views to consuming services.
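As a rough illustration of what modelling relationships among schemas and searching across domains could look like, the sketch below stores metadata as subject-predicate-object triples and records a schema relationship as one more triple. It is not SIMILE or DSpace code: the asset identifiers, the "art:maker" property, its declared relationship to the Dublin Core "dc:creator" property, and the function names are all assumptions made up for this example.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical metadata drawn from two stores that use different schemas.
TRIPLES = [
    Triple("asset:1", "dc:creator", "Ada Lovelace"),
    Triple("asset:2", "art:maker", "Ada Lovelace"),   # "art:" is an invented vocabulary
    Triple("asset:3", "dc:creator", "Alan Turing"),
]

# Assumed relationship between the two schemas, itself expressed as a triple.
SCHEMA_LINKS = [
    Triple("art:maker", "rdfs:subPropertyOf", "dc:creator"),
]


def narrower_properties(prop, links):
    """Return prop plus every property declared, directly or transitively,
    as a sub-property of it."""
    found = {prop}
    frontier = [prop]
    while frontier:
        current = frontier.pop()
        for link in links:
            is_sub = link.predicate == "rdfs:subPropertyOf" and link.obj == current
            if is_sub and link.subject not in found:
                found.add(link.subject)
                frontier.append(link.subject)
    return found


def search(prop, value, triples, links):
    """Find subjects whose value for prop, or for any sub-property of prop,
    equals value."""
    props = narrower_properties(prop, links)
    return sorted(t.subject for t in triples if t.predicate in props and t.obj == value)


if __name__ == "__main__":
    print(search("dc:creator", "Ada Lovelace", TRIPLES, SCHEMA_LINKS))
```

Because the schema relationship is data rather than code, a query phrased in one vocabulary can reach records described in another, and adding a further domain vocabulary only requires adding more link triples.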
SIMILE will also apply and build upon prior work in MIT's Haystack project (http://haystack.lcs.mit.edu), which seeks to bring modern information management and retrieval technologies to the average computer user, making computers a more compelling place for users to interact with their information. Haystack explores the use of artificial intelligence techniques for analyzing unstructured information and providing more accurate retrieval. It also deals with the modeling, management, and display of user data in more natural and useful ways.
To guide the SIMILE effort, we will focus on well-defined, real-world use cases in the library domain. Since parallel work is underway to deploy DSpace at a number of leading research libraries, we hope that this approach will provide a powerful deployment channel through which the utility and readiness of RDF and Semantic Web tools and techniques can be compellingly demonstrated to a visible, global community.
Mick Bass
Research Manager
Digital Media Systems Program
HP Laboratories
Hewlett-Packard Company
1 Cambridge Center
Cambridge, MA 02142
