Proposed URI Mapping for DSpace Object Model
This page proposes a mapping of objects in the DSpace data model
(aka Object Model) onto Uniform Resource Identifiers (URIs).
The URI scheme was developed specifically for the History system prototype
but it may also find uses in the AIP prototype implementation
and policy expression languages – and in any application that needs a stable,
persistent, URI naming an object in the DSpace object model.
Objectives
The specific goals of this proposal are:
- Conform to existing, applicable standards.
- URIs are meaningful and human-readable.
- Every URI has a one-to-one correspondence with its Object: there is only one valid URI for any given object.
- The URI is resolvable to an object within its realm of uniqueness:
  - URIs of persistent objects such as Items and Collections are unique and resolvable globally.
  - URIs of archive-dependent objects (such as a Bitstream's asset-store location) are only resolvable within the archive.
- Follow the RDF convention of a common URI prefix with identifying elements in the URI "fragment", so RDF viewers display it correctly in condensed form.
Design Choices
We propose to base DSpace Object Model URIs on the Info URI Scheme.
If this proposal is adopted, we will request to register the "dspace"
namespace in the info: scheme.
Why Not URLs?
Why not use URLs in e.g. the http: or ftp: scheme?
Recall that it is a goal for the URIs to correspond 1:1 with objects, and
objects may be duplicated (replicated or custody-transferred) at
other archives, so they would then have multiple URLs.
Also, any identifier based on the domain name of a network host
is not going to be persistent.
Besides, the URI does not have to be globally resolvable. It only has to be
resolvable in the context where the object is available, e.g. within
a DSpace archive that contains the Item.
URI Specification
The general format for a DSpace URI starts with the scheme info: and the
namespace dspace, followed by a path element delimiter "/" (slash). The rest of
the URI depends on the object to be described. We have established rules
for two classes of objects:
1. First-Class DSpace Objects with Persistent Identifiers
Any "first-class" object with a persistent identifier (i.e. a Handle) can
be mapped to a URI based on that Handle, following the pattern:
info:dspace/handle#handle:subfragment
For example, any DSpace Item, Collection, Community, and Site has a globally
unique Handle. An Item with the Handle 1721.1/4325 would have this URI:
info:dspace/handle#1721.1/4325
The subfragment notation is used for the "persistent" identifiers of
Bitstreams. A Bitstream in the preceding example's Item with the
Sequence ID 3 would be identified by this URI:
info:dspace/handle#1721.1/4325:3
NOTE: The "handle" word in the URI path is there to declare that the unique
identifier following is a CNRI Handle. Since DSpace may eventually
implement other persistent identifier schemes, they would each be mapped to
a class of DSpace URI with the name of the type of PID in place of "handle".
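The Handle-based pattern above can be sketched in a few lines of Java. This is only an illustration, not the code from the patches: the class name HandleUriSketch and its two helper methods are hypothetical, and it assumes the Handle and Sequence ID contain no characters that need escaping in a URI fragment.

```java
import java.net.URI;

public class HandleUriSketch {
    // Common prefix for Handle-based DSpace URIs, taken from the pattern above.
    static final String PREFIX = "info:dspace/handle#";

    /** URI for a first-class object (Item, Collection, Community, Site). */
    static URI forHandle(String handle) {
        return URI.create(PREFIX + handle);
    }

    /** URI for a Bitstream: the owning Item's Handle, then ":" and the Sequence ID. */
    static URI forBitstream(String itemHandle, int sequenceId) {
        return URI.create(PREFIX + itemHandle + ":" + sequenceId);
    }

    public static void main(String[] args) {
        System.out.println(forHandle("1721.1/4325"));       // info:dspace/handle#1721.1/4325
        System.out.println(forBitstream("1721.1/4325", 3)); // info:dspace/handle#1721.1/4325:3
    }
}
```

Because everything that identifies the object sits in the fragment, all Handle-based URIs share the same prefix, which is what lets RDF viewers show them in condensed form.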
2. Representing Internal Objects
Some applications need to refer to objects within the archive with a
URI, e.g. because it is needed within an XML or RDF representation and a URL is
inappropriate. The first such case is in the internal AIP METS document,
which needs to identify a file in the Asset Store, bypassing the object model.
The solution is to add a different unique keyword to the DSpace URI prefix
and let the application dictate the rest of the URI. In this case,
we give it the keyword asset, and the format (for file-based
local asset stores, at least) is a path element naming the bit-store type
and an identifying path in the fragment. The general format for a local file is:
info:dspace/asset/storageType#assetStore:assetIdentifier
and for a registered asset it is:
info:dspace/registered/storageType#assetStore:assetIdentifier
In each of the above formats:
- storageType is either file or srb, depending on whether local file storage or the SRB is used.
- assetStore is the asset-store prefix under which this file is stored; there can be several configured. In the file case it would be a local file: URI.
- assetIdentifier is the identifier, unique only within that asset store, of the file.
Here is an example: The URI
info:dspace/asset/file#file%3A%2Fvar%2Flocal%2Fdspace%2Fassetstore%2F:47662570435556328444977060694430104239
is broken down into:
info:dspace/asset -- actual asset in the bitstore (as opposed to "registered").
/file# -- storage type is local file, not SRB.
file%3A%2Fvar%2Flocal%2Fdspace%2Fassetstore%2F: -- in asset store rooted at /var/local/dspace/assetstore/
47662570435556328444977060694430104239 -- actual file name (under prefix directories)
Note how only one URI can be derived from the asset file, and likewise the
URI corresponds directly to exactly one file. Given the asset store prefix
and filename it is completely straightforward to match that to a Bitstream
object.
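A minimal sketch of that round trip, following the general format info:dspace/asset/storageType#assetStore:assetIdentifier. The class AssetUriSketch and its methods are hypothetical and are not the Bitstream methods from the patches. The sketch assumes Java 10 or later, and it uses URLEncoder as a stand-in for whatever escaping the real implementation applies; URLEncoder does form encoding, so it matches the example only for prefixes without spaces or other special characters.

```java
import java.net.URI;
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class AssetUriSketch {
    /** Builds info:dspace/asset/storageType#assetStore:assetIdentifier. */
    static URI build(String storageType, String assetStore, String assetId) {
        // Percent-encode the store prefix so that its own ':' and '/' cannot be
        // confused with the ':' separating it from the asset identifier.
        String encodedStore = URLEncoder.encode(assetStore, StandardCharsets.UTF_8);
        return URI.create("info:dspace/asset/" + storageType + "#" + encodedStore + ":" + assetId);
    }

    /** Splits the fragment back into { assetStore, assetIdentifier }. */
    static String[] split(URI uri) {
        String raw = uri.getRawFragment();
        // The encoded prefix contains no literal ':', so the first one is the separator.
        int sep = raw.indexOf(':');
        String store = URLDecoder.decode(raw.substring(0, sep), StandardCharsets.UTF_8);
        return new String[] { store, raw.substring(sep + 1) };
    }

    public static void main(String[] args) {
        URI uri = build("file", "file:/var/local/dspace/assetstore/",
                "47662570435556328444977060694430104239");
        System.out.println(uri);
        String[] parts = split(uri);
        System.out.println(parts[0] + " + " + parts[1]);
    }
}
```

Splitting on the raw (still encoded) fragment is the important design choice here: decoding first would reintroduce the colon of the file: prefix and make the separator ambiguous.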
API
The following methods are implemented in patches to DSpace version 1.4.1.
Note that there is only a method to obtain the Handle-based persistent URI
of any archival object; there is no corresponding method to resolve it,
because none of the code needed it. The implementation is quite
straightforward, however.
// in RDFRepository
/**
 * Returns the persistent, globally-unique URI of the given object,
 * if possible. If there is no basis for a persistent URI (i.e. if
 * it has no Handle), returns null.
 *
 * @param context - the DSpace context
 * @param dso - any DSpace object.
 * @return new URI or null if one cannot be created.
 */
public static URI makeDSpaceObjectURI(Context context, DSpaceObject dso)
// in Bitstream
/**
 * Returns a URI of the storage occupied by this bitstream in the
 * asset store. It can be resolved by the dereferenceAbsoluteURI()
 * method. Note that the "absolute" URI does not depend on the DSpace
 * object model or RDBMS storage, it only depends on the asset store layer.
 *
 * @return external-based URI to bitstream.
 */
public URI getAbsoluteURI()
// in Bitstream
/**
 * Returns the Bitstream object containing the file in the asset
 * store indicated by the URI, or null if there is none.
 * See getAbsoluteURI().
 *
 * @param context - the context.
 * @param uri a bitstream absolute URI created by getAbsoluteURI()
 * @return a Bitstream object or null.
 */
public static Bitstream dereferenceAbsoluteURI(Context context, URI uri)
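Since no method resolves a Handle-based URI, the following sketch shows what the first half of such a resolver might look like: taking the URI apart into a Handle and an optional Bitstream Sequence ID. The class HandleUriParseSketch is hypothetical and is not part of the patches. It assumes the Handle itself contains no ":" and leaves out the actual lookup of the DSpace object from the Handle.

```java
import java.net.URI;

public class HandleUriParseSketch {
    /**
     * Returns { handle, sequenceId } for a Handle-based DSpace URI.
     * sequenceId is null when the URI names an Item, Collection,
     * Community, or Site rather than a Bitstream.
     */
    static String[] parse(URI uri) {
        String fragment = uri.getFragment();
        if (!"info".equals(uri.getScheme())
                || !"dspace/handle".equals(uri.getSchemeSpecificPart())
                || fragment == null) {
            throw new IllegalArgumentException("Not a Handle-based DSpace URI: " + uri);
        }
        // Assumption: the Handle has no ':' of its own, so the first ':'
        // (if any) starts the subfragment holding the Sequence ID.
        int sep = fragment.indexOf(':');
        if (sep < 0) {
            return new String[] { fragment, null };
        }
        return new String[] { fragment.substring(0, sep), fragment.substring(sep + 1) };
    }

    public static void main(String[] args) {
        String[] item = parse(URI.create("info:dspace/handle#1721.1/4325"));
        String[] bitstream = parse(URI.create("info:dspace/handle#1721.1/4325:3"));
        System.out.println(item[0] + " / " + item[1]);
        System.out.println(bitstream[0] + " / " + bitstream[1]);
    }
}
```

A full resolver would then look up the object for the Handle and, when a Sequence ID is present, search that Item's Bitstreams for the matching one.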