h1.  DSpace REST API and Web Application 



h1. Details

Project Title: DSpace REST Webapp

Student: Bojan Suzic, University of Technology Graz

Mentor: Aaron Zeckoski

Contacting author: bojan AT trust - b . com using subject line \[DSpace\]

SCM Location for Project:  [http://scm.dspace.org/svn/repo/modules/rest]

h1. Project Summary

A RESTful service is to be implemented as a DSpace add-on, enabling guest and authorized users to browse and retrieve DSpace collections and related data.


The principles to follow:

* Stateless communication

* Separation of concerns: methods (GET/PUT/DELETE) are used according to their designation

* JSON and XML will both be supported as output formats

* Configuration interface for administrator to control aspects of functionality

* Logging of requests will be handled via the framework

* The API will be versioned, enabling easier upgrades in the future

* The (return) status codes should be handled according to the HTTP spec

* Resource retrieval (books...) should be possible (to decide later: binary encoding or forwarding approach)

* BasicAuth will be supported for authentication; X509 support for user logging would be a good idea



h1. Endpoint (API methods) descriptions

Available endpoints are described here. Please note that this list is *not final or complete*. Suggestions and comments are welcomed.

In most cases the required parameters are those found in the path of the request URL (except where noted). Optional parameters are found in the query part of the URL. No optional parameters appear in the URL path, except the one defining the format (see below).

Optional parameters are shown with their default value in the API definitions below. For example, {{?thing=true}} indicates that if the thing parameter is not included it defaults to true. Parameters shown without an explicit value have no default: they are not required, but supplying one usually narrows the results.

The optional version parameter can be added to the query when necessary, like this: {{?version=\{version\}}}. If no version is specified, the current version is returned or used. The version parameter is not yet supported.
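The split between path and query parameters can be illustrated from the client side. The following is only a sketch: the helper name {{build_url}} and the base address are assumptions made for the example, not part of the API.

```python
from urllib.parse import urlencode


def build_url(base, path_parts, optional=None, fmt=None):
    """Assemble a request URL: required values go into the path,
    optional values into the query string, and the format (if any)
    is appended to the path as a suffix such as .json or .xml."""
    path = "/".join(str(part).strip("/") for part in path_parts)
    if fmt:
        path += "." + fmt
    url = base.rstrip("/") + "/" + path
    # Optional parameters that were not supplied (None) are left out,
    # so the server can fall back to its documented defaults.
    query = {k: v for k, v in (optional or {}).items() if v is not None}
    if query:
        url += "?" + urlencode(query)
    return url


# Hypothetical base address; the path /items/{id} is taken from the tables below.
print(build_url("http://localhost:8080/rest", ["items", 12],
                {"idOnly": "true", "version": None}, fmt="json"))
```

Leaving out a parameter entirely, rather than sending an empty value, keeps the request within the rule that omitted optional parameters take their default.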


*Universal parameters*

These parameters are valid for each call and as such are not explicitly mentioned in the specification tables.

The *format* is determined by default from the ACCEPT header (e.g. {{setRequestHeader}} in JS), but it may optionally be specified in the URL as a suffix like {{.json\|.xml}} (e.g. {{/thing/item.json}}). JSON is used by default if no ACCEPT header is present and no format suffix is given. The ACCEPT header overrides the format suffix. If the ACCEPT header requests an unsupported format, the status code {{415 Unsupported Media Type}} shall be returned.
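The precedence rules for the format could be sketched as follows. This is an illustration under assumptions, not the actual implementation: the function name, the exception class and the media-type table are made up for the example, and wildcard media types and q-value ordering are not handled.

```python
# Assumed mapping of media types to the two supported formats.
SUPPORTED = {
    "application/json": "json",
    "application/xml": "xml",
    "text/xml": "xml",
}


class UnsupportedMediaType(Exception):
    """Signals that the response should be 415 Unsupported Media Type."""
    status = 415


def resolve_format(path, accept=None):
    """Return 'json' or 'xml' for a request path and an optional Accept value."""
    if accept:
        # An Accept header is present, so it wins over any suffix in the path.
        for entry in accept.split(","):
            media_type = entry.split(";")[0].strip().lower()
            if media_type in SUPPORTED:
                return SUPPORTED[media_type]
        raise UnsupportedMediaType(accept)
    # No Accept header: use the suffix if there is one, otherwise JSON.
    for suffix in ("json", "xml"):
        if path.endswith("." + suffix):
            return suffix
    return "json"
```

With these rules, {{resolve_format("/thing/item.xml", "application/json")}} yields {{json}}, because the header overrides the suffix.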


*Authentication* is based on the provided parameters and supports cookies and basic auth. If the parameters are omitted and no cookie is present, the guest (read-only/public) user is used; otherwise the user is authenticated according to the provided parameters ({{?user=\{username\}&pass=\{password\}}}) or the cookie, in this order. The parameters can also be sent in the request header, in which case the header takes precedence over the other methods. Support for X509 certificates could be added later.

In all cases, if the requested resource is out of reach of the user, the errors {{401 Unauthorized}} (not logged in) OR {{403 Forbidden}} (logged in but not allowed) are used accordingly.
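The order in which credentials are looked up, and the choice between the two error codes, might look like the sketch below. The header field names (assumed here to mirror the query parameters {{user}} and {{pass}}) and the cookie name {{session}} are hypothetical; the specification does not fix them.

```python
GUEST = ("guest", None)


def resolve_credentials(headers, query, cookies):
    """Pick the credential source: header first, then query
    parameters, then cookie; fall back to the guest user."""
    for source_name, source in (("header", headers), ("query", query)):
        if "user" in source and "pass" in source:
            return (source_name, (source["user"], source["pass"]))
    if "session" in cookies:  # hypothetical cookie name
        return ("cookie", cookies["session"])
    return GUEST


def access_status(authenticated, allowed):
    """Map the outcome of an access check to an HTTP status code."""
    if allowed:
        return 200
    # Logged in but not allowed -> 403; not logged in -> 401.
    return 403 if authenticated else 401
```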


For the *searching/sorting* methods, we will follow OpenSearch guidelines and RoR conventions - where it is applicable. The following list with supported requests is maintained and updated when needed. These will be valid for each endpoint which uses GET unless otherwise noted in the API.

When searching for entities in a list, the following parameters are handled specially in the system (note that all the RoR conventions are followed for sorting/paging):

* {{\_start=\{number\}}}: the position of the first entity to return (0 is the first, default), e.g. {{\_start=5}}

* {{\_page=\{number\}}}: the page of data to display (0 is first, default), e.g. {{\_page=2}}

* {{\_perpage=\{number\}}}: the number of entities to return for the page (0 means all, default), e.g. {{\_perpage=20}}

* {{\_limit=\{number\}}}: the maximum number of entities to return (0 means all, default), e.g. {{\_limit=50}}

* {{\_order=\{string\}}}: the sort order to return entities in (default is ascending), should be a comma separated list of entity field names which optionally include a suffix to determine order, suffix can be {{\_reverse}} or {{\_desc}} for descending order OR '' (blank) or {{\_asc}} for ascending order, e.g. {{\_order=name}} OR {{\_order=name_reverse}} OR {{\_order=name,email_desc,firstname_asc,lastname_reverse}}

* {{\_sort=\{string\}}}: same as order
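The suffix rules for {{\_order}} can be made concrete with a small parser. This is a sketch only; the function names are invented for the example and entities are represented as plain dictionaries.

```python
def parse_order(value):
    """Split an _order/_sort value into (field, descending) pairs."""
    result = []
    for token in value.split(","):
        token = token.strip()
        if not token:
            continue
        descending = False  # ascending is the default
        for suffix, is_desc in (("_reverse", True), ("_desc", True), ("_asc", False)):
            if token.endswith(suffix):
                token = token[: -len(suffix)]
                descending = is_desc
                break
        result.append((token, descending))
    return result


def sort_entities(entities, order):
    """Sort a list of dict entities by the parsed order."""
    ordered = list(entities)
    # Apply the keys from last to first; Python's sort is stable, so
    # the first field in the list ends up as the primary sort key.
    for field, descending in reversed(parse_order(order)):
        ordered.sort(key=lambda entity: entity[field], reverse=descending)
    return ordered


print(parse_order("name,email_desc,firstname_asc,lastname_reverse"))
```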


Searching and sorting can produce two additional status codes: {{204: No content}} when no entities satisfy the criteria, and {{400: Bad request}} when the query is malformed or incompatible parameters are used.

The search criteria are applied only to items returned with full info. Items returned as ids only ({{idOnly=true}}) are not passed to the sorting/filtering procedures.
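How the paging parameters combine is not fixed by this page, so the following sketch rests on explicit assumptions: {{\_page}} is only meaningful together with a non-zero {{\_perpage}}, combining {{\_start}} with {{\_perpage}} counts as an incompatible request, and {{\_limit}} caps whatever window results. The function name is hypothetical.

```python
def page_window(total, start=0, page=0, perpage=0, limit=0):
    """Return (status, first, last), where [first, last) is the slice
    of the full result list to send back."""
    if min(start, page, perpage, limit) < 0:
        return (400, 0, 0)  # malformed query
    if perpage > 0 and start > 0:
        return (400, 0, 0)  # assumed incompatible: two ways to set the offset
    first = page * perpage if perpage > 0 else start
    count = perpage if perpage > 0 else total  # 0 means "all"
    if limit > 0:
        count = min(count, limit)
    last = min(first + count, total)
    if first >= last:
        return (204, 0, 0)  # nothing satisfies the criteria
    return (200, first, last)


print(page_window(100, page=2, perpage=20))
```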



*Information usually returned*

In most cases an entity can be returned in one of two forms:

* the first, selected with {{idOnly=true}}, returns only the ids of the entities satisfying the request

* the second, used by default, returns all available info


In the second case, info about related entities is included. For instance, when a user browses a collection, the response also contains information about the communities the collection belongs to, the items in it, and so on. This principle applies through several layers, for instance Collection \-> Item \-> Bitstream, so a single response carries all of this information.

An exception applies where chaining is possible. Beyond a certain depth, only the ids of sub-entities or related entities are sent instead of their full information. Example: Collection \-> Item \-> Bundle \-> Bitstream \-> BundleId. Because Bitstream and Bundle reference and include each other, full inclusion would cause unlimited chaining. For this reason a mechanism is implemented that includes only the ids of entities beyond that depth. For more details please take a look at the example and code.
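The idea of cutting off mutually referencing entities can be sketched with a depth limit. This is not the project's code: the dictionary layout, the field names and the {{max_depth}} parameter are assumptions chosen to keep the example short.

```python
def serialize(entity, max_depth, depth=0):
    """Return a nested dict for an entity; entities at or beyond
    max_depth are reduced to their id only."""
    if depth >= max_depth:
        return {"id": entity["id"]}
    out = {"id": entity["id"], "name": entity["name"]}
    out["related"] = [serialize(child, max_depth, depth + 1)
                      for child in entity.get("related", [])]
    return out


# A bundle and a bitstream that reference each other.
bundle = {"id": 7, "name": "ORIGINAL", "related": []}
bitstream = {"id": 42, "name": "thesis.pdf", "related": [bundle]}
bundle["related"].append(bitstream)

# Without the depth limit this recursion would never end.
result = serialize(bundle, max_depth=2)
print(result)
```

In this sketch the bundle and the bitstream are written out in full once, and the back-reference to the bundle appears only as its id.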


h3. Browsing methods
----

| Name and description | Value and notes |
| Base URI: |
| Description: | Returns a list of all communities on the system or return just top level communities. |
| HTTP method: | {{GET}} |
| Optional parameters: |
| Sorting fields supported: | {{id, name, countitems}}: sorting by id, community name and item count |
| Response formats: | {{json}}, {{xml}} |
| Status codes | {{200: OK\\
 204: no content\\
 400: bad request\\
 500: internal server error}} |
| Response details |


| Name and description | Value and notes |
| Base URI: |
| Description: | Returns a list of all parent communities of the {{id}} community. |
| HTTP method: | {{GET}} |
| Optional parameters: |
| Sorting fields supported: | {{id, name, countitems}}: sorting by id, community name and item count |
| Response formats: | {{json}}, {{xml}} |
| Status codes | {{200: OK\\
 204: no content\\
 400: bad request\\
 500: internal server error}} |
| Response details |


| Name and description | Value and notes |
| Base URI: |
| Description: | Returns a list of immediate sub-communities (children) of the {{id}} community. |
| HTTP method: | {{GET}} |
| Optional parameters: |
| Sorting fields supported: | {{id, name, countitems}}: sorting by id, community name and item count |
| Response formats: | {{json}}, {{xml}} |
| Status codes | {{200: OK\\
 204: no content\\
 400: bad request\\
 500: internal server error}} |
| Response details |


| Name and description | Value and notes |
| Base URI: |
| Description: | Returns a list of collections in the {{id}} community |
| HTTP method: | {{GET}} |
| Optional parameters: |
| Sorting fields supported: | {{id, name, countitems}}: sorting by id, collection name and item count |
| Response formats: | {{json}}, {{xml}} |
| Status codes | {{200: OK\\
 204: no content\\
 400: bad request\\
 500: internal server error}} |
| Response details |


| Name and description | Value and notes |
| Base URI: |
| Description: | Returns a list of recent submissions to a community |
| HTTP method: | {{GET}} |
| Optional parameters: |
| Sorting fields supported: | {{id, name, lastmodified, submitter}}: sorting by id, name(title), last modified date and submitter(name) of item |
| Response formats: | {{json}}, {{xml}} |
| Status codes | {{200: OK\\
 204: no content\\
 400: bad request\\
 500: internal server error}} |
| Response details |




| Name and description | Value and notes |
| Base URI: |
| Description: | Returns a list of all collections in the system |
| HTTP method: | {{GET}} |
| Optional parameters: |
| Sorting fields supported: | {{id, name, countitems}}: sorting by id, collection name and item count |
| Response formats: | {{json}}, {{xml}} |
| Status codes | {{200: OK\\
 204: no content\\
 400: bad request\\
 500: internal server error}} |
| Response details |


| Name and description | Value and notes |
| Base URI: |
| Description: | Returns a list of all communities a collection with {{id}} belongs to |
| HTTP method: | {{GET}} |
| Optional parameters: |
| Sorting fields supported: | {{id, name, countitems}}: sorting by id, community name and item count |
| Response formats: | {{json}}, {{xml}} |
| Status codes | {{200: OK\\
 204: no content\\
 400: bad request\\
 500: internal server error}} |
| Response details |



| Name and description | Value and notes |
| Base URI: |
| Description: | Returns a list of all items from the collection {{id}} |
| HTTP method: | {{GET}} |
| Optional parameters: |
| Sorting fields supported: | {{id, name, lastmodified, submitter}}: sorting by id, name, lastmodified date and submitter of item |
| Response formats: | {{json}}, {{xml}} |
| Status codes | {{200: OK\\
 204: no content\\
 400: bad request\\
 500: internal server error}} |
| Response details |

h3. Content searching
----

| Name and description | Value and notes |
| Base URI: |
| Description: | Returns a list of all objects found by searching criteria |
| HTTP method: | {{GET}} |
| Optional parameters: |
| Sorting fields supported: | {{id, name, lastmodified, submitter}}: sorting by id, name, last modified date or submitter of item |
| Sorting/ordering modifiers: | {{title, issueDate, author, subject, submitter}} |
| Response formats: | {{json}}, {{xml}} |
| Status codes | {{200: OK\\
 204: no content\\
 400: bad request\\
 500: internal server error}} |
| Response details |


| Name and description | Value and notes |
| Base URI: |
| Description: | Returns a list of all objects that have been created, modified or withdrawn within specified time range |
| HTTP method: | {{GET}} |
| Optional parameters: |
| Sorting/ordering modifiers: | {{id, name, lastmodified, submitter}}: information on item returned |
| Response formats: | {{json}}, {{xml}} |
| Status codes | {{200: OK\\
 204: no content\\
 400: bad request\\
 500: internal server error}} |
| Response details |

h3. Item status/info and retrieval
----

| Name and description | Value and notes |
| Base URI: | {{/items/\{id\}}} |
| Description: | Returns detailed information about an item |
| HTTP method: | {{GET}} |
| Required parameters: | {{\{id\}}}: item id |
| Sorting fields supported: | {{id, name, lastmodified, submitter}}: sorting by id, name, last modified date or submitter of item |
| Response formats: | {{json}}, {{xml}} |
| Status codes | {{200: OK\\
 204: no content\\
 400: bad request\\
 500: internal server error}} |
| Response details | Contains information about an item, including resource name, metadata, owning collection, collections stored in, communities stored in, bundle ids, last modified date, archival/withdrawn status and submitter of the item |



| Name and description | Value and notes |
| Base URI: | {{/items/\{id\}/permissions}} |
| Description: | Returns status of user permissions on this item |
| HTTP method: | {{GET}} |
| Required parameters: | {{\{id\}}}: item id |
| Response formats: | {{json}}, {{xml}} |
| Status codes | {{200: OK\\
 400: bad request\\
 500: internal server error}} |
| Response details | Boolean value stating whether the user can edit the listed item |




| Name and description | Value and notes |
| Base URI: |
| Description: | Returns communities this item is part of |
| HTTP method: | {{GET}} |
| Required parameters: |
| Sorting fields supported: | {{id, name, countitems}}: community properties used for sorting |
| Response formats: | {{json}}, {{xml}} |
| Status codes | {{200: OK\\
 400: bad request\\
 500: internal server error}} |
| Response details | Communities listed |




| Name and description | Value and notes |
| Base URI: |
| Description: | Returns collections this item is part of |
| HTTP method: | {{GET}} |
| Required parameters: |
| Sorting fields supported: | {{id, name, countitems}}: collection parameters |
| Response formats: | {{json}}, {{xml}} |
| Status codes | {{200: OK\\
 400: bad request\\
 500: internal server error}} |
| Response details | Collections listed |





| Name and description | Value and notes |
| Base URI: | {{/bitstream/\{id\}}} |
| Description: | Returns bitstream object - usually the library item file |
| HTTP method: | {{GET}} |
| Required parameters: | {{\{id\}}}: bitstream item id |
| Response formats: | {{json}}, {{xml}} (not yet complete) |
| Status codes | {{200: OK\\
 404: Not found\\
 400: bad request\\
 401: Unauthorized\\
 403: Forbidden\\
 500: internal server error}} |
| Response details | Includes all information about the referenced bitstream, including file name, licence, corresponding item etc. Information can only be requested for particular bitstreams. When the request is made without parameters/references, a blank list is returned (there is no list of all bitstreams in the system available). |





| Name and description | Value and notes |
| Base URI: | {{/bitstream/\{id\}/receive}} |
| Description: | Returns the full content of the bitstream |
| HTTP method: | {{GET}} |
| Required parameters: | {{\{id\}}}: bitstream item id |
| Response formats: | {{binary}} |
| Status codes | {{200: OK\\
 400: bad request\\
 401: Unauthorized\\
 403: Forbidden\\
 500: internal server error}} |
| Response details | Receive full bitstream |


h3. User-oriented functions
----



| Name and description | Value and notes |
| Base URI: |
| Description: | Returns list containing id, name and email of persons (optionally matching a query) |
| HTTP method: | {{GET}} |
| Optional parameters: |
| Sorting fields supported: | {{id, name, lastname, fullname, language}}: sorting properties of user(submitter) supported |
| Response formats: | {{json}}, {{xml}} |
| Status codes | {{200: OK\\
 204: no content\\
 400: bad request\\
 500: internal server error}} |
| Response details | List with information on particular users. Only identifiers are sent if idOnly is true. |


h3. Statistical info
----



| Name and description | Value and notes |
| Base URI: | {{/stats}} |
| Description: | Returns general statistics |
| HTTP method: | {{GET}} |
| Response formats: | {{json}}, {{xml}} |
| Status codes | {{200: OK\\
 400: bad request (if there is no stats package available)\\
 500: internal server error}} |
| Response details | Returns a cumulative list of the statistics data currently available for the system |


h1. Comments

h2. Concerning DSpace Data Model exposure in REST Paths

I am concerned about the adoption of the 1.x DSpace data model, with the entity resource "type" being part of the URL path; in 2.0 the model is not hardcoded in this manner. We are trying to move away from this convention for the content and to present a generic mechanism for traversing and manipulating the graph/hierarchy of the resources (entities).

I think we should treat them as such and think about how such resource/entity graphs are traversed using rest

Rather than: /communities/\{id\}/parents?idOnly=false&immediateOnly=true

We have something more like

{panel}
 /resource/\{id\}/related?relation=ds:isPartOfCommunity&idOnly=false&immediateOnly=true
{panel}

Rather than: /communities/\{id\}/children?idOnly=false&immediateOnly=true

We have

{panel}
 /resource/\{id\}/related?relation=ds:hasCommunityPart&idOnly=false&immediateOnly=true
{panel}

I think we need to make sure the REST interfaces clearly map to 2.0 Services and the actions that can be performed on them. So harvest, stats and users make sense to me. But, Community, Collection, Item and Bitstream do not and we should be consolidating these under some service path like "content/" or "resource/" or the like.

\-\-[Mark Diggory|~mdiggory:MarkDiggory] 16:04, 12 July 2009 (EDT)

h2. See Fedora REST API for reference

Please see for reference:

[Fedora REST|http://www.fedora-commons.org/documentation/3.0/userdocs/server/webservices/rest/index.html]
[Fedora API-M|http://www.fedora-commons.org/documentation/3.0/userdocs/server/webservices/apim/index.html]
[Fedora API-A|http://www.fedora-commons.org/documentation/3.0/userdocs/server/webservices/apia/index.html]

for some examples of methods appropriate for the entity relationship model we are considering for 2.0

h3. addRelationship

Creates a new relationship in the object. Adds the specified relationship to the object's RELS-EXT datastream. If the Resource Index is enabled, the relationship will be added to the Resource Index.

The DSpace+2.0 proposed mapping to Fedora places RDF references for ds:hasCollection/ds:isPartOfCollection, ds:hasCommunity/ds:isPartOfCommunity directly into the RELS-EXT as relationships between Fedora representations of DSpace objects.

URL Syntax
{panel}
 /objects/\{pid\} ? [relationship] [object] [isLiteral] [datatype]
{panel}

Parameters:

{panel}
    * pid: The PID of the object.
    * relationship: The predicate.
    * object: The object.
    * isLiteral: A boolean value indicating whether the object is a literal.
    * datatype: The datatype of the literal. Optional.
{panel}

For DSpaceObjects:

(a) Creates either a new Top Level Community, SubCommunity, Collection, Item, Bundle or Bitstream as defined in the DSpace Data Model, the context of which is the current \{pid\} entity

Get next pid, /objects/nextPID ? [type]

{code} 
 /objects/nextPID?type="http://purl.org/dspace/model/Bitstream"

 /objects/\{bundlePid\}?relationship="http://purl.org/dspace/model/hasBitstream"&object=\{bitstreamPid\}

 /objects/\{bitstreamPid\} ? ... see [http://www.fedora-commons.org/documentation/3.0/userdocs/server/webservices/rest/index.html#addDatastream addDatastream]
{code}


(b) Creates metadata properties attached to any of the above DSpace Objects.

{panel}
 /objects/\{pid\} ? relationship=http://purl.org/elements/1.1/title&object="My Title"&isLiteral=true
{panel}

h2. addDatastream

URL Syntax
{panel}
    /objects/\{pid\}/datastreams/\{dsID\} ? [controlGroup] [dsLocation] [altIDs] [dsLabel] [versionable] [dsState] [formatURI] [checksumType] [checksum] [logMessage]
{panel}

\-\-[Mark Diggory|~mdiggory:MarkDiggory] 15:58, 12 July 2009 (EDT)

h1. References

[Main+Page]

[Microformats conventions|http://microformats.org/wiki/rest/urls]

[RFC2616 Method Definitions|http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html]

[RFC2616 Status Code Definitions|http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html]

[Fedora API-M|http://www.fedora-commons.org/documentation/3.0/userdocs/server/webservices/apim/index.html]