This documentation relates to an old version of DSpace, version 3.x.
This DSpace release is end-of-life and is no longer supported.
The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) is a low-barrier mechanism for repository interoperability. Data Providers are repositories that expose structured metadata via OAI-PMH. Service Providers then make OAI-PMH service requests to harvest that metadata. OAI-PMH is a set of six verbs or services that are invoked within HTTP.
OAI 2.0 is a Java implementation of an OAI-PMH data provider interface developed by Lyncode that uses XOAI, an OAI-PMH Java library.
Projects such as OpenAIRE and Driver have specific metadata requirements for content published through the OAI-PMH interface. Because the OAI-PMH protocol does not define any framework for these specifics, OAI 2.0 can expose more than one instance of an OAI interface (a feature provided by the XOAI core library), so one can define an interface for each project. That is its main purpose, although OAI 2.0 allows much more than that.
To understand how XOAI works, one must understand the concepts of Filter, Transformer and Context. A Filter selects information from the data source. A Transformer makes changes to the metadata before it is shown in the OAI interface. XOAI also adds a new concept to the basic OAI-PMH specification, the context. A context is identified in the URL:
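As a rough illustration of how the six verbs are invoked over HTTP, the sketch below builds request URLs. The verb names come from the OAI-PMH 2.0 specification; the base URL and the `build_request` helper are assumptions made for this example and are not part of DSpace or XOAI.

```python
from urllib.parse import urlencode

# The six verbs defined by the OAI-PMH 2.0 specification.
OAI_VERBS = (
    "Identify",
    "ListMetadataFormats",
    "ListSets",
    "ListIdentifiers",
    "ListRecords",
    "GetRecord",
)


def build_request(base_url, verb, **arguments):
    """Return the URL for an OAI-PMH request (hypothetical helper)."""
    if verb not in OAI_VERBS:
        raise ValueError("unknown OAI-PMH verb: " + verb)
    # The verb and its arguments travel as ordinary query parameters.
    query = urlencode({"verb": verb, **arguments})
    return base_url + "?" + query


# Assumed base URL; replace it with the address of your own repository.
BASE_URL = "http://www.example.com/oai/request"

print(build_request(BASE_URL, "Identify"))
print(build_request(BASE_URL, "ListRecords", metadataPrefix="oai_dc"))
```

A harvester would issue an HTTP GET (or POST) for each URL and parse the XML response.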
http://www.example.com/oai/<context>
Contexts can be seen as distinct virtual OAI interfaces, so one could have, for example:
http://www.example.com/oai/request
http://www.example.com/oai/driver
http://www.example.com/oai/openaire
With these ingredients it is possible to build a robust solution that fulfills all the requirements of Driver and OpenAIRE, as well as the specific requirements of other projects. As shown in Figure 1, with contexts one can select a subset of all available items in the data source. So when entering the OpenAIRE context, all OAI-PMH requests are restricted to that subset of items.
At this stage, contexts can be seen as sets (which are also defined in the basic OAI-PMH protocol). The magic of XOAI happens when one needs a specific metadata format to be shown in each context. The metadata requirements of Driver differ slightly from those of OpenAIRE, so each context must define its own specific transformer. Contexts can therefore be seen as an extension of the concept of sets.
To implement an OAI interface with the XOAI core library, one just needs to implement the data source interface.
OAI 2.0 is a separate webapp that completely replaces the old "oai" webapp. OAI 2.0 has a configurable data source. By default it does not query the DSpace SQL database at the time of the OAI-PMH request; instead, it keeps the required metadata in its Solr index (currently in a separate "oai" Solr core) and serves it from there. It is also possible to configure OAI 2.0 to use only the database for querying. Furthermore, it caches requests, so repeating the same query is very fast, and it also compiles DSpace items to make uncached responses much faster.
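The relationship between a context, its filter and its transformer can be pictured with a small model. This is only a conceptual sketch: the `Context` class, the item fields and the filtering and transformation rules are all invented for illustration. They are neither the XOAI API nor the actual Driver or OpenAIRE requirements.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Item = Dict[str, str]  # hypothetical item: a flat map of metadata fields


@dataclass
class Context:
    """Illustrative model only; this is not the XOAI API."""

    name: str
    item_filter: Callable[[Item], bool]   # selects the subset of items
    transformer: Callable[[Item], Item]   # adjusts metadata before output

    def list_records(self, items: List[Item]) -> List[Item]:
        # Filter first (like a set), then transform what is left.
        return [self.transformer(i) for i in items if self.item_filter(i)]


# Made-up data source with two items.
ITEMS = [
    {"handle": "123456789/1", "title": "Funded paper", "project": "EXAMPLE-1"},
    {"handle": "123456789/2", "title": "Other paper", "project": ""},
]


def with_relation(item: Item) -> Item:
    # Invented rule: expose the project as a "relation" field.
    out = dict(item)
    out["relation"] = "info:example/" + out.pop("project")
    return out


# The default context exposes everything unchanged.
request = Context("request", lambda item: True, lambda item: dict(item))
# A project-specific context restricts the items and reshapes the metadata.
openaire = Context("openaire", lambda item: bool(item["project"]), with_relation)

print(len(request.list_records(ITEMS)))
print(openaire.list_records(ITEMS))
```

The point of the model is that a set only restricts which items are visible, whereas a context can also change how each visible item is presented.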
Details about OAI 2.0 internals could be found here.
OAI 2.0 uses the Solr data source by default.
The Solr index can be updated at your convenience, depending on how fresh you need the information to be. Typically, the administrator sets up a nightly cron job to update the Solr index from the SQL database.
OAI manager is a utility that allows one to perform administrative operations on OAI.
Syntax
bin/dspace oai <action> [parameters]
Actions
Parameters
To refresh the OAI Solr index, run the [dspace]/bin/dspace oai import command periodically. You can add the following task to your crontab:
0 3 * * * [dspace]/bin/dspace oai import
Note that [dspace] should be replaced by the correct value, that is, the value of the dspace.dir parameter defined in dspace.cfg.
OAI 2.0 can also use the database for querying. To configure this, edit the [dspace]/config/modules/xoai.cfg file and set the 'storage' parameter to database.
OAI manager is a utility that allows one to perform administrative operations on OAI.
Syntax
bin/dspace oai <action> [parameters]
Actions
Parameters
To refresh the OAI cache and compile DSpace items (for fast responses), run the [dspace]/bin/dspace oai compile-items command periodically. You can add the following task to your crontab:
0 3 * * * [dspace]/bin/dspace oai compile-items
Note that [dspace] should be replaced by the correct value, that is, the value of the dspace.dir parameter defined in dspace.cfg.
The OAI-PMH response is an XML file. While OAI-PMH is primarily used by harvesting tools and usually not directly by humans, it can sometimes be useful to look at the OAI-PMH responses directly, typically when setting up the interface for the first time or when verifying changes you have made. For these cases, XOAI provides an XSLT stylesheet that transforms the response XML into human-readable, interactive HTML. The stylesheet is linked from the XML response and the transformation takes place in the user's browser. Most automated tools are interested only in the XML file itself and will not perform the transformation. To change which stylesheet is used, place your stylesheet in the [dspace]/webapps/xoai/static directory (or in [dspace-src]/dspace-xoai/dspace-xoai-webapp/src/main/webapp/static, after which you have to rebuild DSpace), modify the "stylesheet" attribute of the "Configuration" element in [dspace]/config/modules/xoai/xoai.xml, and restart your servlet container.
By default OAI 2.0 provides 12 metadata formats within the /request context:
At the /driver context it provides:
And at /openaire context it provides:
Configuration File:

| Property | Example Value | Informational Note |
|---|---|---|
| storage | | Selects the OAI data source: solr or database. |
| | | Solr server location. |
| | | OAI persistent identifier prefix. Format: oai:PREFIX:HANDLE |
| | | Configuration directory used by XOAI (core library). Contains xoai.xml, metadata format XSLTs and transformer XSLTs. |
| | | Directory to store runtime generated files (for caching purposes). |
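The identifier format oai:PREFIX:HANDLE mentioned above can be illustrated with two small helpers. This is a sketch under assumptions: the function names, the example prefix and the example handle are made up, and the parser assumes that the prefix itself contains no colon.

```python
def make_oai_identifier(prefix, handle):
    """Compose an identifier of the form oai:PREFIX:HANDLE."""
    return "oai:" + prefix + ":" + handle


def parse_oai_identifier(identifier):
    """Split oai:PREFIX:HANDLE into (prefix, handle).

    Assumes the prefix contains no colon; the handle may contain
    any characters, including slashes.
    """
    parts = identifier.split(":", 2)
    if len(parts) != 3 or parts[0] != "oai" or not parts[1] or not parts[2]:
        raise ValueError("not an oai:PREFIX:HANDLE identifier: " + identifier)
    return parts[1], parts[2]


# Example values are placeholders, not DSpace defaults.
identifier = make_oai_identifier("www.example.com", "123456789/1")
print(identifier)
print(parse_oai_identifier(identifier))
```

Such an identifier is what a harvester passes in the identifier argument of a GetRecord request.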
OAI 2.0 provides an advanced configuration allowing you to configure:
It is an XML file commonly located at: ${dspace.dir}/config/modules/oai/xoai.xml
Each context can have its own metadata formats. To add or remove a metadata format, add or remove its reference within xoai.xml. For example, suppose one needs to remove the XOAI schema from:
    <Context baseurl="request">
        <Format refid="oaidc" />
        <Format refid="mets" />
        <Format refid="xoai" />
        <Format refid="didl" />
        <Format refid="dim" />
        <Format refid="ore" />
        <Format refid="rdf" />
        <Format refid="etdms" />
        <Format refid="mods" />
        <Format refid="qdc" />
        <Format refid="marc" />
        <Format refid="uketd_dc" />
    </Context>
Then one would have:
    <Context baseurl="request">
        <Format refid="oaidc" />
        <Format refid="mets" />
        <Format refid="didl" />
        <Format refid="dim" />
        <Format refid="ore" />
        <Format refid="rdf" />
        <Format refid="etdms" />
        <Format refid="mods" />
        <Format refid="qdc" />
        <Format refid="marc" />
        <Format refid="uketd_dc" />
    </Context>
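The same edit can be scripted rather than done by hand. The sketch below uses Python's standard xml.etree.ElementTree on an abbreviated copy of the Context element; the `remove_format` helper is hypothetical, and the fragment is assumed to carry no XML namespace. A real xoai.xml may declare one, in which case the tag lookups would need adjusting.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the Context element shown above (assumed to be
# namespace-free for the purposes of this example).
CONTEXT_XML = """<Context baseurl="request">
    <Format refid="oaidc" />
    <Format refid="mets" />
    <Format refid="xoai" />
    <Format refid="didl" />
</Context>"""


def remove_format(context, refid):
    """Remove every <Format refid="..."> child matching refid.

    Returns the number of removed elements (hypothetical helper).
    """
    removed = 0
    # findall() returns a list, so removing while looping is safe.
    for fmt in context.findall("Format"):
        if fmt.get("refid") == refid:
            context.remove(fmt)
            removed += 1
    return removed


context = ET.fromstring(CONTEXT_XML)
remove_format(context, "xoai")
print([fmt.get("refid") for fmt in context.findall("Format")])
```

Only the reference inside the context is removed; the format definition itself stays available to other contexts.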
It is also possible to create a new metadata format by writing a specific XSLT for it. All XSLTs already defined for DSpace can be found in the directory ${dspace.dir}/config/modules/oai/metadataFormats. After producing a new one, add the following information inside the <Formats> element in ${dspace.dir}/config/modules/oai/xoai.xml:
    <Format id="[IDENTIFIER]">
        <Prefix>[PREFIX]</Prefix>
        <XSLT>metadataFormats/[XSLT]</XSLT>
        <Namespace>[NAMESPACE]</Namespace>
        <SchemaLocation>[SCHEMA_LOCATION]</SchemaLocation>
    </Format>
where:
| Parameter | Description |
|---|---|
| IDENTIFIER | The identifier used within context configurations to reference this specific format. It must be unique among all available metadata formats. |
| PREFIX | The prefix used in the OAI interface (metadataPrefix=PREFIX). |
| XSLT | The name of the XSLT file within the ${dspace.dir}/config/modules/oai/metadataFormats directory. |
| NAMESPACE | XML default namespace of the created schema. |
| SCHEMA_LOCATION | URI location of the XSD of the created schema. |
NOTE: Changes in dspace/config/modules/oai/xoai.xml require reloading or restarting the servlet container.
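To see how the five parameters map onto the Format element, the following sketch generates one with Python's standard xml.etree.ElementTree. The `make_format` helper and all example values (identifier, prefix, file name, namespace and schema URIs) are placeholders invented for illustration, not formats shipped with DSpace.

```python
import xml.etree.ElementTree as ET


def make_format(identifier, prefix, xslt, namespace, schema_location):
    """Build a <Format> element following the template above.

    Hypothetical helper; xslt is a file name inside metadataFormats/.
    """
    fmt = ET.Element("Format", {"id": identifier})
    ET.SubElement(fmt, "Prefix").text = prefix
    ET.SubElement(fmt, "XSLT").text = "metadataFormats/" + xslt
    ET.SubElement(fmt, "Namespace").text = namespace
    ET.SubElement(fmt, "SchemaLocation").text = schema_location
    return fmt


# Placeholder values for an imaginary format.
fmt = make_format(
    identifier="myformat",
    prefix="my_format",
    xslt="myformat.xsl",
    namespace="http://www.example.com/ns/myformat",
    schema_location="http://www.example.com/ns/myformat.xsd",
)
print(ET.tostring(fmt, encoding="unicode"))
```

The printed element would then be pasted inside <Formats>, and a matching <Format refid="myformat" /> added to each context that should offer it.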