Comment: The OAI script reads the configuration file. If Solr is used as the data source, it shows one list of options. On the other hand, if the database source is defined, it shows a different list of options.

Introduction

The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) is a low-barrier mechanism for repository interoperability. Data Providers are repositories that expose structured metadata via OAI-PMH. Service Providers then make OAI-PMH service requests to harvest that metadata. OAI-PMH is a set of six verbs or services that are invoked within HTTP.
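As a rough illustration of how a Service Provider invokes one of the six verbs over HTTP, the following sketch builds OAI-PMH request URLs. The verb names and the metadataPrefix argument come from the OAI-PMH 2.0 specification; the base URL and the function name build_request_url are hypothetical placeholders and do not come from this documentation.

```python
from urllib.parse import urlencode

# The six verbs defined by OAI-PMH 2.0.
OAI_VERBS = (
    "Identify",
    "ListMetadataFormats",
    "ListSets",
    "ListIdentifiers",
    "ListRecords",
    "GetRecord",
)


def build_request_url(base_url, verb, **arguments):
    """Return the GET URL for an OAI-PMH request.

    base_url is assumed to be the Data Provider's OAI endpoint.
    """
    if verb not in OAI_VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    # The verb is passed as an ordinary query parameter, followed by its arguments.
    query = urlencode({"verb": verb, **arguments})
    return f"{base_url}?{query}"


if __name__ == "__main__":
    # Hypothetical endpoint; replace it with your repository's OAI base URL.
    base = "http://example.org/oai"
    print(build_request_url(base, "Identify"))
    print(build_request_url(base, "ListRecords", metadataPrefix="oai_dc"))
```

A harvester would fetch the resulting URL and parse the XML response; that part is omitted here.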

What is OAI 2.0?

OAI 2.0 is a Java implementation of an adaptable OAI-PMH data provider interface, developed by Lyncode, that uses XOAI, an OAI-PMH Java Toolkit.

Why OAI 2.0?

Projects like OpenAIRE, Driver and EUBrazilOpenBio have specific metadata requirements for the content published through the OAI-PMH interface. As the OAI-PMH protocol does not establish any framework for these specifics, OAI 2.0 can, in a simple way, provide more than one instance of an OAI interface (a feature provided by the XOAI core library), so one could define an interface for each project. That is the main purpose, although OAI 2.0 allows much more than that.

Concepts (XOAI Core Library)

To understand how XOAI works, one must understand the concept of Filter, Transformer and Context. With a Filter it is possible to select information from the data source. A Transformer allows one to make some changes in the metadata before showing it in the OAI interface. XOAI also adds a new concept to the OAI-PMH basic specification, the concept of context. A context is identified in the URL:

...

With these ingredients it is possible to build a robust solution that fulfills all the requirements of Driver, OpenAIRE and other project-specific requirements. As shown in Figure 1, with contexts one can select a subset of all available items in the data source. So when entering the OpenAIRE context, all OAI-PMH requests are restricted to that subset of items.

At this stage, contexts could be seen as sets (also defined in the basic OAI-PMH protocol). The magic of XOAI happens when one needs a specific metadata format to be shown in each context. The metadata requirements of Driver differ slightly from the OpenAIRE ones, so for each context one must define its specific transformer. Contexts can therefore be seen as an extension of the concept of sets.
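The relationship between Filter, Transformer and Context can be sketched as a small conceptual model. This is not the XOAI API: the Context class, the has_project filter, the tag_source transformer and the item fields are all hypothetical names chosen for illustration, and the filter and transformer rules are made up.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# An item is modelled as a flat metadata dictionary (an assumption for this sketch).
Item = Dict[str, str]


@dataclass
class Context:
    """A named OAI interface: the filter selects items, the transformer adapts metadata."""

    name: str
    item_filter: Callable[[Item], bool]
    transformer: Callable[[Item], Item]

    def list_records(self, items: List[Item]) -> List[Item]:
        # Keep only the items the filter accepts, then transform each of them.
        return [self.transformer(item) for item in items if self.item_filter(item)]


def has_project(item: Item) -> bool:
    # Hypothetical filter: select items that carry a "project" field.
    return "project" in item


def tag_source(item: Item) -> Item:
    # Hypothetical transformer: return a copy with one extra metadata field.
    return {**item, "source": "example-context"}


if __name__ == "__main__":
    items = [
        {"id": "1", "title": "First item", "project": "P-1"},
        {"id": "2", "title": "Second item"},
    ]
    example = Context("example", has_project, tag_source)
    for record in example.list_records(items):
        print(record)
```

Each context pairs its own filter with its own transformer, which is what lets two contexts expose the same item with different metadata.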

XOAI in DSpace

To implement an OAI interface on top of the XOAI core library, one only needs to implement the data source interface.

OAI 2.0

OAI 2.0 in DSpace is a separate webapp that completely replaces the old "oai" webapp. OAI 2.0 has a configurable data source. By default it does not query the DSpace SQL database at the time of the OAI-PMH request; instead, it keeps the required metadata in its Solr index (currently in a separate "oai" Solr core) and serves it from there. It is also possible to configure OAI 2.0 to use only the database for querying. Furthermore, it caches requests, so repeating the same query is very fast, and it compiles DSpace items, which makes uncached responses much faster.

Using Solr

OAI 2.0 uses the Solr data source by default.

The Solr index can be updated at your convenience, depending on how fresh you need the information to be. Typically, the administrator sets up a nightly cron job to update the Solr index from the SQL database.

...

OAI Manager (Solr Data Source)

OAI manager is a utility for performing administrative operations on OAI.

Syntax

bin/dspace oai <action> [parameters]

Actions

  • import  Imports DSpace items into the OAI Solr index (also cleans the OAI cache)
  • clean-cache  Cleans the OAI cache

Parameters

  • -o Optimize index after indexing
  • -c Clears the Solr index before indexing (it will import all items again)
  • -v Verbose output
  • -h Shows help text

Scheduled Tasks

To refresh the OAI Solr index, the [dspace]/bin/dspace oai import command must be run periodically. You can add the following task to your crontab:

Code Block
0 3 * * * [dspace]/bin/dspace oai import

Note that [dspace] should be replaced by the correct value, that is, the value of the dspace.dir parameter in dspace.cfg.

Using Database

OAI 2.0 can also use the database for querying. To configure that, edit the [dspace]/config/modules/xoai.cfg file and set the 'storage' parameter to database.

OAI Manager (Database Data Source)

OAI manager is a utility for performing administrative operations on OAI.

Syntax

bin/dspace oai <action> [parameters]

Actions

  • clean-cache  Cleans the OAI cache
  • compile-items  Compiles DSpace items
  • erase-compiled-items  Erases all DSpace compiled items

Parameters

  • -v Verbose output
  • -h Shows help text
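The two OAI manager sections list different actions depending on the configured data source. The sketch below only summarizes that mapping as data. It is not the code of the DSpace script: the function name available_actions is hypothetical, and using "solr" as the storage value is an assumption (this page only names the value database explicitly).

```python
# Actions listed on this page for each data source.
# The "solr" key is an assumed storage value; "database" is the one named in the text.
ACTIONS_BY_STORAGE = {
    "solr": ("import", "clean-cache"),
    "database": ("clean-cache", "compile-items", "erase-compiled-items"),
}


def available_actions(storage):
    """Return the OAI manager actions documented for the given storage setting."""
    key = storage.strip().lower()
    if key not in ACTIONS_BY_STORAGE:
        raise ValueError(f"unknown storage setting: {storage!r}")
    return ACTIONS_BY_STORAGE[key]


if __name__ == "__main__":
    for storage in ("solr", "database"):
        print(storage, "->", ", ".join(available_actions(storage)))
```

Note that clean-cache is the only action available with both data sources.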

Scheduled Tasks

To refresh the OAI cache and compile DSpace items (for fast responses), the [dspace]/bin/dspace oai compile-items command must be run periodically. You can add the following task to your crontab:

Code Block

0 3 * * * [dspace]/bin/dspace oai compile-items

Note that [dspace] should be replaced by the correct value, that is, the value of the dspace.dir parameter in dspace.cfg.

Client-side stylesheet

The OAI-PMH response is an XML file. While OAI-PMH is primarily used by harvesting tools and usually not directly by humans, it can sometimes be useful to look at the OAI-PMH responses directly, usually when setting the interface up for the first time or to verify any changes you make. For these cases, XOAI provides an XSLT stylesheet that transforms the response XML into nice-looking, human-readable and interactive HTML. The stylesheet is linked from the XML response and the transformation takes place in the user's browser. Most automated tools are interested only in the XML file itself and will not perform the transformation. If you want, you can change which stylesheet is used by placing it into the [dspace]/webapps/xoai/static directory (or into [dspace-src]/dspace-xoai/dspace-xoai-webapp/src/main/webapp/static, after which you have to rebuild DSpace), modifying the "stylesheet" attribute of the "Configuration" element in [dspace]/config/modules/xoai/xoai.xml and restarting your servlet container.