
To support the differing needs for sophisticated, rich searching, Fedora 4 comes with a standard mechanism and integration point for indexing content in an external service. This could be a general search service such as Apache Solr.

Define Indexing Namespace and Mixin in CND of fcrepo4

External indexing requires that each object you wish to have indexed carries the indexing:indexable mixin. The mixin can be defined using the REST interface or via fedora-node-types.cnd.

Definition using the REST interface

Go to http://localhost:8080/rest/fcr:namespaces and in the Register Namespace form add:

     Prefix: indexing

     Namespace: http://fedora.info/definitions/v4/indexing#

Go to http://localhost:8080/rest/fcr:nodetypes and in the Update CND form add:  

[indexing:indexable] mixin
- indexing:hasIndexingTransformation (STRING) multiple COPY nofulltext noqueryorder

Definition via fedora-node-types.cnd

Make sure your node definitions contain the following:

Code Block
fcrepo-kernel/src/main/resources/fedora-node-types.cnd
<indexing = 'http://fedora.info/definitions/v4/indexing#'>

[indexing:indexable] mixin
- indexing:hasIndexingTransformation (STRING) multiple COPY nofulltext noqueryorder

The standard configuration chain is as follows:

...

or a standalone triplestore such as Sesame or Fuseki.


To set up external indexing and searching you must:


Install and configure

Configure fcrepo4 messaging

 fcrepo-webapp/src/main/resources/spring/jms.xml contains the bean definitions used by the repository for messaging.  Currently the DefaultMessageFactory is used to implement messaging:

Code Block
jms.xml
<bean class="org.fcrepo.jms.headers.DefaultMessageFactory"/>

...

standalone search applications

fcrepo-message-consumer currently supports the following triplestores:

 

Tip

See the External Triplestore page for more details on the triplestore setup.

 

fcrepo-message-consumer currently supports the following indexer:

Tip

See the Solr Indexing Quick Guide to get quickly up and running with a Fedora 4 Solr integration.

 

Install and configure fcrepo-message-consumer

The fcrepo-message-consumer project includes software for a web service that sits between your Fedora 4 repository and an external search service. As its name implies, it is a generic framework that allows for easy extension for integrating unanticipated or proprietary services with the Fedora 4 repository. There are proof-of-concept implementations for Jena Fuseki, Sesame and Apache Solr.

The following GitHub page has detailed instructions on how to set up fcrepo-message-consumer. This standalone app listens to messages produced by fcrepo4 and invokes the search applications as configured:

https://github.com/fcrepo4/fcrepo-message-consumer

Load an LDPATH program

The following is an example of loading an LDPATH program called "custom".

...

consumer

Mark a resource as indexable and assign an appropriate indexing transformation

For a resource to be indexed it must:

  1. have the rdf type http://fedora.info/definitions/v4/indexing#indexable
  2. (optionally) have the property http://fedora.info/definitions/v4/indexing#hasIndexingTransformation set to a registered index transformation
Tip
Indexing Transformations

A default indexing transformation exists that maps the appropriate properties to the field names "title", "uuid" and "id".  To meet your needs, you can write and register custom indexing transformations.

Create new

Note that for Solr indexing the field names (such as id, title, and uuid) must match the fields that are defined in the Solr schema.xml (see the Solr documentation: https://cwiki.apache.org/confluence/display/solr/Solr+Field+Types). One recommended schema.xml is provided by hydra-jetty (https://github.com/projecthydra/hydra-jetty/blob/master/solr/development-core/conf/schema.xml), which has a robust set of default dynamic fields.
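The field-name requirement above can be checked before indexing. The following is a minimal Python sketch, not part of Fedora or Solr: the `SAMPLE_SCHEMA` fragment and the `missing_fields` helper are hypothetical and only illustrate the idea of comparing the names emitted by an indexing transformation against the `field` and `dynamicField` declarations in a schema.xml. It assumes dynamic field patterns use a simple `*` wildcard.

```python
import fnmatch
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down schema used only for illustration;
# a real schema.xml (for example the hydra-jetty one) is much larger.
SAMPLE_SCHEMA = """
<schema name="example" version="1.5">
  <fields>
    <field name="id" type="string" indexed="true" stored="true" required="true"/>
    <field name="title" type="text_general" indexed="true" stored="true"/>
    <dynamicField name="*_s" type="string" indexed="true" stored="true"/>
  </fields>
</schema>
"""


def missing_fields(schema_xml, field_names):
    """Return the names that match neither a declared field nor a dynamic field pattern."""
    root = ET.fromstring(schema_xml)
    declared = {f.get("name") for f in root.iter("field")}
    patterns = [f.get("name") for f in root.iter("dynamicField")]
    missing = []
    for name in field_names:
        if name in declared:
            continue
        # Dynamic fields are matched by wildcard, e.g. "*_s" matches "author_s".
        if any(fnmatch.fnmatchcase(name, pattern) for pattern in patterns):
            continue
        missing.append(name)
    return missing


if __name__ == "__main__":
    # Field names produced by the default indexing transformation.
    print(missing_fields(SAMPLE_SCHEMA, ["title", "uuid", "id"]))
```

With this sample schema, "uuid" is reported as missing because it is neither declared explicitly nor covered by the `*_s` pattern; a document using that field would need the schema extended first.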

...

objects with indexing properties

For an object to be indexed it must have an rdf:type of indexing:indexable, and optionally an indexing:hasIndexingTransformation corresponding to an LDPATH program.

Code Block
create object
curl -X PATCH -H "Content-Type: application/sparql-update" --data-binary "@object.rdf" "http://localhost:8080/rest/anIndexableObject"

object.rdf:

PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX indexing: <http://fedora.info/definitions/v4/indexing#>

DELETE { }
INSERT { 
  <> indexing:hasIndexingTransformation "default"; 
  rdf:type indexing:indexable; 
  dc:title "This title will show up in the index." }
WHERE { }
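
The same update can also be scripted. The following is a minimal Python sketch using only the standard library; it assumes the repository URL from the curl example, and the helper names `build_sparql_update` and `build_patch_request` are hypothetical, not part of any Fedora client. It only builds the request; sending it requires a running repository.

```python
import urllib.request

INDEXING_NS = "http://fedora.info/definitions/v4/indexing#"


def _escape(literal):
    """Escape backslashes and double quotes for use inside a SPARQL string literal."""
    return literal.replace("\\", "\\\\").replace('"', '\\"')


def build_sparql_update(title, transformation="default"):
    """Return a SPARQL Update body that marks the target resource as indexable."""
    return (
        "PREFIX dc: <http://purl.org/dc/elements/1.1/>\n"
        "PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>\n"
        f"PREFIX indexing: <{INDEXING_NS}>\n"
        "DELETE { }\n"
        "INSERT {\n"
        f'  <> indexing:hasIndexingTransformation "{_escape(transformation)}" ;\n'
        "     rdf:type indexing:indexable ;\n"
        f'     dc:title "{_escape(title)}" .\n'
        "}\n"
        "WHERE { }\n"
    )


def build_patch_request(resource_url, title, transformation="default"):
    """Build (but do not send) the PATCH request carrying the SPARQL Update."""
    body = build_sparql_update(title, transformation).encode("utf-8")
    return urllib.request.Request(
        resource_url,
        data=body,
        method="PATCH",
        headers={"Content-Type": "application/sparql-update"},
    )


if __name__ == "__main__":
    req = build_patch_request(
        "http://localhost:8080/rest/anIndexableObject",  # assumed local repository
        "This title will show up in the index.",
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To actually send it (needs a running Fedora 4 instance):
    # urllib.request.urlopen(req)
```

Keeping request construction separate from sending makes it easy to inspect the generated SPARQL before touching the repository.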