To support differing needs for sophisticated, rich searching, Fedora 4 provides a standard mechanism and integration point for indexing content in an external service such as Apache Solr.

Define Indexing Namespace and Mixin in

...

CND of fcrepo4

External indexing requires every object you want indexed to carry the indexing:indexable mixin. The namespace and mixin can be defined through the REST interface or in fedora-node-types.cnd.

Definition using the REST interface

...

  1. fcrepo4/fcrepo-webapp/src/main/webapp/WEB-INF/web.xml contains a context-param element with the param-name "contextConfigLocation". Its param-value points to your Spring configuration files, usually a path like WEB-INF/classes/*.xml
  2. One of your Spring configuration files is repo.xml, which contains a property repositoryConfiguration defining the location of your repository.json
  3. fedora-node-types.cnd is referenced in your repository.json
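
The lookup chain in the steps above can be followed from the command line. The following is a minimal sketch, assuming a checked-out fcrepo4 source tree and a grep that supports --include (GNU or BSD grep); the function name trace_cnd_config and its directory argument are hypothetical. It only searches for the names mentioned in the steps and does not parse the XML or JSON.

```shell
#!/bin/sh
# Hypothetical helper: list the files under a source tree that mention each
# link in the configuration chain (web.xml -> Spring XML -> repository.json).
# Usage: trace_cnd_config /path/to/fcrepo4
trace_cnd_config() {
    root="${1:-.}"
    # Step 1: web.xml files that declare contextConfigLocation
    grep -rl --include='web.xml' 'contextConfigLocation' "$root" || true
    # Step 2: Spring XML files that set the repositoryConfiguration property
    grep -rl --include='*.xml' 'repositoryConfiguration' "$root" || true
    # Step 3: JSON files that reference a .cnd node-type definition
    grep -rl --include='*.json' '\.cnd' "$root" || true
}
```

Each matching file is printed on its own line, in the order of the steps, so the last lines of output should point at the repository.json that names the CND file.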

Configure fcrepo4 messaging

 fcrepo-webapp/src/main/resources/spring/jms.xml contains the bean definitions used by the repository for messaging.  Currently the DefaultMessageFactory is used to implement messaging:

jms.xml:
<bean class="org.fcrepo.jms.headers.DefaultMessageFactory"/>

Install standalone search applications

fcrepo-jms-indexer-pluggable currently supports the following triplestores:

fcrepo-jms-indexer-pluggable currently supports the following indexers:

Install and configure fcrepo-jms-indexer-pluggable

The following GitHub page has detailed instructions for setting up fcrepo-jms-indexer-pluggable. This standalone application listens for messages produced by fcrepo4 and invokes the search applications as configured:

https://github.com/futures/fcrepo-jms-indexer-pluggable

 

Load an LDPATH program

The following is an example of loading an LDPATH program called "custom".

curl -X POST -H "Content-Type: application/rdf+ldpath" -d "@post.txt" "http://localhost:8080/rest/fedora:system/fedora:transform/fedora:ldpath/custom/fcr:content"

post.txt:
@prefix fcrepo : <http://fedora.info/definitions/v4/repository#>
id      = . :: xsd:string ;
title_tsi = dc:title :: xsd:string;
uuid_ssi = fcrepo:uuid :: xsd:string ;

Note that for Solr indexing, the field names (such as id, title_tsi, and uuid_ssi) must match the fields defined in the Solr schema.xml (see the Solr documentation: https://cwiki.apache.org/confluence/display/solr/Solr+Field+Types). One recommended schema.xml is provided by hydra-jetty (https://github.com/projecthydra/hydra-jetty/blob/master/solr/development-core/conf/schema.xml), which has a robust set of default dynamic fields.
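
To compare an LDPATH program against the Solr schema, it can help to list the field names the program defines. The following is a minimal sketch, assuming each field definition sits on its own line in the form "name = path :: type ;" as in post.txt above; the function name ldpath_fields is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the field names defined in an LDPath program,
# one per line, so they can be checked against schema.xml by hand.
# Usage: ldpath_fields post.txt
ldpath_fields() {
    # Keep lines that start with an identifier followed by "=", and print
    # only the identifier; @prefix lines do not match and are skipped.
    sed -n 's/^[[:space:]]*\([A-Za-z_][A-Za-z0-9_]*\)[[:space:]]*=.*$/\1/p' "$1"
}
```

Each printed name then needs either an explicit field or a matching dynamicField pattern in schema.xml; suffixed names such as title_tsi are presumably meant to match dynamic field patterns rather than explicitly declared fields.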

Create objects with indexing properties

For an object to be indexed, it must have an rdf:type of indexing:indexable, and optionally an indexing:hasIndexingTransformation property naming an LDPATH program.

create object:
curl -X POST -H "Content-Type: application/n3" "http://localhost:8080/rest/anIndexableObject" -d "@body.rdf"

body.rdf:
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix indexing: <http://fedora.info/definitions/v4/indexing#> .
<> rdf:type <http://fedora.info/definitions/v4/indexing#indexable> .
<> indexing:hasIndexingTransformation "default".
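
When several objects use different LDPATH programs, the request body can be generated instead of edited by hand. The following is a minimal sketch; the function name indexable_body is hypothetical, and the transformation name it receives is assumed to be the name of a program that has already been loaded.

```shell
#!/bin/sh
# Hypothetical helper: print an N3 body that marks a resource as indexable
# and names the LDPath program to use as its indexing transformation.
# Usage: indexable_body custom > body.rdf
indexable_body() {
    transform="${1:-default}"   # falls back to "default" when no name is given
    cat <<EOF
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix indexing: <http://fedora.info/definitions/v4/indexing#> .
<> rdf:type indexing:indexable ;
   indexing:hasIndexingTransformation "$transform" .
EOF
}
```

The output can be redirected to body.rdf and posted with the curl command shown above.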

...