Define Indexing Namespace and Mixin in CND of fcrepo4
External indexing requires each object you want indexed to carry the indexing:indexable mixin. You can define the mixin through the REST interface or in fedora-node-types.cnd.
Definition using the REST interface
Go to http://localhost:8080/rest/fcr:namespaces and in the Register Namespace form add:
Prefix: indexing
Namespace: http://fedora.info/definitions/v4/indexing#
Go to http://localhost:8080/rest/fcr:nodetypes and in the Update CND form add:
[indexing:indexable] mixin
  - indexing:hasIndexingTransformation (STRING) multiple COPY nofulltext noqueryorder
Definition via fedora-node-types.cnd
Make sure your node definitions contain the following:
<indexing = 'http://fedora.info/definitions/v4/indexing#'>
[indexing:indexable] mixin
  - indexing:hasIndexingTransformation (STRING) multiple COPY nofulltext noqueryorder
The standard configuration chain is as follows:
- fcrepo4/fcrepo-webapp/src/main/webapp/WEB-INF/web.xml contains a context-param element with param-name "contextConfigLocation". The param-value points to your Spring configuration files, usually a path like WEB-INF/classes/*.xml
- One of your Spring configuration files is repo.xml, which contains a property repositoryConfiguration defining the location of your repository.json
- fedora-node-types.cnd is referenced in your repository.json
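The last link in that chain can be checked with a short script. The following is a minimal sketch, assuming repository.json is plain JSON and lists its CND files under a "node-types" key; both the key name and the sample content are assumptions for illustration, so check them against your own repository.json.

```python
import json

def listed_cnd_files(repository_json_text):
    """Return the .cnd files listed under the (assumed) "node-types" key."""
    config = json.loads(repository_json_text)
    return [path for path in config.get("node-types", []) if path.endswith(".cnd")]

# Hypothetical, heavily trimmed repository.json content used only for illustration.
SAMPLE = '{"name": "repo", "node-types": ["fedora-node-types.cnd"]}'

if __name__ == "__main__":
    print(listed_cnd_files(SAMPLE))
```

If the returned list does not include fedora-node-types.cnd, the indexing mixin defined in that file would not be loaded through this chain.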
Configure fcrepo4 messaging
fcrepo-webapp/src/main/resources/spring/jms.xml contains the bean definitions used by the repository for messaging. Currently the DefaultMessageFactory is used to implement messaging:
<bean class="org.fcrepo.jms.headers.DefaultMessageFactory"/>
Install standalone search applications
fcrepo-jms-indexer-pluggable currently supports the following triplestores:
- Jena Fuseki (Fuseki setup instructions)
- Sesame (Sesame setup instructions)
fcrepo-jms-indexer-pluggable currently supports the following indexer:
Install and configure fcrepo-jms-indexer-pluggable
The following GitHub page has detailed instructions for setting up fcrepo-jms-indexer-pluggable. This standalone application listens for messages produced by fcrepo4 and invokes the search applications as configured:
https://github.com/futures/fcrepo-jms-indexer-pluggable
Load an LDPATH program
The following is an example of loading an LDPATH program called "custom".
curl -X POST -H "Content-Type: application/rdf+ldpath" -d "@post.txt" "http://localhost:8080/rest/fedora:system/fedora:transform/fedora:ldpath/custom/fcr:content"
post.txt:
@prefix fcrepo : <http://fedora.info/definitions/v4/repository#>
id = . :: xsd:string ;
title_tsi = dc:title :: xsd:string ;
uuid_ssi = fcrepo:uuid :: xsd:string ;
Note that for Solr indexing, the field names (such as id, title_tsi, and uuid_ssi) must match fields defined in the Solr schema.xml (see the Solr documentation: https://cwiki.apache.org/confluence/display/solr/Solr+Field+Types). One recommended schema.xml is provided by hydra-jetty (https://github.com/projecthydra/hydra-jetty/blob/master/solr/development-core/conf/schema.xml), which has a robust set of default dynamic fields.
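The field-name requirement above can be checked before loading a program. The following is a rough sketch, not part of fcrepo4 or Solr: it pulls the field names out of an LDPATH program with a simple regular expression and reports any that match neither a static field nor a dynamic-field suffix. The static field set and the suffixes passed in are assumptions for illustration; read the real ones from your own schema.xml.

```python
import re

# Matches the start of LDPATH field definitions such as "title_tsi = dc:title :: xsd:string ;".
# Lines starting with "@prefix" do not match because of the leading "@".
FIELD_RE = re.compile(r"^\s*([A-Za-z_]\w*)\s*=", re.MULTILINE)

def ldpath_field_names(program):
    """Return the field names defined in an LDPATH program, in order."""
    return FIELD_RE.findall(program)

def unmatched_fields(names, static_fields, dynamic_suffixes):
    """Return names covered by neither a static field nor a dynamic-field suffix."""
    return [
        name for name in names
        if name not in static_fields
        and not any(name.endswith(suffix) for suffix in dynamic_suffixes)
    ]

PROGRAM = """\
@prefix fcrepo : <http://fedora.info/definitions/v4/repository#>
id = . :: xsd:string ;
title_tsi = dc:title :: xsd:string ;
uuid_ssi = fcrepo:uuid :: xsd:string ;
"""

if __name__ == "__main__":
    names = ldpath_field_names(PROGRAM)
    # Assumed schema contents: a static "id" field and "*_tsi" / "*_ssi" dynamic fields.
    print(unmatched_fields(names, {"id"}, ("_tsi", "_ssi")))
```

An empty result means every field in the program has a candidate match under the assumed schema; a bare name such as title would be reported unless the schema defines it explicitly.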
Create objects with indexing properties
For an object to be indexed, it must have an rdf:type of indexing:indexable and, optionally, an indexing:hasIndexingTransformation property naming an LDPATH program.
curl -X POST -H "Content-Type: application/n3" "http://localhost:8080/rest/anIndexableObject" -d "@body.rdf"
body.rdf:
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix indexing: <http://fedora.info/definitions/v4/indexing#> .
<> rdf:type <http://fedora.info/definitions/v4/indexing#indexable> .
<> indexing:hasIndexingTransformation "default" .
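The same request can be prepared from a script. The following is a minimal sketch using only the Python standard library; the helper names indexable_body and build_request are hypothetical, and the URL and content type are simply those of the curl example. It builds the request without sending it, so it can be inspected offline.

```python
import urllib.request

INDEXING_NS = "http://fedora.info/definitions/v4/indexing#"

def indexable_body(transformation="default"):
    """Build an N3 body marking the new object as indexable."""
    return (
        "@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .\n"
        f"@prefix indexing: <{INDEXING_NS}> .\n"
        f"<> rdf:type <{INDEXING_NS}indexable> .\n"
        f'<> indexing:hasIndexingTransformation "{transformation}" .\n'
    )

def build_request(url, transformation="default"):
    """Prepare (but do not send) the POST that creates an indexable object."""
    return urllib.request.Request(
        url,
        data=indexable_body(transformation).encode("utf-8"),
        headers={"Content-Type": "application/n3"},
        method="POST",
    )

if __name__ == "__main__":
    req = build_request("http://localhost:8080/rest/anIndexableObject")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # Passing req to urllib.request.urlopen would send it to a running repository.
```

Passing a different transformation name, for example "custom", points the object at the LDPATH program loaded under that name.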