Excerpt |
---|
To support the differing needs for sophisticated, rich searching, Fedora 4 comes with a standard mechanism and integration point for indexing content in an external service. This could be a general search service such as Apache Solr. |
Define Indexing Namespace and Mixin in CND of fcrepo4
External indexing requires every object you want indexed to carry the indexing:indexable mixin. You can define this mixin through the REST interface or in the fedora-node-types.cnd file.
Definition using the REST interface
Go to http://localhost:8080/rest/fcr:namespaces and in the Register Namespace form add:
Prefix: indexing
Namespace: http://fedora.info/definitions/v4/indexing#
Go to http://localhost:8080/rest/fcr:nodetypes and in the Update CND form add:
[indexing:indexable] mixin
- indexing:hasIndexingTransformation (STRING) multiple COPY nofulltext noqueryorder
Definition via fedora-node-types.cnd
Make sure your node definitions contain the following:
Code Block | ||
---|---|---|
| ||
<indexing = 'http://fedora.info/definitions/v4/indexing#'>
[indexing:indexable] mixin
- indexing:hasIndexingTransformation (STRING) multiple COPY nofulltext noqueryorder |
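As a quick sanity check before starting the repository, the declarations above can be verified with a small shell function. This is only a sketch: the function name `check_indexing_cnd` is hypothetical (it is not part of Fedora), and the location of your fedora-node-types.cnd file is an assumption you need to supply.

```shell
#!/bin/sh
# Hypothetical helper (not part of Fedora): checks that a CND file declares
# the indexing namespace, the indexing:indexable mixin, and its property.
# $1 = path to the CND file to inspect.
check_indexing_cnd() {
    cnd_file="$1"
    # -F matches fixed strings, so '[', '#' and '(' are taken literally.
    grep -qF -e "<indexing = 'http://fedora.info/definitions/v4/indexing#'>" "$cnd_file" || {
        echo "missing indexing namespace in $cnd_file" >&2
        return 1
    }
    grep -qF -e "[indexing:indexable] mixin" "$cnd_file" || {
        echo "missing indexing:indexable mixin in $cnd_file" >&2
        return 1
    }
    grep -qF -e "- indexing:hasIndexingTransformation (STRING)" "$cnd_file" || {
        echo "missing indexing:hasIndexingTransformation property in $cnd_file" >&2
        return 1
    }
    echo "indexing definitions found in $cnd_file"
}
```

Call it as `check_indexing_cnd path/to/fedora-node-types.cnd`; a non-zero exit status means one of the three declarations is absent.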
The standard configuration chain is as follows:
...
or a standalone triplestore such as Sesame or Fuseki. |
To set up external indexing and searching, you must:
Table of Contents | ||
---|---|---|
|
Install and configure
Configure fcrepo4 messaging
fcrepo-webapp/src/main/resources/spring/jms.xml contains the bean definitions used by the repository for messaging. Currently the DefaultMessageFactory is used to implement messaging:
Code Block | ||
---|---|---|
| ||
<bean class="org.fcrepo.jms.headers.DefaultMessageFactory"/> |
...
standalone search applications
fcrepo-message-consumer currently supports the following triplestores:
- Jena Fuseki (Fuseki setup instructions)
- Sesame (Sesame setup instructions)
Tip |
---|
See the External Triplestore page for more details on the triplestore setup. |
fcrepo-message-consumer currently supports the following indexer:
Tip |
---|
See the Solr Indexing Quick Guide to get quickly up and running with a Fedora 4 Solr integration. |
Install and configure fcrepo-message-consumer
The fcrepo
...
-message-consumer project includes software for a web service that sits between your Fedora 4 repository and an external search service. As its name implies, it is a generic framework that allows for easy extension for integrating unanticipated or proprietary services with the Fedora 4 repository. There are proof-of-concept implementations for Jena Fuseki, Sesame and Apache Solr.
The following GitHub page has detailed instructions for setting up fcrepo-message-consumer. This standalone application listens for messages produced by fcrepo4 and invokes the search applications as configured:
https://github.com/fcrepo4/fcrepo-message-consumer
Load an LDPATH program
The following is an example of loading an LDPATH program called "custom".
...
Mark a resource as indexable and assign an appropriate indexing transformation
For a resource to be indexed it must:
- have the rdf type http://fedora.info/definitions/v4/indexing#indexable
- (optionally) have the property http://fedora.info/definitions/v4/
...
- indexing#hasIndexingTransformation set to a registered index transformation
Tip | ||
---|---|---|
| ||
A default indexing transformation exists that maps the appropriate properties to the field names "title", "uuid" and "id". To meet your needs, you can write and register custom indexing transformations. |
Create new
Note that for Solr indexing, the field names (such as id, title, and uuid) must match the fields defined in the Solr schema.xml (see the Solr documentation: https://cwiki.apache.org/confluence/display/solr/Solr+Field+Types). One recommended schema.xml is provided by hydra-jetty (https://github.com/projecthydra/hydra-jetty/blob/master/solr/development-core/conf/schema.xml), which has a robust set of default dynamic fields.
...
objects with indexing properties
For an object to be indexed, it must have an rdf:type of indexing:indexable and, optionally, an indexing:hasIndexingTransformation property naming a registered LDPATH program.
Code Block | ||
---|---|---|
| ||
curl -X PATCH -H "Content-Type: application/sparql-update" --data-binary "@object.rdf" "http://localhost:8080/rest/indexableObject"
object.rdf:
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX indexing: <http://fedora.info/definitions/v4/indexing#>
DELETE { }
INSERT {
  <> indexing:hasIndexingTransformation "default";
     rdf:type indexing:indexable;
     dc:title "This title will show up in the index." }
WHERE { } |
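When many objects need to be marked, the request body can be generated from a script instead of being written by hand. The following is a minimal sketch, assuming a POSIX shell and curl. The function names `write_index_update` and `mark_indexable` are hypothetical, and the resource URL is whichever object you want indexed.

```shell
#!/bin/sh
# Hypothetical helper: writes a SPARQL-Update body that marks a resource as
# indexable. $1 = output file, $2 = transformation name, $3 = dc:title value.
# Assumption: $2 and $3 contain no double quotes or backslashes (no escaping
# is performed here).
write_index_update() {
    cat > "$1" <<EOF
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX indexing: <http://fedora.info/definitions/v4/indexing#>
DELETE { }
INSERT {
  <> indexing:hasIndexingTransformation "$2" ;
     rdf:type indexing:indexable ;
     dc:title "$3" .
}
WHERE { }
EOF
}

# Hypothetical helper: sends the update in file $2 to the resource URL $1.
# Defining the function makes no network request; only calling it does.
mark_indexable() {
    curl -X PATCH -H "Content-Type: application/sparql-update" \
         --data-binary "@$2" "$1"
}
```

For example, `write_index_update object.rdf default "This title will show up in the index."` followed by `mark_indexable http://localhost:8080/rest/indexableObject object.rdf` would send a body like the one shown above to a locally running repository.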