The fcrepo-camel-toolbox project includes a number of production-ready services that can be used to integrate Fedora with external systems, such as Solr or a triplestore.
Deployment and configuration instructions are available in the project's README file.
Solr Indexing
The Solr indexer can optionally use XSLT to convert RDF/XML documents into Solr indexing documents. A default XSLT is provided in the default transform. It is also possible to override the default transformation by assigning an RDF property to particular documents: <> indexing:hasIndexingTransformation "file:///path/to/your/transform.xsl". Furthermore, one can choose to index only certain documents from the repository: by identifying those documents as <> a indexing:Indexable and enabling the indexable.predicate configuration value, only those resources will be indexed. (For Tomcat/Jetty-deployed applications, this can be enabled by setting JAVA_OPTS="-Dfcrepo.onlyIndexableObjects=true".)
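As an illustration, a resource could be marked as indexable with a SPARQL Update request against Fedora's REST API. The sketch below is not taken from the toolbox documentation: it assumes a Fedora instance at http://localhost:8080/rest, a hypothetical resource at /objects/example, and that the indexing prefix maps to http://fedora.info/definitions/v4/indexing#. The function names are made up for this example; adjust all of these to your deployment.

```shell
#!/bin/sh
# Assumed values; none of these come from the toolbox documentation.
FEDORA_BASE="${FEDORA_BASE:-http://localhost:8080/rest}"
RESOURCE_PATH="${RESOURCE_PATH:-/objects/example}"   # hypothetical resource

# Print a SPARQL Update body that adds the indexing:Indexable type.
# <> resolves to the resource the request is sent to.
indexable_update() {
  cat <<'EOF'
PREFIX indexing: <http://fedora.info/definitions/v4/indexing#>
INSERT { <> a indexing:Indexable . } WHERE { }
EOF
}

# Send the update to Fedora with an HTTP PATCH. Not called automatically.
mark_indexable() {
  indexable_update | curl -X PATCH \
    -H "Content-Type: application/sparql-update" \
    --data-binary @- \
    "${FEDORA_BASE}${RESOURCE_PATH}"
}
```

Calling mark_indexable sends the request; calling indexable_update on its own only prints the update body, which is useful for checking it before touching the repository.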
Triplestore Indexing
The triplestore indexing service runs just like the Solr indexing service, pushing all changes from the repository into an external triplestore. Fuseki, Sesame and BlazeGraph have been used successfully with this service. As with the Solr indexing service, it is possible to identify certain objects as "Indexable" by setting an rdf:type of indexing:Indexable. (One must also enable this filtering, as described above.)
Message Forwarding (Endpoint) Service
It is possible to configure an endpoint service that receives messages from Fedora and processes them using arbitrary rules. One advantage of this approach is that the service can be written in any language. A typical use of the service is to build enhanced Solr search records with content pulled from more than one Fedora record. For configuration details see https://github.com/fcrepo-exts/fcrepo-camel-toolbox.
Reindexing Service
Periodically, it may be necessary to reindex some or all of a repository. In certain cases, one may wish to reindex only Solr, only the triplestore, or both. The reindexing service exposes a RESTful endpoint for initiating these reindexing processes. By default, this HTTP endpoint is available at localhost:9080/reindexing (to change this, see the documentation). The endpoint accepts JSON documents like so:
curl -XPOST localhost:9080/reindexing/objects -H"Content-Type: application/json" \
-d '["activemq:queue:solr.reindex","activemq:queue:triplestore.reindex"]'
This will reindex both Solr and the external triplestore, starting at the /objects node in Fedora. To start at the root node in Fedora, you would POST to localhost:9080/reindexing/, while to start at the node /a/b/c/d, you would POST to localhost:9080/reindexing/a/b/c/d.
The values in the JSON array are used to determine which endpoints to reindex.
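For instance, to reindex only Solr, the array would contain only the Solr queue. The sketch below wraps the curl request shown earlier in small shell functions. The function names and the REINDEXING_BASE variable are hypothetical; the default address and the queue name are the ones used in the example above.

```shell
#!/bin/sh
# Base address of the reindexing service (the default given above).
REINDEXING_BASE="${REINDEXING_BASE:-localhost:9080/reindexing}"

# Build the URL for a starting node, e.g. "/a/b/c/d", or "/" for the root.
reindex_url() {
  node="${1:-/}"
  case "$node" in
    /*) ;;                 # already starts with a slash
    *)  node="/$node" ;;   # normalise "objects" to "/objects"
  esac
  printf '%s%s\n' "$REINDEXING_BASE" "$node"
}

# Turn the given endpoint names into a JSON array of strings.
endpoints_json() {
  json=""
  for endpoint in "$@"; do
    json="${json:+$json,}\"$endpoint\""
  done
  printf '[%s]\n' "$json"
}

# POST the endpoints for the given node. Usage: reindex NODE ENDPOINT...
reindex() {
  node="$1"; shift
  curl -XPOST "$(reindex_url "$node")" \
    -H "Content-Type: application/json" \
    -d "$(endpoints_json "$@")"
}

# Example (not run automatically): reindex only Solr below /a/b/c/d
# reindex /a/b/c/d activemq:queue:solr.reindex
```

Keeping the URL and JSON construction in separate functions makes it possible to print and inspect both before any request is sent.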
By sending a GET request to the reindexing service, you will retrieve a short summary of its usage.