...
The indexer can have any number of workers configured to process the events. The main indexer process retrieves the object RDF from the repository once, and that content is then reused by every worker. If you want to process the events in several ways (triplestore, Solr, archive to disk, update a remote repository, etc.), the metadata is retrieved from the repository only once each time the object is updated, no matter how many workers are configured.
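The fetch-once, fan-out behaviour described above can be sketched roughly as follows. This is only an illustration, not the indexer's actual code: `Worker`, `FanOutIndexer`, and the `fetcher` function are hypothetical names, and the function stands in for the HTTP request to the repository.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

class FanOutDemo {

    // Hypothetical worker contract (an assumption, not the project's interface).
    interface Worker {
        void update(String id, String rdf);
    }

    // Retrieves the RDF once per event and hands the same content to every worker.
    static class FanOutIndexer {
        private final Function<String, String> fetcher; // stand-in for the repository call
        private final List<Worker> workers;
        private int fetchCount = 0;

        FanOutIndexer(Function<String, String> fetcher, List<Worker> workers) {
            this.fetcher = fetcher;
            this.workers = workers;
        }

        void onObjectUpdated(String id) {
            String rdf = fetcher.apply(id); // single retrieval per update event
            fetchCount++;
            for (Worker worker : workers) {
                worker.update(id, rdf);     // same content reused by each worker
            }
        }

        int getFetchCount() {
            return fetchCount;
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        Worker disk = (id, rdf) -> log.add("disk:" + id);
        Worker triplestore = (id, rdf) -> log.add("triplestore:" + id);

        FanOutIndexer indexer = new FanOutIndexer(
                id -> "<" + id + "> a <urn:example:Object> .", // fake RDF for the demo
                Arrays.asList(disk, triplestore));

        indexer.onObjectUpdated("obj1");
        // prints: [disk:obj1, triplestore:obj1] fetches=1
        System.out.println(log + " fetches=" + indexer.getFetchCount());
    }
}
```

Adding a third worker to the list would not change the number of repository requests, only the number of `update` calls.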
Configuration
The indexer is configured using Spring. Here is a sample configuration fragment showing two workers (saving RDF to disk, and syncing to a Jena Fuseki triplestore) and the framework for listening to events and connecting them with the workers:
...
    <!-- Worker #1: Copy object RDF to a Sesame triplestore using SPARQL Update -->
    <bean id="sparqlUpdate" class="org.fcrepo.indexer.SparqlIndexer">
      <!-- base URL for triplestore subjects, PID will be appended -->
      <property name="prefix" value="http://localhost:${test.port:8080}/rest/objects/"/>
      <property name="queryBase" value="http://localhost:8081/openrdf-sesame/repositories/test"/>
      <property name="updateBase" value="http://localhost:8081/openrdf-sesame/repositories/test/statements"/>
      <property name="formUpdates">
        <value type="java.lang.Boolean">true</value>
      </property>
    </bean>
Extending the Indexer
To implement a new kind of indexer:
- Implement the indexing functionality using the org.fcrepo.indexer.Indexer interface, which consists of only two methods (one to handle new/updated records, and another to handle deleted records). Any configuration required should be done using Java bean setter methods.
- Update the Spring configuration to add a bean referencing the new class and providing the configuration properties needed.
- Add the bean to the list of workers invoked by the indexer.
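As a rough illustration of the first step, the sketch below implements a two-method indexer that keeps records in memory and is configured through a bean setter. The `SketchIndexer` interface is a hypothetical stand-in declared here so that the example compiles on its own; the method names and signatures of the real `org.fcrepo.indexer.Indexer` interface may differ, so check it before copying this shape. `InMemoryIndexer`, `setPrefix`, and `snapshot` are likewise invented for the example.

```java
import java.util.Map;
import java.util.TreeMap;

class InMemoryIndexerDemo {

    // Hypothetical stand-in for org.fcrepo.indexer.Indexer (signatures are assumptions).
    interface SketchIndexer {
        void update(String id, String rdf); // new or updated record
        void remove(String id);             // deleted record
    }

    // Toy worker that stores the latest RDF for each object in a map.
    static class InMemoryIndexer implements SketchIndexer {
        private String prefix = "";
        private final Map<String, String> store = new TreeMap<>();

        // Bean-style setter, so Spring could inject the value as a <property>.
        public void setPrefix(String prefix) {
            this.prefix = prefix;
        }

        @Override
        public void update(String id, String rdf) {
            store.put(prefix + id, rdf);
        }

        @Override
        public void remove(String id) {
            store.remove(prefix + id);
        }

        // Defensive copy so that callers cannot modify the internal state.
        public Map<String, String> snapshot() {
            return new TreeMap<>(store);
        }
    }

    public static void main(String[] args) {
        InMemoryIndexer indexer = new InMemoryIndexer();
        indexer.setPrefix("http://localhost:8080/rest/objects/");
        indexer.update("obj1", "<urn:example:s> <urn:example:p> <urn:example:o> .");
        System.out.println(indexer.snapshot().keySet());
        indexer.remove("obj1");
        System.out.println(indexer.snapshot().isEmpty());
    }
}
```

Because all configuration goes through setters, a class of this shape could be declared as a Spring bean with a `prefix` property, in the same way as the `sparqlUpdate` bean in the sample configuration.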
Trying Out the Indexer
To get hands-on experience with the indexer and see updates synced with an external triplestore, you need three components. Each component will potentially run in its own application container. The three components are:
...
The triplestore and Fedora 4 do not need to be aware of each other or of the jms-event-listener. However, the event listener needs to know the web endpoints of both the triplestore and Fedora 4, so it is important that you start the three components on different ports.
Instructions on how to start up and configure the three components follow:
1. Triplestore
- The easiest to set up is Jena Fuseki (Fuseki setup instructions).
- Alternatively, you can set up Sesame (Sesame setup instructions).
2. Fedora Repository
You can deploy Fedora 4 either by downloading the latest war file and dropping it into an application container (e.g. Tomcat7), or by cloning the Git fcrepo4 project and running the fcrepo-webapp directly within the code base.
See the following pages for details on either approach:
3. JMS Event Indexer
You can deploy the jms-event-listener either by downloading the latest war file and dropping it into an application container (e.g. Tomcat7), or by cloning the Git fcrepo-jms-indexer-pluggable project and running the fcrepo-jms-indexer-webapp directly within the code base.
...