...

The indexer can have any number of workers configured to process events.  The main indexer process retrieves the object RDF from the repository once, and that content is reused by every worker.  If you want to process events in several ways (triplestore, Solr, archive to disk, update a remote repository, etc.), the metadata is retrieved from the repository only once each time the object is updated, no matter how many workers are configured.
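
The fetch-once, fan-out idea described above can be sketched roughly as follows. This is only an illustration, not the indexer's actual code: `Worker`, `FanOutDispatcher`, and the `fetchRdf` function are hypothetical names standing in for the real classes and for the HTTP call to the repository.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-in for a worker; NOT the real org.fcrepo.indexer.Indexer signature.
interface Worker {
    void update(String id, String rdf);
}

// Retrieves the object RDF once per event and passes the same content to every worker.
class FanOutDispatcher {
    private final Function<String, String> fetchRdf; // assumed stand-in for the repository request
    private final List<Worker> workers;
    private int fetchCount = 0;

    FanOutDispatcher(Function<String, String> fetchRdf, List<Worker> workers) {
        this.fetchRdf = fetchRdf;
        this.workers = workers;
    }

    void onUpdate(String id) {
        String rdf = fetchRdf.apply(id); // single retrieval, regardless of worker count
        fetchCount++;
        for (Worker worker : workers) {
            worker.update(id, rdf);      // each worker reuses the already-fetched content
        }
    }

    int getFetchCount() {
        return fetchCount;
    }
}

class FanOutDemo {
    public static void main(String[] args) {
        List<Worker> workers = new ArrayList<>();
        workers.add((id, rdf) -> System.out.println("archive to disk: " + id));
        workers.add((id, rdf) -> System.out.println("sync to triplestore: " + id));

        // The lambda below fakes the repository; a real setup would issue an HTTP request.
        FanOutDispatcher dispatcher = new FanOutDispatcher(id -> "rdf-for-" + id, workers);
        dispatcher.onUpdate("obj1");
        System.out.println("repository fetches: " + dispatcher.getFetchCount());
    }
}
```

Adding a third worker to the list in this sketch changes the number of `update` calls but not the number of repository fetches, which is the property the paragraph above describes.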

Configuration

The indexer is configured using Spring.  Here is a sample configuration fragment showing two workers (saving RDF to disk, and syncing to a Jena Fuseki triplestore) and the framework for listening to events and connecting them with the workers:

...

  <!-- Worker #1: Copy object RDF to a Sesame triplestore using SPARQL Update -->
  <bean id="sparqlUpdate" class="org.fcrepo.indexer.SparqlIndexer">
    <!-- base URL for triplestore subjects, PID will be appended -->
    <property name="prefix" value="http://localhost:${test.port:8080}/rest/objects/"/>
    <property name="queryBase" value="http://localhost:8081/openrdf-sesame/repositories/test"/>
    <property name="updateBase" value="http://localhost:8081/openrdf-sesame/repositories/test/statements"/>
    <property name="formUpdates">
      <value type="java.lang.Boolean">true</value>
    </property>
  </bean>

Extending the Indexer

To implement a new kind of indexer:

  1. Implement the indexing functionality by implementing the org.fcrepo.indexer.Indexer interface, which consists of only two methods (one to handle new/updated records, and another to handle deleted records).  Any configuration the worker requires should be exposed through JavaBean setter methods so that Spring can inject it.
  2. Update the Spring configuration to add a bean referencing the new class and providing the configuration properties needed.
  3. Add the bean to the list of workers invoked by the indexer.
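
As a rough illustration of step 1, the sketch below shows the general shape of such a worker. The `RecordIndexer` interface is a hypothetical stand-in written for this example; the method names and parameters of the real `org.fcrepo.indexer.Indexer` interface are not shown on this page and may differ. `InMemoryIndexer` and its `prefix` property are likewise assumptions, with `prefix` chosen to mirror the property of the same name in the sample configuration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in: one method for new/updated records, one for deleted records.
// The real org.fcrepo.indexer.Indexer method signatures may differ.
interface RecordIndexer {
    void update(String id, String content);
    void remove(String id);
}

// Toy worker that keeps records in a map instead of writing to an external index.
class InMemoryIndexer implements RecordIndexer {
    private String prefix = "";
    private final Map<String, String> store = new HashMap<>();

    // JavaBean setter, so a Spring <property name="prefix" .../> element could configure it.
    public void setPrefix(String prefix) {
        this.prefix = prefix;
    }

    public String getPrefix() {
        return prefix;
    }

    @Override
    public void update(String id, String content) {
        store.put(prefix + id, content);   // new and updated records share one code path
    }

    @Override
    public void remove(String id) {
        store.remove(prefix + id);
    }

    public Map<String, String> getStore() {
        return store;
    }
}

class InMemoryIndexerDemo {
    public static void main(String[] args) {
        InMemoryIndexer indexer = new InMemoryIndexer();
        indexer.setPrefix("http://localhost:8080/rest/objects/");
        indexer.update("obj1", "example RDF content");
        System.out.println("after update: " + indexer.getStore().keySet());
        indexer.remove("obj1");
        System.out.println("after remove: " + indexer.getStore().keySet());
    }
}
```

Steps 2 and 3 would then declare this class as a bean in the Spring configuration, set `prefix` through a `<property>` element, and reference the bean from the indexer's list of workers.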

Trying Out the Indexer

To get hands-on experience with the indexer and see updates synced with an external triplestore, you need three components. Each component can run in its own application container. The three components are:

...

The triplestore and Fedora 4 do not need to be aware of each other or of the jms-event-listener. However, the jms-event-listener needs to know the web endpoints of both the triplestore and Fedora 4. It is therefore important that you start the three components on different ports.
Instructions on how to start up and configure the three components follow:

1. Triplestore
2. Fedora Repository

You can deploy Fedora 4 either by downloading the latest war file and dropping it into an application container (e.g. Tomcat 7), or by cloning the fcrepo4 Git project and running fcrepo-webapp directly from the code base.
See the following pages for details on either approach:

3. JMS Event Indexer

You can deploy the jms-event-listener either by downloading the latest war file and dropping it into an application container (e.g. Tomcat 7), or by cloning the fcrepo-jms-indexer-pluggable Git project and running fcrepo-jms-indexer-webapp directly from the code base.

...