Old Release

This documentation covers an old version of Fedora.

This guide will help you get up and running with a Fedora 4 instance whose updates are automatically indexed in Solr. It glosses over many of the details and should be considered a starting point for testing this feature. It assumes a POSIX operating system with cURL, a text editor, Java, Git, and a download of Apache Solr 4.6.0.

The pattern for integrating Solr with Fedora 4 is to take advantage of the messages that Fedora 4 emits after every change to resources within the repository. The assumption is that you have three components running independently, in a completely decoupled fashion:

  1. Fedora 4
  2. Solr
  3. Indexing service

The Indexing service:

  1. Consumes the messages emitted by Fedora 4
  2. Makes a request back to Fedora 4 to get a transformed representation of the Fedora 4 resource, then
  3. Submits the transformed resource to Solr for indexing
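
Steps 2 and 3 above can be approximated by hand with cURL, which is a convenient way to test the round trip before the Indexing service is installed. The sketch below is not the Indexing service itself. It assumes that Fedora 4 serves its REST API under http://localhost:8080/rest, that your Fedora 4 version offers the fcr:transform/default endpoint, and that Solr runs the example collection1 core. The function names and the resource path are hypothetical, and the transformed JSON may need reshaping to match your Solr schema.

```shell
#!/bin/sh
# Assumed base URLs; adjust both to match your deployment.
FEDORA_BASE="http://localhost:8080/rest"
SOLR_CORE="http://localhost:8983/solr/collection1"

# Build the URL of the transformed (JSON) view of a Fedora resource.
# $1 is the resource path relative to the REST root, e.g. "my/object".
transform_url() {
  printf '%s/%s/fcr:transform/default\n' "$FEDORA_BASE" "${1#/}"
}

# Build the Solr update URL; commit=true makes the change searchable at once.
solr_update_url() {
  printf '%s/update?commit=true\n' "$SOLR_CORE"
}

# Step 2, then step 3: fetch the transformed resource and post it to Solr.
# Stops early if Fedora does not return the document.
index_resource() {
  doc=$(curl -sf -H 'Accept: application/json' "$(transform_url "$1")") || return 1
  printf '%s' "$doc" |
    curl -sf -X POST -H 'Content-Type: application/json' \
      --data-binary @- "$(solr_update_url)"
}
```

With both servers running, `index_resource my/object` would push one resource into Solr. The real Indexing service does the same work automatically for every message Fedora 4 emits.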

The most current documentation for the Indexing service can be found on GitHub.

 

Install and Start Fedora 4

This guide assumes Fedora 4 is running on port 8080 (with JMS listening on port 61616).
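
Before moving on, it can help to confirm that both ports are actually accepting connections. The following is a minimal sketch that assumes a netcat (`nc`) supporting the `-z` and `-w` options is installed; the helper names `port_open` and `report` are hypothetical.

```shell
#!/bin/sh
# Return 0 if something accepts TCP connections on host $1, port $2.
# -z: only probe the port, send no data; -w 2: give up after two seconds.
port_open() {
  nc -z -w 2 "$1" "$2" >/dev/null 2>&1
}

# Print a one-line status for a named service.
# $1 = label, $2 = host, $3 = port
report() {
  if port_open "$2" "$3"; then
    echo "$1: listening on $2:$3"
  else
    echo "$1: NOT reachable on $2:$3"
  fi
}
```

For the setup described here, the checks would be `report "Fedora 4 HTTP" localhost 8080` and `report "Fedora 4 JMS" localhost 61616`.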

Install, Configure and Start Solr

wget http://mirror.cogentco.com/pub/apache/lucene/solr/4.6.0/solr-4.6.0.tgz
tar -xzf solr-4.6.0.tgz

Edit solr-4.6.0/example/solr/collection1/conf/solrconfig.xml: uncomment the schemaFactory element on lines 134-137 and comment out the schemaFactory element on line 151, as shown below:

solrconfig.xml
  <!-- To enable dynamic schema REST APIs, use the following for <schemaFactory>: -->
  
       <schemaFactory class="ManagedIndexSchemaFactory">
         <bool name="mutable">true</bool>
         <str name="managedSchemaResourceName">managed-schema</str>
       </schemaFactory>
       
 <!--  When ManagedIndexSchemaFactory is specified, Solr will load the schema from
       the resource named in 'managedSchemaResourceName', rather than from schema.xml.
       Note that the managed schema resource CANNOT be named schema.xml.  If the managed
       schema does not exist, Solr will create it after reading schema.xml, then rename
       'schema.xml' to 'schema.xml.bak'. 
       
       Do NOT hand edit the managed schema - external modifications will be ignored and
       overwritten as a result of schema modification REST API calls.
       When ManagedIndexSchemaFactory is specified with mutable = true, schema
       modification REST API calls will be allowed; otherwise, error responses will be
       sent back for these requests. 
  -->
  <!-- <schemaFactory class="ClassicIndexSchemaFactory"/> -->

 

Start Solr and verify that it is running at http://localhost:8983/solr.

cd solr-4.6.0/example
java -jar start.jar
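
Once Solr is up, the mutable managed schema configured above can be exercised from the command line. The sketch below assumes the schema REST API paths of the Solr 4.x line (`/schema/fields` on the collection1 core) and the `text_general` field type from the example schema. The field name `dc_title_example` and the helper names are hypothetical.

```shell
#!/bin/sh
# Assumed core URL for the example configuration.
SOLR_CORE="http://localhost:8983/solr/collection1"

# Build the JSON body for one new field: $1 = field name, $2 = field type.
field_json() {
  printf '[{"name":"%s","type":"%s","stored":true,"indexed":true}]\n' "$1" "$2"
}

# List the fields Solr currently knows about (read-only).
list_fields() {
  curl -sf "$SOLR_CORE/schema/fields?wt=json"
}

# Add a field through the schema REST API. This is expected to succeed
# only when the managed schema is mutable, as set in solrconfig.xml.
add_field() {
  field_json "$1" "$2" |
    curl -sf -X POST -H 'Content-Type: application/json' \
      --data-binary @- "$SOLR_CORE/schema/fields"
}
```

Running `add_field dc_title_example text_general` followed by `list_fields` should show the new field if the schemaFactory change took effect. An error response suggests that Solr is still using ClassicIndexSchemaFactory.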

Install and Start the Indexing Service

See the Indexing service documentation on GitHub.


2 Comments

  1. Hi, previously it was mentioned that a message service should be there to communicate. Is the indexing service similar to that?

    So is it now better to use the camel-toolbox indexing service than the previously mentioned fcrepo-message-consumer? I have also tried fcrepo-message-consumer. As far as I know, I have given all the correct settings, but the Fedora data is not showing up in Solr. OK, then I will try this. Thanks for the updated post.

     

    1. Yes, fcrepo-camel-toolbox is the integration you need.