Info

This guide helps you get up and running with a Fedora 4 instance whose updates are automatically indexed in a Solr repository. It glosses over many of the details and should be considered a starting point for testing this feature. It assumes a POSIX operating system with cURL, a text editor, Java, Git, and a download of Apache Solr 4.10.3.

Versions

...

Install and Start Fedora 4

Assumptions

...

  • Fedora 4 is running on port 8080 at context "fcrepo" (with JMS events published at port 61616)
  • Your Fedora instance has the transform service enabled. Since the transform service is not available in the core Fedora webapp, you will likely need to use the Fedora Webapp Plus.

Verify

  1. You should be able to view Fedora in a web browser at the following URL: http://localhost:8080/fcrepo/rest
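The same check can be scripted from a terminal. The following is a minimal sketch that assumes the host, port, and context listed under Assumptions; `fedora_rest_url` and `check_fedora` are hypothetical helper names, not part of Fedora.

```shell
#!/usr/bin/env bash
# Sketch: check that the Fedora REST endpoint answers.
# fedora_rest_url and check_fedora are hypothetical helper names.

# Build the REST URL; the defaults follow the assumptions in this guide.
fedora_rest_url() {
  local host="${1:-localhost}" port="${2:-8080}" context="${3:-fcrepo}"
  printf 'http://%s:%s/%s/rest\n' "$host" "$port" "$context"
}

# Succeeds (exit status 0) only when the endpoint returns HTTP 200.
check_fedora() {
  local status
  status="$(curl -s -o /dev/null -w '%{http_code}' "$(fedora_rest_url "$@")")"
  [ "$status" = "200" ]
}
```

For example, `check_fedora && echo "Fedora is up"`. If your repository requires authentication, the endpoint will likely answer with a different status code and this check will fail.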

Install, Configure and Start Solr

Download Solr

Code Block
languagebash
wget http://archive.apache.org/dist/lucene/solr/4.10.3/solr-4.10.3.tgz
tar -xzvf solr-4.10.3.tgz

...

The location of your untarred Solr installation will be hereinafter referenced as $SOLR_HOME.

Update Solr schema

Code Block
languagebash
wget https://raw.githubusercontent.com/fcrepo4-exts/fcrepo4-vagrant-base-box/master/config/schema.xml
cp schema.xml $SOLR_HOME/example/solr/collection1/conf/

...

Start Solr

Code Block
languagebash
cd $SOLR_HOME/example
java -jar start.jar

Verify
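One way to check that Solr is running is to call its ping handler. This is a sketch under assumptions: it presumes Solr listens on its default port 8983, that the core is named collection1, and that the example configuration's `/admin/ping` handler is still enabled. `solr_ping_url` and `ping_reports_ok` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Sketch: ask Solr's ping handler whether the core is healthy.
# Base URL and core name are assumptions; adjust them to your setup.

# Build the ping URL for a core.
solr_ping_url() {
  local base="${1:-http://localhost:8983/solr}" core="${2:-collection1}"
  printf '%s/%s/admin/ping?wt=json\n' "$base" "$core"
}

# Reads a ping response on stdin; succeeds if it reports status OK.
ping_reports_ok() {
  grep -Eq '"status" *: *"OK"'
}
```

With Solr running, `curl -s "$(solr_ping_url)" | ping_reports_ok && echo "Solr is up"` should print the message.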

Install and Start Karaf

Download Karaf

Code Block
languagebash
wget http://archive.apache.org/dist/karaf/4.0.5/apache-karaf-4.0.5.tar.gz
tar xvzf apache-karaf-4.0.5.tar.gz

The location of your untarred Karaf installation will be hereinafter referenced as $KARAF_HOME.

Start Karaf

Code Block
languagebash
cd $KARAF_HOME
./bin/karaf

Verify

After running the command above

  • you should be presented with some ASCII art in your terminal and
  • you should be put into the Karaf client shell, such as:

    No Format
            __ __                  ____
           / //_/____ __________ _/ __/
          / ,<  / __ `/ ___/ __ `/ /_
         / /| |/ /_/ / /  / /_/ / __/
        /_/ |_|\__,_/_/   \__,_/_/

      Apache Karaf (4.0.2)


    karaf@root()>
  • Note: to exit the Karaf client shell, press CTRL-D. This stops the Karaf server and ends the indexing process.

  • To run Karaf as a system service, please refer to the Karaf Service Wrapper documentation.

Install, Configure and Start Fedora Camel Toolbox

Install Toolbox

In the Karaf client shell type the following:

Code Block
languagebash
feature:repo-add mvn:org.fcrepo.camel/toolbox-features/4.6.2/xml/features
feature:install fcrepo-service-activemq
feature:install fcrepo-indexing-solr

Verify - Toolbox Installation

Still in the Karaf client shell, the following command

Code Block
languagebash
feature:list|grep fcrepo

should result in both the fcrepo-camel and fcrepo-indexing-solr features being in the Started state:

No Format
fcrepo-camel                | 4.4.3 |   | Started | fcrepo-camel-4.4.3
fcrepo-indexing-solr        | 4.6.2 | x | Started | toolbox-features-4.6.2
fcrepo-ldpath               | 4.6.2 |   | Started | toolbox-features-4.6.2
fcrepo-service-ldcache-file | 4.6.2 |   | Started | toolbox-features-4.6.2
fcrepo-marmotta-osgi        | 4.6.2 |   | Started | toolbox-features-4.6.2

Warning

The fcrepo-message-consumer SolrIndexer implementation does not commit upon updates. In order to see the changes, you must configure Solr to have a commit strategy that is appropriate for your use. Resource removal events do trigger a commit.

Code Block
languagexml
titlesolrconfig.xml
firstline349
    <!-- AutoCommit

         Perform a hard commit automatically under certain conditions.
         Instead of enabling autoCommit, consider using "commitWithin"
         when adding documents.

         http://wiki.apache.org/solr/UpdateXmlMessages

         maxDocs - Maximum number of documents to add since the last
                   commit before automatically triggering a new commit.

         maxTime - Maximum amount of time in ms that is allowed to pass
                   since a document was added before automatically
                   triggering a new commit.
         openSearcher - if false, the commit causes recent index changes
           to be flushed to stable storage, but does not cause a new
           searcher to be opened to make those changes visible.

         If the updateLog is enabled, then it's highly recommended to
         have some sort of hard autoCommit to limit the log size.
      -->
     <autoCommit>
       <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
       <openSearcher>false</openSearcher>
     </autoCommit>
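Because the indexer described in the warning above does not commit on update, it can be handy during testing to trigger a commit by hand. The sketch below only builds the URL for Solr's update handler with `commit=true`; the base URL and the core name collection1 are assumptions, and `solr_commit_url` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: build the URL that asks Solr to perform an explicit commit.
# Base URL and core name are assumptions; adjust them to your setup.
solr_commit_url() {
  local base="${1:-http://localhost:8983/solr}" core="${2:-collection1}"
  printf '%s/%s/update?commit=true\n' "$base" "$core"
}
```

With Solr running, `curl -s "$(solr_commit_url)"` issues the commit. For anything beyond testing, a configured commit strategy such as the autoCommit setting is usually the better choice.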

Verify - LDPath

  1. You should be able to create and navigate to an existing Fedora resource in the web browser, for example http://localhost:8080/fcrepo/rest/collection
  2. Assuming the resource is named "collection", you should be able to verify that the LDPath service is enabled by navigating to the following URL in a web browser: http://localhost:9086/ldpath/collection
    1. You should see a JSON document such as:

      Code Block
      [{"extent":[],"references":[],"prev":[],"altLabel":[],"type":["http://fedora.info/definitions/v4/repository#Container","http://fedora.info/definitions/v4/repository#Resource","http://www.w3.org/ns/ldp#Container","http://www.w3.org/ns/ldp#RDFSource"],"narrowMatch":[],"relation":[],"accrualMethod":[],"notation":[],"id":["http://localhost:8080/fcrepo/rest/collection"],
      ...
      "lastModifiedBy":["bypassAdmin"],"prefLabel":[],"alternative":[],"label":[],"accessTo":[],"createdBy":["bypassAdmin"],"hiddenLabel":[],"comment":[],"accessRights":[],"sameAs":[]
      }]
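The LDPath check can also be run from the command line. This is a minimal sketch that assumes the LDPath service listens on port 9086 as shown above and returns compact JSON like the sample; `ldpath_url` and `has_id_for` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Sketch: fetch the LDPath output for a resource and check its "id" field.

# Build the LDPath URL for a resource path (default: "collection").
ldpath_url() {
  local path="${1:-collection}" host="${2:-localhost}" port="${3:-9086}"
  printf 'http://%s:%s/ldpath/%s\n' "$host" "$port" "$path"
}

# Reads LDPath JSON on stdin; succeeds if its "id" field holds the given URL.
# This is a plain string match, so it expects compact JSON without extra spaces.
has_id_for() {
  grep -Fq "\"id\":[\"$1\"]"
}
```

For example: `curl -s "$(ldpath_url collection)" | has_id_for "http://localhost:8080/fcrepo/rest/collection" && echo "LDPath is enabled"`.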

Configure Toolbox

The main configuration of the fcrepo-indexing-solr feature is found at: $KARAF_HOME/etc/org.fcrepo.camel.indexing.solr.cfg

You will need to make updates to this configuration file if any of the following are true:

  • Your Solr is deployed at a URL different from the one detailed earlier in this document
  • Your Fedora is deployed at a URL different from the one detailed earlier in this document
  • Your Fedora has authorization enabled, e.g. WebAC

For configuration details, please refer to the documentation on the fcrepo-indexing-solr GitHub page. If you update the $KARAF_HOME/etc/org.fcrepo.camel.indexing.solr.cfg file, you will most likely also need to update the $KARAF_HOME/etc/org.fcrepo.camel.ldpath.cfg file, particularly the sections related to Fedora location and authorization. Please refer to the fcrepo-ldpath page on GitHub for configuration details.
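Before editing either file, it can help to see which settings are currently active. The sketch below prints the non-comment, non-blank lines of a configuration file; `active_settings` is a hypothetical helper name, and no particular property names are assumed.

```shell
#!/usr/bin/env bash
# Sketch: list the active (non-comment, non-blank) lines of a .cfg file.
active_settings() {
  grep -Ev '^[[:space:]]*(#|$)' "$1"
}
```

For example: `active_settings "$KARAF_HOME/etc/org.fcrepo.camel.indexing.solr.cfg"`.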

Success

You should now be able to create/update/delete resources in your Fedora repository, and subsequently see them in your Solr index!

Resources

For debugging purposes, you may want to inspect the logs of the various applications:

  • Fedora log (unless configured otherwise): /var/log/tomcat8/catalina.out
  • Solr log: $SOLR_HOME/example/logs/solr.log
  • Karaf log: $KARAF_HOME/data/log/karaf.log
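When a log is long, a quick filter for recent problems is often enough to start with. This is a rough sketch, assuming that problem lines contain the word "error" or "exception"; `recent_problems` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: show lines mentioning errors or exceptions among the last N
# lines of a log file (N defaults to 200).
recent_problems() {
  local file="$1" lines="${2:-200}"
  tail -n "$lines" "$file" | grep -iE 'error|exception'
}
```

For example: `recent_problems "$KARAF_HOME/data/log/karaf.log"`.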

 

 

Download, Build, Configure and Start fcrepo-message-consumer

Code Block
languagebash
git clone git@github.com:fcrepo4/fcrepo-message-consumer.git

 

Edit the configuration at fcrepo-message-consumer/fcrepo-message-consumer-webapp/src/main/resources/spring/indexer-core.xml to point to your Solr installation.

Code Block
languagexml
firstline31
linenumberstrue
  <!-- Solr Indexer START-->
  <bean id="solrIndexer" class="org.fcrepo.indexer.solr.SolrIndexer">
    <constructor-arg ref="solrServer" />
  </bean>
  <!--External Solr Server  -->
  <bean id="solrServer" class="org.apache.solr.client.solrj.impl.HttpSolrServer">
    <constructor-arg index="0" value="http://${fcrepo.host:localhost}:${solrIndexer.port:8983}/solr/" />
  </bean>
  <!-- Solr Indexer END-->

  <!-- Message Driven POJO (MDP) that manages individual indexers -->
   <bean id="indexerGroup" class="org.fcrepo.indexer.IndexerGroup">
    <constructor-arg name="indexers">
      <set>
      <!--
        <ref bean="jcrXmlPersist"/>
        <ref bean="fileSerializer"/>
        <ref bean="sparqlUpdate"/> -->
        <!--To enable solr Indexer, please uncomment line below  -->
        <ref bean="solrIndexer"/>
      </set>
    </constructor-arg>

    <!-- If your Fedora instance requires authentication, enter the
         credentials here. Leave blank if your repo is open. -->
    <constructor-arg name="fedoraUsername" value="${fcrepo.username:}" /> <!-- i.e., manager, tomcat, etc. -->
    <constructor-arg name="fedoraPassword" value="${fcrepo.password:}" />
  </bean>

 

Start the application (in this case on port 9999).

Code Block
languagebash
mvn clean install -DskipTests
cd fcrepo-message-consumer-webapp
mvn -Djetty.port=9999 jetty:run

Create an Indexable resource

Code Block
languagebash
curl -v -X PUT -H "Content-Type: text/turtle" --data-binary "@object.rdf" "http://localhost:8080/fcrepo/rest/indexableObject"
Code Block
titleobject.rdf
linenumberstrue
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX indexing: <http://fedora.info/definitions/v4/indexing#>

<> indexing:hasIndexingTransformation "default" ;
   rdf:type indexing:indexable ;
   dc:title "This title will show up in the index." .

Ensure that the records are committed to Solr, either by issuing an explicit commit or by waiting until the configured commit period has elapsed, and then verify that they show up in the index.
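The final verification can be sketched as a query against Solr's select handler. The base URL with the core name collection1 is an assumption, and `solr_search` and `num_found` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Sketch: query Solr and report how many documents matched.

# Run a query against the select handler and print the JSON response.
# The default base URL (including the core name) is an assumption.
solr_search() {
  local query="$1" base="${2:-http://localhost:8983/solr/collection1}"
  curl -s -G "$base/select" \
    --data-urlencode "q=$query" \
    --data-urlencode "wt=json"
}

# Extract the numFound value from a Solr JSON response read on stdin.
num_found() {
  sed -n 's/.*"numFound" *: *\([0-9][0-9]*\).*/\1/p' | head -n 1
}
```

For example, `solr_search '*:*' | num_found` prints the number of indexed documents. A value greater than zero after creating the resource suggests that indexing and committing both worked.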