Info |
---|
This guide will help you get up and running with a Fedora 4 instance whose updates are automatically indexed in a Solr repository. It glosses over many of the details and should be considered a starting point for testing this feature. The document assumes a POSIX operating system with cURL, a text editor, Java, Git, and a download of Apache Solr 4.10.3. |
Versions
...
Install and Start Fedora 4
Assumptions
...
- Fedora 4 is running on port 8080 at context "fcrepo" (with JMS events published at port 61616)
- Your Fedora instance has the transform service enabled. Since the transform service is not available in the core Fedora webapp, you will likely need to use the Fedora Webapp Plus.
Verify
- You should be able to view Fedora in a web browser at the following URL: http://localhost:8080/fcrepo/rest
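The same check can be run from a terminal. The following is a minimal sketch using cURL; the helper names `http_status` and `is_up` are illustrative and not part of Fedora. It assumes the repository is open: an instance with authorization enabled would typically answer with 401 rather than 200.

```shell
# Print the HTTP status code returned for a URL ("000" if the request fails).
http_status() {
  curl -s -o /dev/null -w '%{http_code}' "$1" || true
}

# Succeed only when the URL answers with 200 OK.
is_up() {
  [ "$(http_status "$1")" = "200" ]
}
```

For example, `is_up http://localhost:8080/fcrepo/rest && echo "Fedora is reachable"` prints the message only when the REST endpoint responds with 200.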
Install, Configure and Start Solr
Download Solr
Code Block | ||
---|---|---|
| ||
wget http://archive.apache.org/dist/lucene/solr/4.10.3/solr-4.10.3.tgz
tar -xzvf solr-4.10.3.tgz |
...
The location of your untarred Solr installation will be hereinafter referenced as $SOLR_HOME.
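To make later commands copy-and-paste friendly, the variable can be set in the current shell session. This sketch assumes the tarball was unpacked in the current directory; `solr_home_ok` is an illustrative helper that only checks for the `start.jar` launched in the "Start Solr" step.

```shell
# Assumes solr-4.10.3.tgz was unpacked in the current directory;
# an already exported SOLR_HOME is left untouched.
SOLR_HOME=${SOLR_HOME:-$PWD/solr-4.10.3}
export SOLR_HOME

# Sanity check: the example distribution ships start.jar under example/.
solr_home_ok() {
  [ -f "$SOLR_HOME/example/start.jar" ]
}
```

Running `solr_home_ok || echo "SOLR_HOME looks wrong: $SOLR_HOME"` gives an early warning before the schema is copied into place.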
Update Solr schema
Code Block | ||
---|---|---|
| ||
wget https://raw.githubusercontent.com/fcrepo4-exts/fcrepo4-vagrant-base-box/master/config/schema.xml
cp schema.xml $SOLR_HOME/example/solr/collection1/conf/ |
...
Start Solr
Code Block | ||
---|---|---|
| ||
cd $SOLR_HOME/example
java -jar start.jar |
Verify
- Navigating to the following URL in a web browser should show the Solr administrative interface: http://localhost:8983/solr/
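Solr can take a few seconds to start, so a scripted setup may want to wait until the interface responds before continuing. The following is a rough sketch; the function name, the one-second polling interval, and the default of 30 attempts are arbitrary choices, not Solr settings.

```shell
# Poll a URL until it answers with HTTP 200, or give up after N attempts.
# $1 = URL, $2 = maximum number of attempts (defaults to 30).
wait_for_http_ok() {
  url=$1
  attempts=${2:-30}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -L follows redirects; a failed connection is treated as status 000.
    status=$(curl -s -L -o /dev/null -w '%{http_code}' "$url") || status=000
    if [ "$status" = "200" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}
```

For example, `wait_for_http_ok http://localhost:8983/solr/ 60 || echo "Solr did not come up"` blocks for up to about a minute.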
Install and Start Karaf
Download Karaf
Code Block | ||
---|---|---|
| ||
wget http://archive.apache.org/dist/karaf/4.0.5/apache-karaf-4.0.5.tar.gz
tar xvzf apache-karaf-4.0.5.tar.gz |
The location of your untarred Karaf installation will be hereinafter referenced as $KARAF_HOME.
Start Karaf
Code Block | ||
---|---|---|
| ||
cd $KARAF_HOME
./bin/karaf |
Verify
After running the command above, you should be presented with some ASCII art in your terminal and put into the Karaf client shell, such as:
No Format |
---|
        __ __                  ____
       / //_/____ __________ _/ __/
      / ,<  / __ `/ ___/ __ `/ /_
     / /| |/ /_/ / /  / /_/ / __/
    /_/ |_|\__,_/_/   \__,_/_/

  Apache Karaf (4.0.2)

karaf@root()> |
Note: to exit the Karaf client shell, type CTRL-D. This will stop the Karaf server, ending the indexing process. To run Karaf as a system service, please refer to the Karaf Service Wrapper documentation.
Install, Configure and Start Fedora Camel Toolbox
Install Toolbox
In the Karaf client shell type the following:
Code Block | ||
---|---|---|
| ||
feature:repo-add mvn:org.fcrepo.camel/toolbox-features/4.6.2/xml/features
feature:install fcrepo-service-activemq
feature:install fcrepo-indexing-solr |
Verify - Toolbox Installation
Still in the Karaf client shell, the following command
Code Block | ||
---|---|---|
| ||
feature:list|grep fcrepo |
should result in both the fcrepo-camel and fcrepo-indexing-solr features being in the Started state:
No Format |
---|
fcrepo-camel | 4.4.3 | | Started | fcrepo-camel-4.4.3 |
fcrepo-indexing-solr | 4.6.2 | x | Started | toolbox-features-4.6.2 |
fcrepo-ldpath | 4.6.2 | | Started | toolbox-features-4.6.2 |
fcrepo-service-ldcache-file | 4.6.2 | | Started | toolbox-features-4.6.2 |
fcrepo-marmotta-osgi | 4.6.2 | | Started | toolbox-features-4.6.2 |
Warning |
---|
The fcrepo-message-consumer SolrIndexer implementation does not commit upon updates. In order to see the changes, you must configure Solr to have a commit strategy that is appropriate for your use. Resource removal events do trigger a commit. |
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
<!-- AutoCommit
     Perform a hard commit automatically under certain conditions.
     Instead of enabling autoCommit, consider using "commitWithin"
     when adding documents.
     http://wiki.apache.org/solr/UpdateXmlMessages
     maxDocs - Maximum number of documents to add since the last
               commit before automatically triggering a new commit.
     maxTime - Maximum amount of time in ms that is allowed to pass
               since a document was added before automatically
               triggering a new commit.
     openSearcher - if false, the commit causes recent index changes
               to be flushed to stable storage, but does not cause a new
               searcher to be opened to make those changes visible.
     If the updateLog is enabled, then it's highly recommended to
     have some sort of hard autoCommit to limit the log size.
  -->
<autoCommit>
  <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit> |
Verify - LDPath
- You should be able to create a Fedora resource and navigate to it in the web browser, for example http://localhost:8080/fcrepo/rest/collection
- Assuming the resource is named "collection", you should be able to verify that the LDPath service is enabled by navigating to the following URL in a web browser: http://localhost:9086/ldpath/collection
You should see a JSON document such as:
Code Block |
---|
[{"extent":[],"references":[],"prev":[],"altLabel":[],"type":["http://fedora.info/definitions/v4/repository#Container","http://fedora.info/definitions/v4/repository#Resource","http://www.w3.org/ns/ldp#Container","http://www.w3.org/ns/ldp#RDFSource"],"narrowMatch":[],"relation":[],"accrualMethod":[],"notation":[],"id":["http://localhost:8080/fcrepo/rest/collection"], ... "lastModifiedBy":["bypassAdmin"],"prefLabel":[],"alternative":[],"label":[],"accessTo":[],"createdBy":["bypassAdmin"],"hiddenLabel":[],"comment":[],"accessRights":[],"sameAs":[] }] |
Configure Toolbox
The main configuration of the fcrepo-indexing-solr feature is found at: $KARAF_HOME/etc/org.fcrepo.camel.indexing.solr.cfg
You will need to make updates to this configuration file if any of the following are true:
- Your Solr is deployed at a URL different from the one detailed earlier in this document
- Your Fedora is deployed at a URL different from the one detailed earlier in this document
- Your Fedora has Authorization enabled, e.g. WebAC
For configuration details, please refer to the documentation found at the fcrepo-indexing-solr GitHub page. If you updated the $KARAF_HOME/etc/org.fcrepo.camel.indexing.solr.cfg file, it is quite likely that you will also need to update the $KARAF_HOME/etc/org.fcrepo.camel.ldpath.cfg file, particularly the sections related to Fedora location and authorization. Please refer to the fcrepo-ldpath page on GitHub for configuration details.
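Both files use a plain `key=value` layout, so edits can be scripted. The helper below is a hedged sketch, not part of the toolbox: `set_cfg_property` is a hypothetical name, and the function simply replaces (or appends) a single assignment while leaving every other line intact. Property names should be taken from the project documentation mentioned above.

```shell
# Set key=value in a properties-style file, replacing any existing
# assignment of the same key. $1 = file, $2 = key, $3 = value.
set_cfg_property() {
  file=$1
  key=$2
  value=$3
  tmp=$(mktemp) || return 1
  # Keep every line that does not start with "<key>=" (exact prefix match).
  awk -v k="$key" 'index($0, k "=") != 1' "$file" > "$tmp"
  printf '%s=%s\n' "$key" "$value" >> "$tmp"
  # Write back in place so the original file's permissions are preserved.
  cat "$tmp" > "$file"
  rm -f "$tmp"
}
```

A call might look like `set_cfg_property "$KARAF_HOME/etc/org.fcrepo.camel.indexing.solr.cfg" solr.baseUrl <your-solr-url>`, where `solr.baseUrl` is used here as an assumed key name and should be confirmed against the fcrepo-indexing-solr documentation.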
Success
You should now be able to create/update/delete resources in your Fedora repository, and subsequently see them in your Solr index!
Resources
For debugging purposes, you may want to inspect the logs of the various applications:
- Fedora log (unless configured otherwise): /var/log/tomcat8/catalina.out
- Solr log: $SOLR_HOME/example/logs/solr.log
- Karaf log: $KARAF_HOME/data/log/karaf.log
Download, Build, Configure and Start fcrepo-message-consumer
Code Block | ||
---|---|---|
| ||
git clone git@github.com:fcrepo4/fcrepo-message-consumer.git |
Edit the configuration at fcrepo-message-consumer/fcrepo-message-consumer-webapp/src/main/resources/spring/indexer-core.xml to point to your Solr installation.
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
<!-- Solr Indexer START-->
<bean id="solrIndexer" class="org.fcrepo.indexer.solr.SolrIndexer">
  <constructor-arg ref="solrServer" />
</bean>
<!--External Solr Server -->
<bean id="solrServer" class="org.apache.solr.client.solrj.impl.HttpSolrServer">
  <constructor-arg index="0" value="http://${fcrepo.host:localhost}:${solrIndexer.port:8983}/solr/" />
</bean>
<!-- Solr Indexer END-->
<!-- Message Driven POJO (MDP) that manages individual indexers -->
<bean id="indexerGroup" class="org.fcrepo.indexer.IndexerGroup">
  <constructor-arg name="indexers">
    <set>
      <!--
      <ref bean="jcrXmlPersist"/>
      <ref bean="fileSerializer"/>
      <ref bean="sparqlUpdate"/> -->
      <!--To enable solr Indexer, please uncomment line below -->
      <ref bean="solrIndexer"/>
    </set>
  </constructor-arg>
  <!-- If your Fedora instance requires authentication, enter the
       credentials here. Leave blank if your repo is open. -->
  <constructor-arg name="fedoraUsername" value="${fcrepo.username:}" /> <!-- e.g., manager, tomcat, etc. -->
  <constructor-arg name="fedoraPassword" value="${fcrepo.password:}" />
</bean> |
Start the application (in this case on port 9999).
Code Block | ||
---|---|---|
| ||
mvn clean install -DskipTests
cd fcrepo-message-consumer-webapp
mvn -Djetty.port=9999 jetty:run |
Create an Indexable resource
Code Block | ||
---|---|---|
| ||
curl -v -X PUT -H "Content-Type: text/turtle" --data-binary "@object.rdf" "http://localhost:8080/fcrepo/rest/indexableObject" |
Code Block | ||||
---|---|---|---|---|
| ||||
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX indexing: <http://fedora.info/definitions/v4/indexing#>
<> indexing:hasIndexingTransformation "default" ;
   rdf:type indexing:Indexable ;
   dc:title "This title will show up in the index." .
|
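The Turtle above is the body that the cURL command reads from object.rdf. One way to create that file from the command line is sketched below; `write_indexable_rdf` is an illustrative helper name, and the quoted heredoc keeps the shell from interpreting anything inside the RDF.

```shell
# Write the indexable-resource Turtle to the given path ($1).
write_indexable_rdf() {
  cat > "$1" <<'EOF'
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX indexing: <http://fedora.info/definitions/v4/indexing#>

<> indexing:hasIndexingTransformation "default" ;
   rdf:type indexing:Indexable ;
   dc:title "This title will show up in the index." .
EOF
}
```

Calling `write_indexable_rdf object.rdf` in the directory where the cURL command is run produces the file referenced by `--data-binary "@object.rdf"`.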
Ensure that the records are committed to Solr (either through an explicit commit or waiting until the configured commit period is up) and then verify that they show up.
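Both steps can be done with cURL. This is a minimal sketch that assumes the default `collection1` core from the Solr example distribution used earlier; the function names and the `SOLR_CORE_URL` variable are illustrative, so adjust the URL if your core lives elsewhere.

```shell
# Base URL of the Solr core; override by exporting SOLR_CORE_URL first.
SOLR_CORE_URL=${SOLR_CORE_URL:-http://localhost:8983/solr/collection1}

# Ask Solr to commit pending updates so they become searchable.
solr_commit() {
  curl -s "${SOLR_CORE_URL}/update?commit=true"
}

# Run a query ($1, e.g. '*:*') and return the JSON response.
# -G sends the url-encoded parameters as a GET query string.
solr_search() {
  curl -s -G "${SOLR_CORE_URL}/select" \
    --data-urlencode "q=$1" \
    --data-urlencode "wt=json"
}
```

Running `solr_commit` followed by `solr_search '*:*'` should list the indexed documents, including the resource created above once its update has been processed.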