...

About This Service
New Features in Version 2.1
New Features in Version 2.0
Installation
Configuring the Service
Configuring the Service for Automatic Updates
Configuring Fedora for Automatic Updates
Additional Information

About This Service

The Fedora Generic Search Service, abbreviated GSearch, is part of the Fedora Service Framework. It was developed by Gert Schmeltz Pedersen at the Technical University of Denmark, with feedback and contributions from members of the Fedora community, including Beth Kirschner, Binaya Poudyal, Blake Anderson, Boon Low, Christian Tønsberg, Eric Brown, Jun Yamog, Junran Lei, Leire Urcelay, Luis Zorita, Matt Zumwalt, Matthias Razum, Michael Appleby, Michael Hoppe, Nikolai Schwertner, Patrick Monbaron, Pierre-Yves Landron, Ranju Upadhyaya, Robert Sherratt, Ryan E. Scherle, Sam Liberman, Shunde Zhang, Steve DiDomenico, Thierry Michel, and Xinjian Guo.

...

GSearch may run in a separate web server, may index more than one Fedora repository, and may update more than one index in parallel. For further architectural details, see Additional Information.

New Features in Version 2.1

  • Fedora 3.0 compatibility
  • Added an update listener which uses the Fedora Messaging Client to listen for updates being performed through API-M. These update messages contain the information needed to perform index updates, thereby keeping GSearch up-to-date with the Fedora repository.
  • Enhanced the sortFields parameter to gfindObjects for Lucene, so that search results can be sorted by a custom Comparator class. See the index.properties file in configTestOnLucene and the test class dk.defxws.fedoragsearch.test.ComparatorSourceTest.
  • Enhanced the fromFoxmlFiles action of updateIndex for Lucene, so that indexing is attempted for every file even if one or more files fail; failures are reported in log messages. Previously, a single failure aborted the whole operation.

New Features in Version 2.0

  • Added a plugin for the Apache Solr search server.
  • Simplified configuration: you need only edit one file of property values, then run it with ant.
  • Updated to Lucene version 2.3.0.
  • Added params to indexing in the format: ...&indexDocXslt=xslt-name[(paramname1=value1[,paramname2=value2,...])] To use the parameters at indexing time, put an xsl:param statement in the indexing xslt stylesheet, like this: <xsl:param name="someparamname" select="defaultvalue"/>
  • Added optimize options for Lucene indexing:
    fgsindex.mergeFactor and fgsindex.maxBufferedDocs affect performance; see the index.properties file in configTestOnLucene. Also added ...?operation=updateIndex&action=optimize, which calls IndexWriter.optimize() to merge all segments into a single segment, optimizing the index for search. Removed the optimize() call after each updateIndex.
  • Added untokenizedFields property to Lucene index.properties files. Adding the property with a list of all untokenized fields will ensure that they all select the appropriate analyzer.
  • Added a sortFields parameter to gfindObjects for Lucene, sorting search results as specified, see the index.properties file in configTestOnLucene.
  • Added properties snippetBegin and snippetEnd, making highlight code configurable, see the index.properties file in configTestOnLucene.
  • Added property for custom URIResolver used by xslt transformers for basic auth and SSL, see the example dk.defxws.fedoragsearch.server.URIResolverImpl class and the index.properties file in configTestOnLucene.
  • Removed encoding of special characters in indexFields. Snippets now show special characters without modification. Indexes should be reindexed.
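
The two request formats introduced above (the indexDocXslt parameter list and the optimize action) can be assembled programmatically. The following is a minimal Python sketch, not part of GSearch: the helper names and the stylesheet name myIndexXslt are hypothetical, and it assumes the service accepts percent-encoded parentheses, commas, and equals signs in query values.

```python
from urllib.parse import urlencode


def index_doc_xslt(xslt_name, params=None):
    """Format an indexDocXslt value as xslt-name[(name1=value1[,name2=value2,...])]."""
    if not params:
        return xslt_name
    pairs = ",".join(f"{name}={value}" for name, value in params.items())
    return f"{xslt_name}({pairs})"


def update_index_query(action, **extra):
    """Build an updateIndex query string, e.g. operation=updateIndex&action=optimize."""
    query = {"operation": "updateIndex", "action": action}
    query.update(extra)
    # urlencode percent-encodes reserved characters in the values (assumption:
    # the server decodes them before interpreting the parameter list).
    return urlencode(query)


if __name__ == "__main__":
    print(update_index_query("optimize"))
    # "myIndexXslt" and "someparamname" are hypothetical names for illustration.
    value = index_doc_xslt("myIndexXslt", {"someparamname": "x"})
    print(update_index_query("fromFoxmlFiles", indexDocXslt=value))
```

Append the resulting query string after the "?" of the REST URL of your GSearch installation.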

Installation

To install the service:

  • Deploy fedoragsearch.war to the webapps directory of your web server, e.g. the Tomcat supplied with Fedora, or similar. To give the webapp another name, rename the .war file before you copy it into the webapps directory.
  • Edit the configuration settings.

The SOAP service operations are deployed with the .war file, and the .wsdl file is available here.

Configuring the Service

  • Edit the property values in the configvalues.xml file in .../webapps/<WEBAPPNAME>/ (where <WEBAPPNAME> by default is 'fedoragsearch'):
    o Set the property values for your environment.
    o Select the default config in configDefault.
    o Save this edited file outside of the web server.
    o After deployment, run the configOnWebServer target from the command line:
    >ant -f configvalues.xml configOnWebServer
    This will set your values into fedoragsearch.properties, repository.properties, and index.properties. Review these files to make sure they are correct.
  • Then you may restart <WEBAPPNAME> and call http://<HOSTPORT>/<WEBAPPNAME>/rest in order to index and search. The name "rest" may be reconfigured in .../webapps/<WEBAPPNAME>/WEB-INF/web.xml.
  • Try the demo command line client: change directory to .../webapps/<WEBAPPNAME>/client/, make runRESTClient.sh executable, and run sh runRESTClient.sh to display the usage instructions.
  • Tailor the demos for your own purpose by editing renamed copies of the demo xslt stylesheets in .../webapps/<WEBAPPNAME>/WEB-INF/classes/config/rest/ and then editing fedoragsearch.properties.
  • Tailor the demo Lucene indexing stylesheet for your own purpose by editing a renamed copy of the demo xslt stylesheet in .../webapps/<WEBAPPNAME>/WEB-INF/classes/config/index/<INDEXNAME>/demoFoxmlToLucene.xslt and then editing index.properties. For the sake of the example, the stylesheet indexes only active Fedora objects whose PID starts with "demo". The tailoring options include fields from metadata datastreams other than DC, field types, and field boosts; see the stylesheet for details.
  • For your real applications, you must carefully edit all stylesheets for your purpose.
  • Inspect the Lucene index with Luke. Note that Luke cannot open an empty Lucene index.
  • You may tailor the highlight of search terms in demo.css.
  • You may want to experiment with more than one configuration, in which case you may maintain them under different names in parallel to the "config" configuration, which is the default configuration. In order to activate an alternative configuration you may use the semi-secret operation configure with parameter configName, either using the demo command line client or the REST interface.
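
The REST endpoint and the configure operation described above can be addressed as in the following minimal Python sketch. It is illustrative only: the helper names are hypothetical, and it assumes that configure is selected with an operation query parameter (as in the updateIndex examples) with configName passed alongside it.

```python
from urllib.parse import urlencode


def rest_url(hostport, webappname="fedoragsearch", rest_name="rest"):
    """Return http://<HOSTPORT>/<WEBAPPNAME>/rest; "rest" may be renamed in web.xml."""
    return f"http://{hostport}/{webappname}/{rest_name}"


def configure_url(hostport, config_name, webappname="fedoragsearch"):
    """Return a URL that would activate an alternative configuration.

    Assumption: the operation is passed as ?operation=configure&configName=...
    """
    query = urlencode({"operation": "configure", "configName": config_name})
    return f"{rest_url(hostport, webappname)}?{query}"


if __name__ == "__main__":
    print(rest_url("localhost:8080"))
    print(configure_url("localhost:8080", "configTestOnLucene"))
```

Running the script prints the two URLs; paste them into a browser or an HTTP client pointed at your own host and port.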

Configuring the Service for Automatic Updates

As of version 2.1, GSearch has the ability to listen to update messages provided by Fedora. These messages are sent via JMS, so a JMS provider must be available (a JMS provider is included with Fedora 3.0). In order to configure the update listener, open updater.properties and set the following property values. These values will most likely be the same as those specified in your Fedora configuration.

...

If you decide not to use the automatic updates feature in GSearch, you'll need to open fedoragsearch.properties and remove (or comment out) the line specifying fedoragsearch.updaternames. This will disable the update listener.
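
Commenting out the property can be scripted. The sketch below is a hypothetical helper, not part of GSearch: it prefixes the matching line with "#", the comment marker for Java properties files, and it assumes the property sits on a single line with no backslash continuation. The value Updater1 in the example is invented for illustration.

```python
def disable_property(properties_text, key):
    """Comment out every active line that defines `key` in .properties text."""
    result = []
    for line in properties_text.splitlines(keepends=True):
        stripped = line.lstrip()
        # The key ends at the first "=" or ":" separator.
        name = stripped.split("=", 1)[0].split(":", 1)[0].strip()
        if name == key and not stripped.startswith(("#", "!")):
            result.append("# " + line)
        else:
            result.append(line)
    return "".join(result)


if __name__ == "__main__":
    # "Updater1" is a made-up value; use the text of your own properties file.
    sample = "a=1\nfedoragsearch.updaternames = Updater1\nb=2\n"
    print(disable_property(sample, "fedoragsearch.updaternames"), end="")
```

Apply it to the contents of fedoragsearch.properties and write the result back, keeping a backup of the original file.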

Configuring Fedora for Automatic Updates

Fedora 3.0 added the ability to send a message whenever a change is made to the content of the repository (through API-M). This messaging capability must be enabled and configured to work properly. See the Fedora documentation for instructions on configuring messaging.

...

  • gSearchRESTURL - The REST endpoint for GSearch, for example, http://localhost:8080/fedoragsearch/rest
  • gSearchUsername - If GSearch is protected by authentication, this is the username that Fedora should use to authenticate.
  • gSearchPassword - The password for the above user, if applicable.
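
To check that the three values above work together, a client can send an authenticated request to the endpoint. The Python sketch below only builds such a request. It assumes HTTP Basic authentication, which this page does not confirm; the function name and the credentials are hypothetical.

```python
import base64
import urllib.request


def gsearch_request(rest_url, query, username=None, password=None):
    """Build (but do not send) a GET request for the GSearch REST endpoint.

    Assumption: GSearch, if protected, uses HTTP Basic authentication.
    """
    request = urllib.request.Request(f"{rest_url}?{query}")
    if username is not None:
        credentials = f"{username}:{password or ''}".encode("utf-8")
        token = base64.b64encode(credentials).decode("ascii")
        request.add_header("Authorization", f"Basic {token}")
    return request


if __name__ == "__main__":
    # "user" and "pass" are placeholder credentials.
    req = gsearch_request(
        "http://localhost:8080/fedoragsearch/rest",
        "operation=updateIndex&action=optimize",
        "user",
        "pass",
    )
    print(req.full_url)
```

Pass the returned object to urllib.request.urlopen to actually send it against a running installation.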

Additional Information

Search Engine Plugins
Architectural Snapshots
Multilingual Configuration

...