Comment: Migrated to Confluence 4.0

...

Solr in DSpace

What is Solr: http://lucene.apache.org/solr/features.html

DSpace uses Solr as part of Discovery, as an index that speeds up access to content metadata and to data about accesses to DSpace (for statistics). Solr also provides faceting and filtering of search results. If Discovery is enabled, the DSpace search field accepts Solr search syntax.
Discovery has been an optional part of DSpace since 1.7 (with major improvements and configuration format changes in 1.8). When enabled, Discovery replaces DSpace Search and Browse and provides Solr-based statistics.

Info: Do I need to read this page?

To gain the benefits of faceting and filtering in XMLUI, all you need to do is enable Discovery. The rest of this page describes some advanced uses of Solr, such as querying Solr directly for theme customization or reading DSpace metadata from outside DSpace.

Connecting to Solr

...

By default, the DSpace Solr server is configured to listen only on localhost, port 8080 (unless you specified another port in the Tomcat configuration and in the [dspace]/config/modules/discovery.cfg config file). That means that you cannot connect from another machine to port 8080 of the DSpace server and request a Solr URL - you'll get an HTTP 403 error. This restriction exists for security reasons: the Solr index contains some data that is not accessible via the public DSpace interfaces, and some of that data might be sensitive.
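
To check from a script whether a given machine is allowed to reach Solr, one option is to request a Solr URL and look at the HTTP status code. The following Python sketch is only an illustration: the function name solr_status and its fetch parameter are introduced here (fetch exists so the function can be tried without a live server), and the /solr/search/select path in the usage note is an assumption about a typical installation, not something stated on this page.

```python
import urllib.error
import urllib.request


def solr_status(url, fetch=urllib.request.urlopen):
    """Return the HTTP status code of a GET request to `url`.

    `fetch` defaults to urllib.request.urlopen. It is a parameter only so
    that the function can be exercised without a running Solr server.
    """
    try:
        with fetch(url) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 4xx/5xx answers. A 403 here most
        # likely means the request did not originate from localhost.
        return err.code
```

For example, solr_status("http://mydspace.edu:8080/solr/search/select?q=*:*") run from another machine would be expected to return 403 under the default configuration, while the same request against localhost on the DSpace server itself should return 200. Connection failures (server down, port closed) are not handled and surface as urllib.error.URLError.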

Bypassing localhost restriction temporarily

...

  1. OpenSSH client - port forwarding
    connect to DSpace server and forward its port 8080 to localhost (machine we're connecting from) port 1234
    Code Block
    ssh -L 1234:127.0.0.1:8080 mydspace.edu
    makes mydspace.edu:8080 accessible via localhost:1234 (type http://localhost:1234 in the browser address bar); also opens an ssh shell
    exit ssh to terminate port forwarding
    Alternatively:
    Code Block
    ssh -N -f -L 1234:127.0.0.1:8080 mydspace.edu
    run with -N and -f flags if you want ssh to go to background
    kill the ssh process to terminate port forwarding
  2. Putty client - port forwarding
    The same with Putty:
    Code Block
    Connection - SSH - Tunnels
    Source port: 1234
    Destination: localhost:8080
    Local
    Auto
    Add
    
  3. OpenSSH client - SOCKS proxy
    connect to the DSpace server and run a SOCKS proxy server on localhost port 1234; configure your browser to use localhost:1234 as a SOCKS proxy and remove "localhost" and "127.0.0.1" from the addresses that bypass this proxy
    all browser requests now originate from the DSpace server (the source IP is the DSpace server's IP) - the DSpace server acts as the proxy server
    type http://localhost:8080 in the browser address bar - localhost here is the DSpace server
    Code Block
    ssh -D 1234 mydspace.edu
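
Once port forwarding from options 1 or 2 is active, a script on the local machine can query Solr through the forwarded port just as a browser would. The Python sketch below assumes local port 1234 as in the commands above; the /solr/search/select path, the function names and the use of the numFound field of Solr's JSON response are illustrative assumptions to adapt to your installation. It does not cover the SOCKS proxy variant, since the Python standard library has no built-in SOCKS support.

```python
import json
import urllib.parse
import urllib.request


def tunnel_query_url(query, local_port=1234, core="search"):
    """Build a Solr select URL that goes through the forwarded local port."""
    # urlencode escapes characters such as '*' and ':' in the query.
    params = urllib.parse.urlencode({"q": query, "wt": "json"})
    return f"http://localhost:{local_port}/solr/{core}/select?{params}"


def count_matches(query, local_port=1234):
    """Return the number of documents matching `query`.

    Requires the SSH port forwarding to be running; otherwise urlopen
    raises urllib.error.URLError.
    """
    with urllib.request.urlopen(tunnel_query_url(query, local_port)) as resp:
        return json.load(resp)["response"]["numFound"]
```

With the tunnel open, count_matches("*:*") would report how many documents the assumed search core contains.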

...

(depending on your OS, Tomcat installation method and logging settings, the path may be different)

Solr responses

By default, Solr responses are returned in XML format. However, Solr can provide several other output formats, including JSON and CSV. Discovery uses the javabin format. The output format is selected with the wt request parameter (e.g. &wt=json). For more information, see Response Writers (http://lucidworks.lucidimagination.com/display/solr/Response+Writers) and QueryResponseWriters (http://wiki.apache.org/solr/QueryResponseWriters).
An interesting option is to specify an XSLT stylesheet that transforms the XML response (server-side) to any format you choose, typically HTML. Append &wt=xslt&tr=example.xsl to the Solr request URL. The .xsl files must be provided in the [dspace]/solr/search/conf/xslt/ directory.
For more information, see XsltResponseWriter (http://wiki.apache.org/solr/XsltResponseWriter).
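
The wt and tr parameters described above can be added to a request URL programmatically. The following Python sketch shows one way to do so; the helper name solr_select_url and the base URL used in the usage note are hypothetical and only serve as an illustration.

```python
import urllib.parse


def solr_select_url(base, query, wt=None, tr=None):
    """Build a Solr select URL.

    `wt` selects the response writer (for example "json" or "csv").
    `tr` names an XSLT stylesheet; it only makes sense with wt=xslt,
    so passing `tr` forces wt to "xslt".
    """
    params = {"q": query}
    if wt is not None:
        params["wt"] = wt
    if tr is not None:
        params["wt"] = "xslt"
        params["tr"] = tr
    return base.rstrip("/") + "/select?" + urllib.parse.urlencode(params)
```

For instance, solr_select_url("http://localhost:8080/solr/search", "*:*", tr="example.xsl") yields a URL ending in q=%2A%3A%2A&wt=xslt&tr=example.xsl, where %2A%3A%2A is the URL-encoded form of *:*.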

Examples

Date of last deposited item

...

Furthermore, we don't want to hardcode the http://localhost:8080 Solr URL, because it can be changed in the config file, and that would break the template. So we'll call a Java function from XSLT to retrieve the configured Solr URL. See the complete example in the next section.

...

...