DSpace Discovery is a Maintained Addon for DSpace XMLUI that replaces the default Search and Browse behavior with Apache Solr.



...

One thing remains unfinished: the related items. I'm still thinking about the best way to implement them with the DiscoveryQuery/DiscoveryResult objects; if anybody has suggestions, I'm always willing to listen.

New discovery configuration (unconfirmed)

The configuration for discovery is located in 2 separate files.

  • The discovery.cfg file, located in the dspace.dir/config/modules directory. This file contains general discovery settings (the location of the Solr server, which fields not to index, ...)
  • The spring-dspace-addon-discovery-configuration-services.xml file, located in the dspace.dir/config/spring directory. This is a Spring file that contains all the configuration for the user interface (sidebar facet configuration, sort options, search filters, ...)

When changes are made to one of these files, Tomcat needs to be restarted and a complete reindex of the repository is required. To do this, use the command line to navigate to the dspace directory and run the command below.

Code Block
./bin/dspace update-discovery-index -f

The general discovery settings (discovery.cfg)

The discovery.cfg file is located in the dspace.dir/config/modules directory. It contains the following properties:

Property: search.server
Example Value: http://localhost:8080/solr/search
Informational Note: Discovery relies on a Solr index. This parameter determines the location of the Solr index.

Property: search.default.sort.order
Example Value: DESC
Informational Note: The default sort order when searching in discovery. It can be either DESC or ASC.

Property: index.ignore
Example Value: dc.description.provenance,dc.language
Informational Note: A comma-separated list of the metadata fields which are not to be indexed.


The User Interface settings (spring-dspace-addon-discovery-configuration-services.xml)

The file is located in the dspace.dir/config/spring directory.

The Structure of spring-dspace-addon-discovery-configuration-services.xml

Code Block
langxml
<beans xmlns="http://www.springframework.org/schema/beans"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:context="http://www.springframework.org/schema/context"
    xsi:schemaLocation="http://www.springframework.org/schema/beans
           http://www.springframework.org/schema/beans/spring-beans-2.5.xsd
           http://www.springframework.org/schema/context
           http://www.springframework.org/schema/context/spring-context-2.5.xsd"
    default-autowire-candidates="*Service,*DAO,javax.sql.DataSource">
    <context:annotation-config /> <!-- allows us to use spring annotations in beans -->


<!--Bean that is used for mapping communities/collections to certain discovery configurations-->
<bean id="org.dspace.discovery.configuration.DiscoveryConfigurationService" class="org.dspace.discovery.configuration.DiscoveryConfigurationService">
        <property name="map">
            <map>
                <!--The map containing all the settings,
                    the key is used to refer to the page (the "site" or a community/collection handle)
                    the value-ref is a reference to an identifier of the DiscoveryConfiguration format
                    -->
                <!--The default entry, DO NOT REMOVE the system requires this-->
               <entry key="default" value-ref="defaultConfiguration" />
                        ......
            </map>
        </property>
    </bean>


<bean id="defaultConfiguration" class="org.dspace.discovery.configuration.DiscoveryConfiguration" scope="prototype">
        <!--Which sidebar facets are to be displayed-->
        <property name="sidebarFacets">
            <list>
            </list>
        </property>
        <!--The search filters which can be used on the discovery search page-->
        <property name="searchFilters">
            <list>
            ....
            </list>
        </property>
        <!--The sort filters for the discovery search-->
        <property name="searchSortFields">
            <list>
            ....
            </list>
        </property>
        <!--Any default filter queries, these filter queries will be used for all queries done by discovery for this configuration-->
        <!--<property name="defaultFilterQueries">-->
            <!--<list>-->
            ....
            <!--</list>-->
        <!--</property>-->
        <!--The configuration for the recent submissions-->
        <property name="recentSubmissionConfiguration">
            <bean class="org.dspace.discovery.configuration.DiscoveryRecentSubmissionsConfiguration">
             ...
            </bean>
        </property>
    </bean>

</beans>

Because this file is in XML format, you should be familiar with XML before editing this file. By default, this file contains the "defaultConfiguration" for discovery which contains the following settings:

  • Sidebar facets configured:
    • sidebarFacetAuthor:  contains the metadata fields dc.contributor.author & dc.creator with a facet limit of 10 and sorted by count
    • sidebarFacetSubject: contains all subject metadata fields (dc.subject.*) with a facet limit of 10 and sorted by count
    • sidebarFacetDateIssued: contains the dc.date.issued metadata field, has the type "date" and is sorted by value
  • The configured search filters:
    • searchFilterTitle: contains the dc.title metadata field and has a tokenized autocomplete
    • searchFilterAuthor: contains the dc.contributor.author & dc.creator metadata fields and has a non tokenized autocomplete configured
    • searchFilterSubject: contains the dc.subject.* metadata fields and has a non tokenized autocomplete configured
    • searchFilterIssued: contains the dc.date.issued metadata field with the type "date" and has a tokenized autocomplete
  • The configured sort fields:
    • sortTitle: contains the dc.title metadata field
    • sortDateIssued: contains the dc.date.issued metadata field; this sort has the type "date" configured.
  • The default filter queries are disabled by default, but the default configuration contains a commented-out example which makes discovery return only items (as opposed to also returning communities/collections).
  • The recent submissions configuration is sorted by dc.date.accessioned, which is a date, and a maximum of 5 recent submissions are displayed.

Many of the properties contain lists which use references to point to the configuration elements. This way a configuration element can be used in multiple discovery configurations, so there is no need to duplicate it.

Adding a new discovery configuration

Mapping a discovery configuration to the home page or a specified community/collection

Code Block
langxml
<bean id="org.dspace.discovery.configuration.DiscoveryConfigurationService" class="org.dspace.discovery.configuration.DiscoveryConfigurationService">
        <property name="map">
            <map>
                <!--The map containing all the settings,
                    the key is used to refer to the page (the "site" or a community/collection handle)
                    the value-ref is a reference to an identifier of the DiscoveryConfiguration format
                    -->
                <!--The default entry, DO NOT REMOVE the system requires this-->
               <entry key="default" value-ref="defaultConfiguration" />

               <!--Use site to override the default configuration for the home page & default discovery page-->
               <!--<entry key="site" value-ref="defaultConfiguration1" />-->
               <!--<entry key="123456789/7621" value-ref="defaultConfiguration2"/>-->
            </map>
        </property>
    </bean>

When adding a new discovery configuration, an additional entry is required in the map of the bean with id org.dspace.discovery.configuration.DiscoveryConfigurationService. This map can contain as many entries as there are communities or collections.

The map already contains one entry, the default one, which should not be removed because the system requires it. Each entry requires 2 attributes. The first one is the key, which can contain the following values:

  • default: the configuration used if no specific match is found
  • site: the discovery configuration for the home page and also for dspace.url/discover
  • a handle: the handle for a community or collection

The second attribute is the value-ref. Its value must refer to an existing configuration bean, which contains the configuration for the facets, filters, ....
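As an illustration, the sketch below maps the home page and one collection to their own configurations. The bean identifiers homepageConfiguration and collectionConfiguration are hypothetical names chosen for this example, and the handle 123456789/7621 is the placeholder taken from the commented default. Each value-ref must match the id of a DiscoveryConfiguration bean defined elsewhere in the file.

```xml
<bean id="org.dspace.discovery.configuration.DiscoveryConfigurationService" class="org.dspace.discovery.configuration.DiscoveryConfigurationService">
    <property name="map">
        <map>
            <!-- Required fallback, used when no other key matches -->
            <entry key="default" value-ref="defaultConfiguration" />
            <!-- Hypothetical bean id: configuration for the home page and dspace.url/discover -->
            <entry key="site" value-ref="homepageConfiguration" />
            <!-- Hypothetical bean id: configuration for the community/collection with this handle -->
            <entry key="123456789/7621" value-ref="collectionConfiguration" />
        </map>
    </property>
</bean>
```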

Creating a new discovery configuration bean

...

The structure of the discovery configuration bean

...

Code Block
langxml
<bean id="defaultConfiguration" class="org.dspace.discovery.configuration.DiscoveryConfiguration" scope="prototype">
        <!--Which sidebar facets are to be displayed-->
        <property name="sidebarFacets">
            ...
        </property>
        <!--The search filters which can be used on the discovery search page-->
        <property name="searchFilters">
            ...
        </property>
        <!--The sort filters for the discovery search-->
        <property name="searchSortFields">
            ...
        </property>
        <!--Any default filter queries, these filter queries will be used for all queries done by discovery for this configuration-->
        <!--<property name="defaultFilterQueries">-->
            ...
        <!--</property>-->
        <!--The configuration for the recent submissions-->
        <property name="recentSubmissionConfiguration">
            <bean class="org.dspace.discovery.configuration.DiscoveryRecentSubmissionsConfiguration">
                ...
            </bean>
        </property>
    </bean>

Creating a new discovery bean

Start by creating a new bean with an identifier specified in the mapping section from the previous point and ensure that it has the following attributes:

  • class: org.dspace.discovery.configuration.DiscoveryConfiguration
  • scope: prototype

Code Block
langxml
<bean id="{identifier}" class="org.dspace.discovery.configuration.DiscoveryConfiguration" scope="prototype">
</bean>

Configuring sidebar facets

Add a property element with the attribute name="sidebarFacets" and give it a list subelement. The discovery configuration requires this property.

Code Block
langxml
<bean id="{identifier}" class="org.dspace.discovery.configuration.DiscoveryConfiguration" scope="prototype">
	<property name="sidebarFacets">
		<list>
		</list>
	</property>
</bean>

In this list the user can add sidebar configuration beans. If the list is left empty, no sidebar facets will be displayed. Each subelement of the list is a ref with one attribute named "bean"; the value of this attribute is the identifier of the bean that contains the configuration of the sidebar facet.

Below is an example of how the list can be configured.

Code Block
langxml
<property name="sidebarFacets">
    <list>
        <ref bean="sidebarFacetAuthor" />
        <ref bean="sidebarFacetSubject" />
        <ref bean="sidebarFacetDateIssued" />
    </list>
</property>

Each of these refs points to another bean which must be configured in the file.

The structure of a sidebar facet bean looks like this:

Code Block
langxml
<bean id="{sidebar.facet.identifier}" class="org.dspace.discovery.configuration.SidebarFacetConfiguration">
    <property name="indexFieldName" value="{index.field.name}"/>
    <property name="metadataFields">
        <list>
            <value>{metadata.field}</value>
            <value>{metadata.field}</value>
        </list>
    </property>
    <property name="facetLimit" value="{facet.limit}"/>
    <property name="sortOrder" value="{COUNT or VALUE}"/>
    <property name="type" value="{text or date}"/>
</bean>

The id & class attributes are mandatory for this type of bean. The properties that it contains are discussed below.

  • indexFieldName (Required): A unique sidebar facet field name; the metadata will be indexed under this field name.
  • metadataFields (Required): A list containing the metadata fields which are to be shown in the sidebar facets.
  • facetLimit (optional): The maximum number of sidebar facets to be shown. If none is specified, 10 will be used. When a type of date is given, this property is not used, since dates are grouped by year.
  • sortOrder (optional): The sort order for the sidebar facets, either COUNT or VALUE. If none is given, COUNT is used as the default.
    • COUNT: Facets will be sorted by the number of times they appear in the repository
    • VALUE: Facets will be sorted alphanumerically
  • type (optional): The type of the sidebar facet, either date or text. If none is defined, text will be used.
    • text: The facets will be treated as is
    • date: Only the year is indexed; sidebar facets are grouped by year and a drill-down is used

Example of a sidebar facet configuration bean

Code Block
langxml
    <bean id="sidebarFacetAuthor" class="org.dspace.discovery.configuration.SidebarFacetConfiguration">
        <property name="indexFieldName" value="author"/>
        <property name="metadataFields">
            <list>
                <value>dc.contributor.author</value>
                <value>dc.creator</value>
            </list>
        </property>
        <property name="facetLimit" value="10"/>
        <property name="sortOrder" value="COUNT"/>
    </bean>
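
For comparison, a date facet could look like the sketch below. It follows the description of sidebarFacetDateIssued given earlier (dc.date.issued, type date, sorted by value). The indexFieldName value dateIssued is an assumption made for this example. facetLimit is left out because it is not used for date facets.

```xml
<bean id="sidebarFacetDateIssued" class="org.dspace.discovery.configuration.SidebarFacetConfiguration">
    <!-- Assumed index field name, chosen for this example; it must be unique -->
    <property name="indexFieldName" value="dateIssued"/>
    <property name="metadataFields">
        <list>
            <value>dc.date.issued</value>
        </list>
    </property>
    <!-- Sort by value instead of the default COUNT -->
    <property name="sortOrder" value="VALUE"/>
    <!-- With type date only the year is indexed and facets are grouped by year -->
    <property name="type" value="date"/>
</bean>
```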

Configuring search filters 

Search filters can be used on the discovery search page to further filter the discovery results. These filters have an autocomplete option.

Start off by adding an element named property with the attribute name="searchFilters", then create a list subelement. The searchFilters property is mandatory.

Code Block
langxml
<property name="searchFilters">
    <list>
        ...
    </list>
</property>

Like the sidebar facets, the list contains subelements named ref with the attribute bean, referencing (in this case) a search filter configuration. If this list is empty, no search filters will be displayed. Below is an example of the filters.

Code Block
langxml
<list>
    <ref bean="searchFilterTitle"/>
    <ref bean="searchFilterAuthor"/>
    <ref bean="searchFilterSubject"/>
    <ref bean="searchFilterIssued"/>
</list> 

Each of these refs points to another bean which must be configured in the file.

The structure of a search filter bean looks like this:

Code Block
langxml
<bean id="{bean.identifier}" class="org.dspace.discovery.configuration.DiscoverySearchFilter">
    <property name="indexFieldName" value="{index.field.name}"/>
    <property name="metadataFields">
        <list>
            <value>{metadata.field.1}</value>
            <value>{metadata.field.2}</value>
        </list>
    </property>
    <property name="fullAutoComplete" value="{true or false}"/>
    <property name="type" value="{text or date}"/>
</bean>

The id & class attributes are mandatory for this type of bean. The properties that it contains are discussed below.

  • indexFieldName (Required): A unique search filter field name, the metadata will be indexed under this field name
  • metadataFields (Required): A list containing the metadata fields which can be used in this filter
  • fullAutoComplete (optional): If set to true the values indexed for autocomplete will not be tokenized, if set to false tokenization will occur
  • type (optional): The type of the search filter, either date or text. If none is defined, text will be used.
    • text: The metadata will be treated as is
    • date: With a type of date the dates will receive the following format: yyyy-MM-dd (2011-07-01)

Example of a search filter configuration bean

Code Block
langxml
<bean id="searchFilterAuthor" class="org.dspace.discovery.configuration.DiscoverySearchFilter">
    <property name="indexFieldName" value="author"/>
    <property name="metadataFields">
        <list>
            <value>dc.contributor.author</value>
            <value>dc.creator</value>
        </list>
    </property>
    <property name="fullAutoComplete" value="true"/>
</bean>
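
A date filter can be sketched in the same way. The example below follows the earlier description of searchFilterIssued (dc.date.issued, type date, tokenized autocomplete). The indexFieldName value dateIssued is an assumption made for this example.

```xml
<bean id="searchFilterIssued" class="org.dspace.discovery.configuration.DiscoverySearchFilter">
    <!-- Assumed index field name, chosen for this example; it must be unique -->
    <property name="indexFieldName" value="dateIssued"/>
    <property name="metadataFields">
        <list>
            <value>dc.date.issued</value>
        </list>
    </property>
    <!-- false: the values indexed for autocomplete are tokenized -->
    <property name="fullAutoComplete" value="false"/>
    <!-- date: values are formatted as yyyy-MM-dd -->
    <property name="type" value="date"/>
</bean>
```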

Configuring sort options

Sort options are used on the discovery search page. By default there is always one sort option (relevance). The structure of the sort options looks like this:

Code Block
langxml
<property name="searchSortFields">
    <list>
        ...
    </list>
</property>

Like the other properties, the list contains subelements named ref with the attribute bean, referencing (in this case) a sort option configuration. If this list is empty, the only sort option available will be relevance. Below is an example of the sort options.

Code Block
langxml
<list>
    <ref bean="sortTitle"/>
    <ref bean="sortDateIssued"/>
</list>

Each of these refs points to another bean which must be configured in the file. The structure of a sort option bean looks like this:

Code Block
langxml
<bean id="{bean.identifier}" class="org.dspace.discovery.configuration.DiscoverySortConfiguration">
    <property name="metadataField" value="{metadata.field}"/>
    <property name="defaultSort" value="{true or false}"/>
    <property name="type" value="{text or date}"/>
</bean>

The id & class attributes are mandatory for this type of bean. The properties that it contains are discussed below.

  • metadataField (Required): The metadata field that provides the sort values
  • defaultSort (optional): A boolean indicating whether this sort option is the default one.
  • type (optional): The type of the sort option, either date or text. If none is defined, text will be used.

Example of a sort option configuration bean.

Code Block
langxml
<bean id="sortTitle" class="org.dspace.discovery.configuration.DiscoverySortConfiguration">
    <property name="metadataField" value="dc.title"/>
    <property name="defaultSort" value="true"/>
</bean>
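
A date-based sort option can be sketched along the same lines. The example below follows the earlier description of sortDateIssued (dc.date.issued with the type date). Setting defaultSort to false is an assumption made here so that sortTitle stays the default.

```xml
<bean id="sortDateIssued" class="org.dspace.discovery.configuration.DiscoverySortConfiguration">
    <property name="metadataField" value="dc.date.issued"/>
    <!-- Assumption for this example: another sort option is already the default -->
    <property name="defaultSort" value="false"/>
    <!-- The sort values are treated as dates rather than as text -->
    <property name="type" value="date"/>
</bean>
```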

Default filter queries

The default filter queries are applied to every query linked to the configuration block they are in. They are therefore used when retrieving the results, the sidebar filters, ...

The filter queries element is an entirely optional property.

The layout of this property is displayed below.

Code Block
langxml
<property name="defaultFilterQueries">
    <list>
        <value>query1</value>
        <value>query2</value>
    </list>
</property>

This property contains a simple list which in turn contains the queries. Some examples of queries:

  • search.resourcetype:2
  • dc.subject:test
  • dc.contributor.author: "Van de Velde, Kevin"
  • ...
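
As a sketch, the property below enables a single default filter query. It assumes that search.resourcetype:2, the first query listed above, is the query that restricts results to items, as the commented-out example in the default configuration is described as doing; this page does not state what the value 2 means, so verify it against your own index before relying on it.

```xml
<property name="defaultFilterQueries">
    <list>
        <!-- Assumption: resource type 2 selects items only -->
        <value>search.resourcetype:2</value>
    </list>
</property>
```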

Recent submissions configuration

The recent submissions configuration element contains all the configuration settings to display the list of recently submitted items on the home page or community/collection page. Because the recent submission configuration is in the discovery configuration block, it is possible to show 10 recently submitted items on the home page but 5 on the community/collection pages.

The layout of the recent submissions configuration is displayed below:

Code Block
langxml
<property name="recentSubmissionConfiguration">
    <bean class="org.dspace.discovery.configuration.DiscoveryRecentSubmissionsConfiguration">
        <property name="metadataSortField" value="{metadata.field}"/>
        <property name="type" value="{text or date}"/>
        <property name="max" value="{max}"/>
    </bean>
</property>

The property name & the bean class are mandatory. The properties are discussed below.

  • metadataSortField (mandatory): The metadata field to sort on to retrieve the recent submissions
  • max (mandatory): The maximum number of results to be displayed as recent submissions
  • type (optional): The type of the sort field, either date or text. If none is defined, text will be used.

Below is an example configuration of the recent submissions.

Code Block
langxml
<property name="recentSubmissionConfiguration">
    <bean class="org.dspace.discovery.configuration.DiscoveryRecentSubmissionsConfiguration">
        <property name="metadataSortField" value="dc.date.accessioned"/>
        <property name="type" value="date"/>
        <property name="max" value="5"/>
    </bean>
</property>
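
Putting the pieces together, the sketch below shows a complete configuration bean that displays 10 recent submissions, as suggested above for the home page. The bean id homepageConfiguration is a hypothetical name chosen for this example; the referenced facet, filter and sort beans are the ones from the default configuration and must exist in the file.

```xml
<!-- Hypothetical bean id; map it with <entry key="site" value-ref="homepageConfiguration" /> -->
<bean id="homepageConfiguration" class="org.dspace.discovery.configuration.DiscoveryConfiguration" scope="prototype">
    <property name="sidebarFacets">
        <list>
            <ref bean="sidebarFacetAuthor" />
            <ref bean="sidebarFacetSubject" />
            <ref bean="sidebarFacetDateIssued" />
        </list>
    </property>
    <property name="searchFilters">
        <list>
            <ref bean="searchFilterTitle" />
            <ref bean="searchFilterAuthor" />
        </list>
    </property>
    <property name="searchSortFields">
        <list>
            <ref bean="sortTitle" />
            <ref bean="sortDateIssued" />
        </list>
    </property>
    <property name="recentSubmissionConfiguration">
        <bean class="org.dspace.discovery.configuration.DiscoveryRecentSubmissionsConfiguration">
            <property name="metadataSortField" value="dc.date.accessioned"/>
            <property name="type" value="date"/>
            <!-- 10 instead of the default configuration's 5 -->
            <property name="max" value="10"/>
        </bean>
    </property>
</bean>
```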

Deploying the custom discovery configuration

The DSpace web application only reads your custom configuration when it starts up, so it is important to remember:

Code Block
You must always restart Tomcat (or whatever servlet container you are using) for changes made to the spring file to take effect.

Once Tomcat has restarted, you can check whether the changes you made to the spring file are valid by running the command below in a command line interface.

Code Block
./bin/dspace dsrun org.dspace.discovery.configuration.DiscoveryConfigurationService

This command prints the current configuration if it is valid. After verifying that the configuration is correct, a complete reindex of the discovery index is required.

To do this use the command line and navigate to the dspace directory and run the command below.

Code Block
./bin/dspace update-discovery-index -f

Other Resources