Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.
Comment: remove references to xmlui and jspui

...

 This is a classic "tag cloud" facet in a DSpace repository.

Discovery Changelist

DSpace 6.0

The legacy search engine (based on Apache Lucene) and legacy Browse system (based on database tables) have been removed from DSpace 6.0 or above. Instead, DSpace now only uses Discovery for all Search/Browse capabilities.

In addition, to support the new Configuration options, all of the Discovery configurations in discovery.cfg have been prefixed with "discovery." (see configuration below).

DSpace 5.0

The new JSPUI-only tag cloud facet feature is disabled by default. In order to enable it, you will need to set up the corresponding processor that the PluginManager will load to actually perform the tag cloud query on the relevant pages. This is configured in the dspace.cfg configuration file using the following properties:

The tag cloud has been declared there for you but it is commented out.

DSpace 4.0 

Info
Starting from DSpace 4.0, Discovery is the default search and browse solution for DSpace.

General improvements:

  • Browse interfaces now also use Discovery index (rather than the legacy, now retired, Lucene index)
  • "Did you means" spell check aid for search

DSpace 3.0 

Info
Starting from DSpace 3.0, Discovery is also supported in JSPUI.

General improvements:

  • Hierarchical facets sidebar facets
  • Improved & more intuitive user interface
  • Access Rights Awareness (enabled by default). Access restricted or embargoed content is hidden from anonymous search/browse.
  • Authority control & variants awareness (homonyms are shown separately in a facet if they have different authority ID). All variant forms as recognized by the authority framework are indexed. See Authority Framework

XMLUI-only:

  • Hit highlighting and search snippets support
  • "More like this" (related items)

Bugfixes and other changes

  • Auto-complete functionality has been removed in XMLUI from search queries due to performance issues. JSPUI still supports auto-complete functionality without performance issues.

DSpace 1.8 

  • Configuration moved from dspace.cfg into config/modules/discovery.cfg and config/spring/api/discovery.xml
  • Individual communities and collections can have their own Discovery configuration.
  • Tokenization for Auto-complete values (see SearchFilter)
  • Alphanumeric sorting for Sidebarfacets
  • Possibility to avoid indexation of specific metadata fields.
  • Grouping of multiple metadata fields under the same SidebarFacet

DSpace 1.7 

  • Sidebar browse facets that can be configured to use contents from any metadata field
    • Dynamically generated timespans for dates
  • Customizable "recent submissions" view on the repository homepage, collection and community pages
  • Hit highlighting & search snippets

 

Configuration files

The configuration for discovery is located in 2 separate files.

  • General settings: The discovery.cfg file located in the [dspace-install-dir]/config/modules directory.
  • User Interface Configuration: The discovery.xml file is located in [dspace-install-dir]/config/spring/api/ directory.

General Discovery settings (config/modules/discovery.cfg)

The discovery.cfg file is located in the [dspace]/config/modules directory and contains following properties. Any of these properties may be overridden in your local.cfg (see Configuration Reference):

...

Property:

...

discovery.search.server

...

Example Value:

...

discovery.search.server=[http://localhost:8080/solr/search]

...

Informational Note:

...

Discovery relies on a Solr index for storage and retrieval of its information. This parameter determines the location of the Solr index.

If you are uncertain whether this property is set correctly, you can use a commandline tool like "wget" to perform a query against the Solr index (and ensure Solr responds). For example, the below query searches the Solr index for "test" and returns the response on standard out:

wget -O - http://localhost:8080/solr/search/select?q=test

...

discovery.index.authority.ignore[.field]

...

discovery.index.authority.ignore=true

discovery.index.authority.ignore.dc.contributor.author=false

...

By default, Discovery will use the authority information in the metadata to disambiguate homonyms. Setting this property to false will make the indexing process the same as the metadata  doesn't include authority information. The configuration can be different on a field (<schema>.<element>.<qualifier>) basis, the property without field set the default value.

...

discovery.index.authority.ignore-prefered[.field]

...

discovery.index.authority.ignore-prefered=true

discovery.index.authority.ignore-prefered.dc.contributor.author=false

...

By default, Discovery will use the authority information in the metadata to query the authority for the preferred label. Setting this property to false will make the indexing process the same as the metadata  doesn't include authority information (i.e. the preferred form is the one recorded in the metadata value). The configuration can be different on a field (<schema>.<element>.<qualifier>) basis, the property without field set the default value. If the authority is a remote service, disabling this feature can greatly improve performance.

...

discovery.index.authority.ignore-variants[.field]

...

discovery.index.authority.ignore-variants=true

discovery.index.authority.ignore-variants.dc.contributor.author=false

...

By default, Discovery will use the authority information in the metadata to query the authority for variants. Setting this property to false will make the indexing process the same, as the metadata  doesn't include authority information. The configuration can be different on a per-field (<schema>.<element>.<qualifier>) basis, the property without field set the default value. If authority is a remote service, disabling this feature can greatly improve performance.


Configuration files

The configuration for discovery is located in 2 separate files.

  • General settings: The discovery.cfg file located in the [dspace-install-dir]/config/modules directory.
  • User Interface Configuration: The discovery.xml file is located in [dspace-install-dir]/config/spring/api/ directory.

General Discovery settings (config/modules/discovery.cfg)

The discovery.cfg file is located in the [dspace]/config/modules directory and contains following properties. Any of these properties may be overridden in your local.cfg (see Configuration Reference):

Property:

discovery.search.server

Example Value:

discovery.search.server=[http://localhost:8080/solr/search]

Informational Note:

Discovery relies on a Solr index for storage and retrieval of its information. This parameter determines the location of the Solr index.

If you are uncertain whether this property is set correctly, you can use a commandline tool like "wget" to perform a query against the Solr index (and ensure Solr responds). For example, the below query searches the Solr index for "test" and returns the response on standard out:

wget -O - http://localhost:8080/solr/search/select?q=test

Property:

discovery.index.authority.ignore[.field]

Example Value:

discovery.index.authority.ignore=true

discovery.index.authority.ignore.dc.contributor.author=false

Informational Note:

By default, Discovery will use the authority information in the metadata to disambiguate homonyms. Setting this property to false will make the indexing process the same as the metadata  doesn't include authority information. The configuration can be different on a field (<schema>.<element>.<qualifier>) basis, the property without field set the default value.

Property:

discovery.index.authority.ignore-prefered[.field]

Example Value:

discovery.index.authority.ignore-prefered=true

discovery.index.authority.ignore-prefered.dc.contributor.author=false

Informational Note:

By default, Discovery will use the authority information in the metadata to query the authority for the preferred label. Setting this property to false will make the indexing process the same as the metadata  doesn't include authority information (i.e. the preferred form is the one recorded in the metadata value). The configuration can be different on a field (<schema>.<element>.<qualifier>) basis, the property without field set the default value. If the authority is a remote service, disabling this feature can greatly improve performance.

Property:

discovery.index.authority.ignore-variants[.field]

Example   Value:

discovery.index.authority.ignore-variants=true

discovery.index.authority.ignore-variants.dc.contributor.author=false

Informational Note:

By default, Discovery will use the authority information in the metadata to query the authority for variants. Setting this property to false will make the indexing process the same, as the metadata  doesn't include authority information. The configuration can be different on a per-field (<schema>.<element>.<qualifier>) basis, the property without field set the default value. If authority is a remote service, disabling this feature can greatly improve performance.

Modifying the Discovery User Interface (config/spring/api/discovery.xml)

The discovery.xml file is located in the [dspace]/config/spring/api directory.

Structure Summary

This file is in XML format, you should be familiar with XML before editing this file. The configurations are organized together in beans, depending on the purpose these properties are used for.
This purpose can be derived from the class of the beans. Here's a short summary of classes you will encounter throughout the file and what the corresponding properties in the bean are used for.

Download the configuration file and review it together with the following parameters

Class:

DiscoveryConfigurationService

Purpose:

Defines the mapping between separate Discovery configurations and individual collections/communities

Default:

All communities, collections and the homepage (key=default) are mapped to defaultConfiguration, also controls the metadata fields that should not be indexed in the search core (item provenance for example).

Class:

DiscoveryConfiguration

Purpose:

Groups configurations for sidebar facets, search filters, search sort options and recent submissions

Default:

There is one configuration by default called defaultConfiguration

Class:

DiscoverySearchFilter

Purpose:

Defines that specific metadata fields should be enabled as a search filter

Default:

dc.title, dc.contributor.author, dc.creator, dc.subject.* and dc.date.issued are defined as search filters

Class:

DiscoverySearchFilterFacet

Purpose:

Defines which metadata fields should be offered as a contextual sidebar browse options, each of these facets has also got to be a search filter

Default:

dc.contributor.author, dc.creator, dc.subject.* and dc.date.issued

Class:

HierarchicalSidebarFacetConfiguration

Purpose:

Defines which metadata fields contain hierarchical data and should be offered as a contextual sidebar option

Class:

DiscoverySortConfiguration

Purpose:

Further specifies the sort options to which a DiscoveryConfiguration refers

Default:

dc.title and dc.date.issued are defined as alternatives for sorting, other than Relevance (hard-coded)

Class:

DiscoveryHitHighlightingConfiguration

Purpose:

Defines which metadata fields can contain hit highlighting & search snippets

Default:

dc.title, dc.contributor.author, dc.subject, dc.description.abstract & full text from text files.

Class:

TagCloudFacetConfiguration

Purpose:Defines the tag cloud appearance configuration bean and the search filter facets to appear in the tag cloud form. You can have different "TagCloudFacetConfiguration" per community or collection or the home page

Default settings

In addition to the summarized descriptions of the default values, following details help you to better understand these defaults. If you haven't already done so, download the configuration file and review it together with the following parameters.
The file contains one default configuration that defines following sidebar facets, search filters, sort fields and recent submissions display:

  • Sidebar facets
    • searchFilterAuthor: groups the metadata fields dc.contributor.author & dc.creator with a facet limit of 10, sorted by occurrence count
    • searchFilterSubject: groups all subject metadata fields (dc.subject.*) with a facet limit of 10, sorted by occurrence count
    • searchFilterIssued: contains the dc.date.issued metadata field, which is identified with the type "date" and sorted by specific date values
  • Search filters
    • searchFilterTitle: contains the dc.title metadata field
    • searchFilterAuthor: contains the dc.contributor.author & dc.creator metadata fields
    • searchFilterSubject: contains the dc.subject.* metadata fields
    • searchFilterIssued: contains the dc.date.issued metadata field with the type "date"
  • Sort fields
    • sortTitle: contains the dc.title metadata field
    • sortDateIssued: contains the dc.date.issued metadata field, this sort has the type date configured.
  • defaultFilterQueries
    • The default configuration contains no defaultFilterQueries
    • The default filter queries are disabled by default but there is an example in the default configuration in comments which allows discovery to only return items (as opposed to also communities/collections).
  • Recent Submissions
    • The recent submissions are sorted by dc.date. accessioned which is a date and a maximum number of 5 recent submissions are displayed.
  • Hit highlighting
    • The fields dc.title, dc.contributor.author & dc.subject can contain hit highlighting.
    • The dc.description.abstract & full text field are used to render search snippets.
  • Non indexed metadata fields
    • Community/Collections: dc.rights (copyright text)
    • Items: dc.description.provenance

Many of the properties contain lists that use references to point to the configuration elements. This way a certain configuration type can be used in multiple discovery configurations so there is no need to duplicate them.

Non indexed metadata fields

The discovery.xml file has configuration to not index certain metadata fields for communities/collections/items. The configuration is handled in the "toIgnoreMetadataFields" property located in the "org.dspace.discovery.configuration.DiscoveryConfigurationService" bean. Below is an example configuration that excludes dc.description.provenance for items & dc.rights for communities/collections:

Code Block
langxml
<property name="toIgnoreMetadataFields">
    <map>
        <entry>
            <key><util:constant static-field="org.dspace.core.Constants.COMMUNITY"/></key>
            <list>
                <!--Introduction text-->
                <!--<value>dc.description</value>-->
                <!--Short description-->
                <!--<value>dc.description.abstract</value>-->
                <!--News-->
                <!--<value>dc.description.tableofcontents</value>-->
                <!--Copyright text-->
                <value>dc.rights</value>
                <!--Community name-->
                <!--<value>dc.title</value>-->
            </list>
        </entry>

Modifying the Discovery User Interface (config/spring/api/discovery.xml)

The discovery.xml file is located in the [dspace]/config/spring/api directory.

Structure Summary

This file is in XML format, you should be familiar with XML before editing this file. The configurations are organized together in beans, depending on the purpose these properties are used for.
This purpose can be derived from the class of the beans. Here's a short summary of classes you will encounter throughout the file and what the corresponding properties in the bean are used for.

Download the configuration file and review it together with the following parameters

Class:

DiscoveryConfigurationService

Purpose:

Defines the mapping between separate Discovery configurations and individual collections/communities

Default:

All communities, collections and the homepage (key=default) are mapped to defaultConfiguration, also controls the metadata fields that should not be indexed in the search core (item provenance for example).

Class:

DiscoveryConfiguration

Purpose:

Groups configurations for sidebar facets, search filters, search sort options and recent submissions

Default:

There is one configuration by default called defaultConfiguration

Class:

DiscoverySearchFilter

Purpose:

Defines that specific metadata fields should be enabled as a search filter

Default:

dc.title, dc.contributor.author, dc.creator, dc.subject.* and dc.date.issued are defined as search filters

Class:

DiscoverySearchFilterFacet

Purpose:

Defines which metadata fields should be offered as a contextual sidebar browse options, each of these facets has also got to be a search filter

Default:

dc.contributor.author, dc.creator, dc.subject.* and dc.date.issued

Class:

HierarchicalSidebarFacetConfiguration

Purpose:

Defines which metadata fields contain hierarchical data and should be offered as a contextual sidebar option

Class:

DiscoverySortConfiguration

Purpose:

Further specifies the sort options to which a DiscoveryConfiguration refers

Default:

dc.title and dc.date.issued are defined as alternatives for sorting, other than Relevance (hard-coded)

Class:

DiscoveryHitHighlightingConfiguration

Purpose:

Defines which metadata fields can contain hit highlighting & search snippets

Default:

dc.title, dc.contributor.author, dc.subject, dc.description.abstract & full text from text files.

Class:

TagCloudFacetConfiguration

Purpose:Defines the tag cloud appearance configuration bean and the search filter facets to appear in the tag cloud form. You can have different "TagCloudFacetConfiguration" per community or collection or the home page

Default settings

In addition to the summarized descriptions of the default values, following details help you to better understand these defaults. If you haven't already done so, download the configuration file and review it together with the following parameters.
The file contains one default configuration that defines following sidebar facets, search filters, sort fields and recent submissions display:

  • Sidebar facets
    • searchFilterAuthor: groups the metadata fields dc.contributor.author & dc.creator with a facet limit of 10, sorted by occurrence count
    • searchFilterSubject: groups all subject metadata fields (dc.subject.*) with a facet limit of 10, sorted by occurrence count
    • searchFilterIssued: contains the dc.date.issued metadata field, which is identified with the type "date" and sorted by specific date values
  • Search filters
    • searchFilterTitle: contains the dc.title metadata field
    • searchFilterAuthor: contains the dc.contributor.author & dc.creator metadata fields
    • searchFilterSubject: contains the dc.subject.* metadata fields
    • searchFilterIssued: contains the dc.date.issued metadata field with the type "date"
  • Sort fields
    • sortTitle: contains the dc.title metadata field
    • sortDateIssued: contains the dc.date.issued metadata field, this sort has the type date configured.
  • defaultFilterQueries
    • The default configuration contains no defaultFilterQueries
    • The default filter queries are disabled by default but there is an example in the default configuration in comments which allows discovery to only return items (as opposed to also communities/collections).
  • Recent Submissions
    • The recent submissions are sorted by dc.date. accessioned which is a date and a maximum number of 5 recent submissions are displayed.
  • Hit highlighting
    • The fields dc.title, dc.contributor.author & dc.subject can contain hit highlighting.
    • The dc.description.abstract & full text field are used to render search snippets.
  • Non indexed metadata fields
    • Community/Collections: dc.rights (copyright text)
    • Items: dc.description.provenance

Many of the properties contain lists that use references to point to the configuration elements. This way a certain configuration type can be used in multiple discovery configurations so there is no need to duplicate them.

Non indexed metadata fields

The discovery.xml file has configuration to not index certain metadata fields for communities/collections/items. The configuration is handled in the "toIgnoreMetadataFields" property located in the "org.dspace.discovery.configuration.DiscoveryConfigurationService" bean. Below is an example configuration that excludes dc.description.provenance for items & dc.rights for communities/collections:

Code Block
langxml
<property name="toIgnoreMetadataFields">
    <map>
        <entry>
            <key><util:constant static-field="org.dspace.core.Constants.COMMUNITYCOLLECTION"/></key>
            <list>
                <!--Introduction text-->
                <!--<value>dc.description</value>-->
                <!--Short description-->
                <!--<value>dc.description.abstract</value>-->
                <!--News-->
                <!--<value>dc.description.tableofcontents</value>-->
                <!--Copyright text-->
                <value>dc.rights</value>
                <!--CommunityCollection name-->
                <!--<value>dc.title</value>-->
            </list>
        </entry>
        <entry>
            <key><util:constant static-field="org.dspace.core.Constants.COLLECTIONITEM"/></key>
            <list>
                <!--Introduction text-->
<value>dc.description.provenance</value>
            </list>
     <!--<value>dc.description</value>-->
                <!--Short description-->
  </entry>
              <!--<value>dc.description.abstract</value>-->
                <!--News-->
                <!--<value>dc.description.tableofcontents</value>-->
                <!--Copyright text--/map>
</property>

By adding additional values to the appropriate lists additional metadata can be excluded from the search core, a reindex is required after altering this file to ensure that the values are removed from the index.

Search filters & sidebar facets Customization

This section explains the properties for search filters & sidebar facets. Each sidebar facet must occur in the reference list of the search filters. Below is an example configuration of a search filter that is not used as a sidebar facet.

Code Block
langxml
<bean id="searchFilterTitle" class="org.dspace.discovery.configuration.DiscoverySearchFilter">
    <property            <value>dc.rights</value>name="indexFieldName" value="title"/>
    <property name="metadataFields">
           <!--Collection name--><list>
                <!--<value>dc.title</value>-->
            </list>
        </entry>
        <entry>
     property>
</bean>

The id & class attributes are mandatory for this type of bean. The properties that it contains are discussed below.

  • indexFieldName (Required): A unique search filter name, the metadata will be indexed in Solr under this field name.
  • metadataFields (Required): A list of the metadata fields that need to be included in the facet.

Sidebar facets extend the search filter and add some extra properties to it, below is an example of a search filter that is also used as a sidebar facet.

Code Block
langxml
<bean id="searchFilterAuthor" class="org.dspace.discovery.configuration.SidebarFacetConfiguration">
        <property name="indexFieldName" value="author"/>       <key><util:constant static-field="org.dspace.core.Constants.ITEM"/></key>
        <property name="metadataFields">
            <list>
                <value>dc.descriptioncontributor.provenance<author</value>
            </list>    <value>dc.creator</value>
        </entry>
    </map>
</property>

By adding additional values to the appropriate lists additional metadata can be excluded from the search core, a reindex is required after altering this file to ensure that the values are removed from the index.

Search filters & sidebar facets Customization

This section explains the properties for search filters & sidebar facets. Each sidebar facet must occur in the reference list of the search filters. Below is an example configuration of a search filter that is not used as a sidebar facet.

Code Block
langxml
<bean id="searchFilterTitle" class="org.dspace.discovery.configuration.DiscoverySearchFilter">
    <property name="indexFieldName" value="title"/>
    <property name="metadataFields">
        <list>
            <value>dc.title</value>
        </list>
    </property>
</bean>

The id & class attributes are mandatory for this type of bean. The properties that it contains are discussed below.

  • indexFieldName (Required): A unique search filter name, the metadata will be indexed in Solr under this field name.
  • metadataFields (Required): A list of the metadata fields that need to be included in the facet.
list>
        </property>
        <property name="facetLimit" value="10"/>
        <property name="sortOrder" value="COUNT"/>
        <property name="type" value="text"/>
    </bean>

Note that the class has changed from DiscoverySearchFilter to SidebarFacetConfiguration this is needed to support the extra properties.

  • facetLimit (optional): The maximum number of values to be shown. This property is optional, if none is specified the default value "10" will be used. If the filter has the type date, this property will not be used since dates are automatically grouped together.
  • sortOrder (optional):The sort order for the sidebar facets, it can either be COUNT or VALUE. The default value is COUNT.
    • COUNT Facets will be sorted by the amount of times they appear in the repository
    • VALUE Facets will be sorted alphabetically
  • type(optional): the type of the sidebar facet it can either be "date" or "text", "text" is the default value.
    • text: The facets will be treated as is
    • date: Only the year will be stored in the Solr index. These years are automatically displayed in ranges that get smaller when you select one.

Hierarchical (taxonomies based) sidebar facets

Discovery supports specialized drill down in hierarchically structured metadata fields. For this drill down to work, the metadata in the field for which you enable this must be composed out of terms, divided by a splitter. For example, you could have a dc.subject.taxonomy field in which you keep metadata like "CARTOGRAPHY::PHOTOGRAMMETRY", in which Cartography and Photogrammetry are both terms, divided by the splitter "::". The sidebar will only display the top level facets, when clicking on view more all the facet options will be displayedSidebar facets extend the search filter and add some extra properties to it, below is an example of a search filter that is also used as a sidebar facet.

Code Block
langxml
<bean id="searchFilterAuthorsearchFilterSubject" class="org.dspace.discovery.configuration.SidebarFacetConfigurationHierarchicalSidebarFacetConfiguration">
        <property name="indexFieldName" value="authorsubject"/>
        <property name="metadataFields">
            <list>
                <value>dc.contributor.author<subject</value>
                <value>dc.creator</value></list>
            </list>property>
        </property>
        <property name="facetLimitsortOrder" value="10COUNT"/>
        <property name="sortOrdersplitter" value="COUNT::"/>
        <property name="typeskipFirstNodeLevel" value="textfalse"/>
    </bean>

Note that the class has changed from DiscoverySearchFilter SidebarFacetConfiguration to SidebarFacetConfiguration HierarchicalSidebarFacetConfiguration this is needed to support the extra properties.

  • splitter (required): The splitter used to split up the separate nodes
  • skipFirstNodeLevel facetLimit (optional): The maximum number of values to be shown Whether or not to show the root node level. For some hierarchical data there is a single root node. In most cases it doesn't need to be shown since it isn't relevant. This property is optional, if none is specified the default value "10" will be used. If the filter has the type date, this property will not be used since dates are automatically grouped together.
  • sortOrder (optional):The sort order for the sidebar facets, it can either be COUNT or VALUE. The default value is COUNT.
    • COUNT Facets will be sorted by the amount of times they appear in the repository
    • VALUE Facets will be sorted alphabetically
  • type(optional): the type of the sidebar facet it can either be "date" or "text", "text" is the default value.
    • text: The facets will be treated as is
    • date: Only the year will be stored in the Solr index. These years are automatically displayed in ranges that get smaller when you select one.

Hierarchical (taxonomies based) sidebar facets

Discovery supports specialized drill down in hierarchically structured metadata fields. For this drill down to work, the metadata in the field for which you enable this must be composed out of terms, divided by a splitter. For example, you could have a dc.subject.taxonomy field in which you keep metadata like "CARTOGRAPHY::PHOTOGRAMMETRY", in which Cartography and Photogrammetry are both terms, divided by the splitter "::". The sidebar will only display the top level facets, when clicking on view more all the facet options will be displayed.

Code Block
langxml
<bean id="searchFilterSubject" class="org.dspace.discovery.configuration.HierarchicalSidebarFacetConfiguration">
    <property name="indexFieldName" value="subject"/>
    <property name="metadataFields">
        <list>
            <value>dc.subject</value>
        </list>
    </property>
    <property name="sortOrder" value="COUNT"/>
    <property name="splitter" value="::"/>
    <property name="skipFirstNodeLevel" value="false"/>
</bean>

Note that the class has changed from SidebarFacetConfiguration to HierarchicalSidebarFacetConfiguration this is needed to support the extra properties.

  • splitter (required): The splitter used to split up the separate nodes
  • skipFirstNodeLevel (optional): Whether or not to show the root node level. For some hierarchical data there is a single root node. In most cases it doesn't need to be shown since it isn't relevant. This property is true by default.

Sort option customization for search results

This section explains the properties of an individual SortConfiguration, like sortTitle and sortDateIssued from the default configuration. In order to create custom sort options, you can either modify specific properties of those that already exist or create a totally new one from scratch.

Here's what the sortTitle SortConfiguration looks like:

Code Block
langxml
<bean id="sortTitle" class="org.dspace.discovery.configuration.DiscoverySortFieldConfiguration">
        <property name="metadataField" value="dc.title"/>
        <property name="type" value="text"/>
 </bean>

The id & class attributes are mandatory for this type of bean. The properties that it contains are discussed below.

  • metadataField (Required): The metadata field indicating the sort values
  • type (optional): the type of the sort option can either be date or text, if none is defined text will be used.

DiscoveryConfiguration

The DiscoveryConfiguration Groups configurations for sidebar facets, search filters, search sort options and recent submissions. If you want to show the same sidebar facets, use the same search filters, search options and recent submissions everywhere in your repository, you will only need one DiscoveryConfiguration and you might as well just edit the defaultConfiguration.

  • true by default.

Sort option customization for search results

This section explains the properties of an individual SortConfiguration, like sortTitle and sortDateIssued from the default configuration. In order to create custom sort options, you can either modify specific properties of those that already exist or create a totally new one from scratch.

Here's what the sortTitle SortConfiguration looks like:

Code Block
langxml
<bean id="sortTitle" class="org.dspace.discovery.configuration.DiscoverySortFieldConfiguration">
        <property name="metadataField" value="dc.title"/>
        <property name="type" value="text"/>
 </bean>

The id & class attributes are mandatory for this type of bean. The properties that it contains are discussed below.

  • metadataField (Required): The metadata field indicating the sort values
  • type (optional): the type of the sort option can either be date or text, if none is defined text will be used.

DiscoveryConfiguration

The DiscoveryConfiguration Groups configurations for sidebar facets, search filters, search sort options and recent submissions. If you want to show the same sidebar facets, use the same search filters, search options and recent submissions everywhere in your repository, you will only need one DiscoveryConfiguration and you might as well just edit the defaultConfiguration.

The DiscoveryConfiguration makes it very easy to use custom sidebar facets, search filters, ... on specific communities or collection homepage. This is particularly useful if your collections are heterogeneous. For example, in a collection with conference papers, you might want to offer a sidebar The DiscoveryConfiguration makes it very easy to use custom sidebar facets, search filters, ... on specific communities or collection homepage. This is particularly useful if your collections are heterogeneous. For example, in a collection with conference papers, you might want to offer a sidebar facet for conference date, which might be more relevant than the actual issued date of the proceedings. In a collection with papers, you might want to offer a facet for funding bodies or publisher, while these fields are irrelevant for items like learning objects.

...

When searching in discovery all the groups the user belongs to will be added as a filter query as well as the users identifier. If the user is an admin all items will be returned since an admin has read rights on everything.

Customizing the Recent Submissions display

...

identifier. If the user is an admin all items will be returned since an admin has read rights on everything.

Customizing the Recent Submissions display

The recent submissions configuration element contains all the configuration settings to display the list of recently submitted items on the home page or community/collection page. Because the recent submission configuration is in the discovery configuration block, it is possible to show 10 recently submitted items on the home page but 5 on the community/collection pages.

...

The hit highlighting configuration element contains all settings necessary to display search snippets & enable hit highlighting.

...

highlighting

...

.

Info
titleDisabling hit highlighting / search snippets

You can disable hit highlighting / search snippets by commenting out the entire <property name="hitHighlightingConfiguration"> Configuration in the [dspace]/config/spring/api/discovery.xml configuration file.

PLEASE BE AWARE there are two sections where this <property> definition exists. You should comment out both. One is under the <bean id="defaultConfiguration"> and one is under the <bean id="homepageConfiguration">

Alternatively, you may also choose to tweak which fields are shown in hit highlighting, or modify the number of matching words shown (snippets) and/or number of characters shown around the matching word (maxSize).

For this change to take effect in the User Interface, you will need to restart Tomcat.

...

"More like this" configuration

...

...

The

...

"

...

The "more like this"-configuration element contains all the settings for displaying related items on an item display page.
Below is an example of the "more like this" configuration.

...

The org.dspace.discovery.SearchService object has received a getRelatedItems() method. This method requires an item & the more-like-this configuration bean from above. This method is implemented in the org.dspace.discovery.SolrServiceImpl which uses the item as a query & uses the default Solr parameters for more-like-this to pass the bean configuration to solr (https://cwiki.apache.org/confluence/display/solr/MoreLikeThis). The result will be a list of items or if none found an empty list. The rendering of this list is handled in the org.dspace.app.xmlui.aspect.discovery.RelatedItems class.items or if none found an empty list. 

"Did you mean" spellcheck aid for search configuration

...

This paragraph only applies to JSPUI

 


Code Block
languagexml
<!-- Set TagCloud configuration per discovery configuration -->
<property name="tagCloudFacetConfiguration" ref="defaultTagCloudFacetConfiguration"/>

...

Declare the bean (of class: TagCloudFacetConfiguration) that holds the configuration for the tag cloud facet. 


Code Block
languagexml
<!--TagCloud configuration bean for homepage discovery configuration-->
<bean id="homepageTagCloudFacetConfiguration" class="org.dspace.discovery.configuration.TagCloudFacetConfiguration">
        <!-- Actual configuration of the tagcloud (colors, sorting, etc.) -->
        <property name="tagCloudConfiguration" ref="tagCloudConfiguration"/>
        <!-- List of tagclouds to appear, one for every search filter, one after the other -->
        <property name="tagCloudFacets">
            <list>
                <ref bean="searchFilterSubject" />
            </list>
        </property>
</bean>

...

When tagCloud is rendered there are some CSS classes that you can change in order to change the appearance of the tag cloud.

Class
Note
tagcloudGeneral class for the whole tagcloud
tagcloud_1Specific tag class for tag of type 1 (based on score)
tagcloud_2Specific tag class for tag of type 2 (based on score)
tagcloud_3Specific tag class for tag of type 3 (based on score)

Disabling the "Has file(s)" facet

...

Should you want to turn this off, you can edit [dspace]/config/spring/api/discovery.xml to remove the following line from the defaultConfiguration and homepageConfiguration beans (in the sidebarFacets property):

 

Code Block
languagexml
<ref bean="searchFilterContentInOriginalBundle"/>

Then restart your servlet container.

Discovery Solr Index Maintenance

...

Command used:

...

[

...

dspace]/config/spring/api/discovery.xml to remove the following line from the defaultConfiguration and homepageConfiguration beans (in the sidebarFacets property):


Code Block
languagexml
<ref bean="searchFilterContentInOriginalBundle"/>

Then restart your servlet container.

Discovery Solr Index Maintenance

Command used:

[dspace]/bin/dspace index-discovery [-cbhf[r <item handle>]]

Java class:

org.dspace.discovery.IndexClient

Arguments (short and long forms):

Description


called without any options, will update/clean an existing index

-b

(re)build index, wiping out current one if it exists

-c

clean existing index removing any documents that no longer exist in the db

-f

if updating existing index, force each handle to be reindexed even if uptodate

-h

print this help message

-i <object handle>Reindex an individual object (and any child objects).  When run on an Item, it just reindexes that single Item. When run on a Collection, it reindexes the Collection itself and all Items in that Collection. When run on a Community, it reindexes the Community itself and all sub-Communities, contained Collections and contained Items.

-o

optimize search core

-r <item handle>

remove an Item, Collection or Community from index based on its handle

-sRebuild the spellchecker, can be combined with -b and -f.

It is recommended to run maintenance on the Discovery Solr index occasionally (from crontab or your system's scheduler), to prevent your servlet container from running out of memory:

[dspace]/bin/dspace index-discovery -o

(Since Solr 4, the underlying optimize operation has been discouraged as mostly unnecessary and renamed. See https://issues.apache.org/jira/browse/SOLR-3141).

Advanced Solr Configuration

Discovery is built as an application layer on top of the Solr open source enterprise search server. Therefore, Solr configuration can be applied to the Solr cores that are shipped with DSpace.
The DSpace Solr instance currently runs several cores (which means indexes in Solr parlance). The "statistics" core is for collection of DSpace usage events for statistical purposes (if you have been collecting statistics for multiple years, you may have chosen to use sharding and you will see one core per each year collected). The "search" core is used by Discovery for for search and  faceting, for displaying the collection/community hierarchy and item counts. The "authority" core is used by SolrAuthority to store information about authors, including their data imported from the ORCID registry.

Code Block
solr
├── solr.xml
├── search
│   └── conf
│

...

Java class:

...

org.dspace.discovery.IndexClient

...

Arguments (short and long forms):

...

Description

...

 

...

called without any options, will update/clean an existing index

...

-b

...

(re)build index, wiping out current one if it exists

...

-c

...

clean existing index removing any documents that no longer exist in the db

...

-f

...

if updating existing index, force each handle to be reindexed even if uptodate

...

-h

...

print this help message

...

-o

...

optimize search core

...

-r <item handle>

...

remove an Item, Collection or Community from index based on its handle

...

It is recommended to run maintenance on the Discovery Solr index occasionally (from crontab or your system's scheduler), to prevent your servlet container from running out of memory:

[dspace]/bin/dspace index-discovery -o

(Since Solr 4, the underlying optimize operation has been discouraged as mostly unnecessary and renamed. See https://issues.apache.org/jira/browse/SOLR-3141).

Advanced Solr Configuration

Discovery is built as an application layer on top of the Solr open source enterprise search server. Therefore, Solr configuration can be applied to the Solr cores that are shipped with DSpace.
The DSpace Solr instance currently runs several cores (which means indexes in Solr parlance). The "statistics" core is for collection of DSpace usage events for statistical purposes (if you have been collecting statistics for multiple years, you may have chosen to use sharding and you will see one core per each year collected). The "search" core is used by Discovery for for search and  faceting, for displaying the collection/community hierarchy and item counts. The "authority" core is used by SolrAuthority to store information about authors, including their data imported from the ORCID registry.

Code Block
solr
├── solr.xml
├── search
│   └── conf
│       ├── admin-extra.html
│       ├── elevate.xml
│       ├── protwords.txt
│       ├── schema.xml
│       ├── scripts.conf
│       ├── solrconfig.xml
│       ├── spellings.txt
│       ├── stopwords.txt
│       ├── synonyms.txt
│       └── xslt
│           ├── DRI.xsl
│           ├── example.xsl
│           ├── example_atom.xsl
│           ├── example_rss.xsl
│           └── luke.xsl
├── ...
└── statistics
    └── conf
        ├── admin-extra.html
        ├── elevate.xml
        ├── protwords.txt
        ├── schema.xml
        ├── scripts.conf
        ├── solrconfig.xml
        ├── spellings.txt
        ├── stopwords.txt
        ├── synonyms.txt
        └── xslt
          ├── DRI.xsl
│           ├── example.xsl
            ├── example_atom.xsl
            ├── example_rss.xsl
            └── luke.xsl

Internationalization

Discovery has its own messages.xml file, located at dspace-xmlui/src/main/resources/aspects/Discovery/i18n/messages.xml.  To add your own labels for new fields and facets in a Maven overlay, copy this file to dspace/modules/xmlui/src/main/resources/aspects/Discovery/i18n/messages.xml and modify this file. Alternatively, you may add them to the main messages.xml file. Same goes for translations - it's encouraged to submit a single messages_XX.xml file including messages from all the separate messages.xml files in DSpace.

Advanced search related keys (change "author" to desired field)

...

 └── luke.xsl
├── ...
└── statistics
    └── conf
        ├── admin-extra.html
        ├── elevate.xml
        ├── protwords.txt
        ├── schema.xml
        ├── scripts.conf
        ├── solrconfig.xml
        ├── spellings.txt
        ├── stopwords.txt
        ├── synonyms.txt
        └── xslt
            ├── example.xsl
            ├── example_atom.xsl
            ├── example_rss.xsl
            └── luke.xsl