DSpace Discovery is a maintained add-on for the DSpace XMLUI that replaces the default Search and Browse behavior with one based on Apache Solr.
Proposal For Inclusion into DSpace 1.7.0
Recent work on porting DSpace to an asynchronous build process proved too large a task to complete in DSpace 1.7 alongside all the other significant changes. With this in mind, it is proposed that a version of Discovery be delivered within the DSpace 1.7.0 codebase and initially maintained there for the 1.7.x development path.
This proposal includes the following features:
- Discovery will be compiled/installed into DSpace 1.7.0 XMLUI but left disabled by default
- The appropriate Solr search core will be available and ready to be started
- All dependencies for turning on Discovery will be available without recompilation by doing the following steps:
- Solr will be upgraded to the latest 1.4.1 release
- Documentation will be included which outlines exactly how to turn on Discovery and run it
Instructions for enabling Discovery in DSpace 1.7.0
- Enable the Discovery Aspects in the XMLUI by changing the following settings in xmlui.xconf
<xmlui>
    <aspects>
        <aspect name="Artifact Browser" path="resource://aspects/ArtifactBrowser/" />
        <aspect name="Administration" path="resource://aspects/Administrative/" />
        <aspect name="E-Person" path="resource://aspects/EPerson/" />
        <aspect name="Submission and Workflow" path="resource://aspects/Submission/" />
        <aspect name="Statistics" path="resource://aspects/Statistics/" />

        <!--
            To enable Discovery, uncomment this Aspect that will enable it
            within your existing XMLUI
            <aspect name="Discovery" path="resource://aspects/Discovery/" />
        -->

        <!--
            This aspect tests the various possible DRI features,
            it helps a theme developer create themes
        -->
        <!-- <aspect name="XML Tests" path="resource://aspects/XMLTest/"/> -->
    </aspects>
- Enable the Discovery Indexing Consumer that will update Discovery Indexes on changes to content in XMLUI, JSPUI, SWORD, and LNI.
#### Event System Configuration ####

# default synchronous dispatcher (same behavior as traditional DSpace)
event.dispatcher.default.class = org.dspace.event.BasicDispatcher
event.dispatcher.default.consumers = search, browse, eperson, harvester

# comment out above and uncomment below to enable discovery indexing
# event.dispatcher.default.consumers = search, browse, discovery, eperson, harvester
- Reindex your DSpace instance into Discovery by running the command-line reindex with the Discovery IndexClient class
./dspace dsrun org.dspace.discovery.IndexClient
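The two configuration changes above can be sanity-checked before reindexing. The following Python sketch shows one way to do that, under stated assumptions: the helper names `discovery_aspect_enabled` and `active_consumers` are hypothetical, the sample strings are trimmed stand-ins for xmlui.xconf and the event configuration, and the "last uncommented line wins" rule is an assumption about how the properties file is read. It is not part of DSpace.

```python
import xml.etree.ElementTree as ET

# Trimmed, hypothetical xmlui.xconf content (assumption: real files hold more).
XCONF_ENABLED = """
<xmlui>
  <aspects>
    <aspect name="Statistics" path="resource://aspects/Statistics/" />
    <aspect name="Discovery" path="resource://aspects/Discovery/" />
  </aspects>
</xmlui>
"""

XCONF_DISABLED = """
<xmlui>
  <aspects>
    <aspect name="Statistics" path="resource://aspects/Statistics/" />
    <!-- <aspect name="Discovery" path="resource://aspects/Discovery/" /> -->
  </aspects>
</xmlui>
"""

# Trimmed, hypothetical event configuration with the discovery consumer enabled.
EVENT_CFG = """
event.dispatcher.default.class = org.dspace.event.BasicDispatcher
# event.dispatcher.default.consumers = search, browse, eperson, harvester
event.dispatcher.default.consumers = search, browse, discovery, eperson, harvester
"""


def discovery_aspect_enabled(xconf_text):
    """Return True if an uncommented <aspect name="Discovery"> is present.

    The default ElementTree parser drops XML comments, so an aspect that is
    still commented out is never seen here.
    """
    root = ET.fromstring(xconf_text)
    return any(a.get("name") == "Discovery" for a in root.iter("aspect"))


def active_consumers(cfg_text, key="event.dispatcher.default.consumers"):
    """Return the consumer names from the last uncommented `key = ...` line."""
    consumers = []
    for line in cfg_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            consumers = [c.strip() for c in value.split(",") if c.strip()]
    return consumers


if __name__ == "__main__":
    print("Discovery aspect enabled:", discovery_aspect_enabled(XCONF_ENABLED))
    print("Active consumers:", active_consumers(EVENT_CFG))
```

If the aspect check returns False or "discovery" is missing from the consumer list, the corresponding step above has not taken effect yet.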
Instructions for Configuring Discovery
Discovery can be configured at multiple levels of the application. This section outlines where changes can be made to alter its presentation. In the XMLUI, the user experience is configured primarily through the dspace-solr-search.cfg file.
Configuring Facets that are Exposed for Search Results
##### Search Indexing #####
solr.search.server = http://localhost:8080/solr/search

# Should no solr facet be configured for a certain page, this one will be used as default
#Every solr facet field which ends with :date will be handled as a date
#Handeling as date implies that {field.name}.year will be used for faceting
solr.facets.search.1=dc.contributor.author_lc,dc.subject_lc,dateissued:date
solr.facets.community=dc.contributor.author_lc,dc.subject_lc,dateissued:date
solr.facets.collection=dc.contributor.author_lc,dc.subject_lc,dateissued:date
# solr.facets.item=dc.contributor.author,dc.subject,dc.date.issued_dt
# solr.facets.site=dc.contributor.author,dc.subject,dc.date.issued_dt

# Makes sure that we have a gap from 6 years for our date fields (past 5 & the current)
solr.date.gap=5
solr.date.skip.empty = true

# Put any default search filters here, these filters will be applied to any search in discovery
# You can specify multiple filters by separating them using ;
#solr.default.filter=location:l2

# You can also specify (additional) filter(s)
## for homepage recent submissions
#solr.site.default.filter=
## for community recent submissions
#solr.community.default.filter=
## for collection recent submissions
#solr.collection.default.filter=
## for searches
#solr.search.default.filter=
## for browsing
#solr.browse.default.filter=

# The filters which can be selected in the search form
solr.search.filter.type.1=dc.title.split
solr.search.filter.type.2=dc.contributor.author.split
solr.search.filter.type.3=dc.subject.split
solr.search.filter.type.4=dateissued.year
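The comments in the configuration above state that a facet field ending in `:date` is handled as a date, and that `{field.name}.year` is then used for faceting. The Python sketch below illustrates that naming rule only; the function name `facet_fields` is hypothetical, and this is not Discovery's own parsing code.

```python
def facet_fields(value):
    """Split a solr.facets.* value and resolve ':date' entries to '<field>.year'."""
    suffix = ":date"
    fields = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate stray commas
        if entry.endswith(suffix):
            # Date facets are served from the derived "<field>.year" field.
            fields.append(entry[: -len(suffix)] + ".year")
        else:
            fields.append(entry)
    return fields


if __name__ == "__main__":
    configured = "dc.contributor.author_lc,dc.subject_lc,dateissued:date"
    print(facet_fields(configured))
```

Under this rule, the `dateissued:date` entry resolves to `dateissued.year`, the same field name used by `solr.search.filter.type.4`.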
Advanced Configuration in Solr
Solr itself now runs two cores: one for collecting DSpace Solr-based "statistics", the other for Discovery Solr-based "search".
solr
├── search
│   ├── conf
│   │   ├── admin-extra.html
│   │   ├── elevate.xml
│   │   ├── protwords.txt
│   │   ├── schema.xml
│   │   ├── scripts.conf
│   │   ├── solrconfig.xml
│   │   ├── spellings.txt
│   │   ├── stopwords.txt
│   │   ├── synonyms.txt
│   │   └── xslt
│   │       ├── DRI.xsl
│   │       ├── example.xsl
│   │       ├── example_atom.xsl
│   │       ├── example_rss.xsl
│   │       └── luke.xsl
│   └── conf2
├── solr.xml
└── statistics
    └── conf
        ├── admin-extra.html
        ├── elevate.xml
        ├── protwords.txt
        ├── schema.xml
        ├── scripts.conf
        ├── solrconfig.xml
        ├── spellings.txt
        ├── stopwords.txt
        ├── synonyms.txt
        └── xslt
            ├── example.xsl
            ├── example_atom.xsl
            ├── example_rss.xsl
            └── luke.xsl
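Because each core is addressed by its own URL, the search core can be queried directly when tuning schema.xml or solrconfig.xml. The Python sketch below builds a faceted query URL using standard Solr request parameters (`q`, `rows`, `wt`, `facet`, `facet.field`). The function name `facet_query_url` is hypothetical, and the `/select` path assumes the core exposes Solr's usual select handler; the base URL is the `solr.search.server` value shown earlier.

```python
from urllib.parse import urlencode


def facet_query_url(core_url, query, facet_fields, rows=10):
    """Build a Solr select URL that requests facet counts for the given fields."""
    params = [
        ("q", query),
        ("rows", str(rows)),
        ("wt", "json"),     # ask Solr for a JSON response
        ("facet", "true"),  # turn faceting on
    ]
    # facet.field may be repeated, once per field to facet on.
    params += [("facet.field", field) for field in facet_fields]
    return core_url.rstrip("/") + "/select?" + urlencode(params)


if __name__ == "__main__":
    url = facet_query_url(
        "http://localhost:8080/solr/search",
        "dspace",
        ["dc.contributor.author_lc", "dc.subject_lc"],
    )
    print(url)
```

Opening the printed URL in a browser, or fetching it with any HTTP client, is a simple way to see which facet values the index actually contains.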
Design Premise for Discovery
The design premise behind Discovery is to keep as much of the implementation of Search and Browse independent of DSpace as possible. The basis for this is twofold: (a) to reduce the cost of maintaining customized code, and (b) to repurpose third-party solutions wherever possible (i.e., standing on the shoulders of giants). The basic tenets are:
- Keep as much of the customization and configuration in Solr as possible.
- Keep it as generic as possible.
- Keep it as simple as possible.
- Where configuration lives outside Solr, provide pluggability so that functionality can easily be replaced in an end-user deployment.
- Align Search/Browse capabilities with Solr capabilities, not the other way around. This may mean abandoning certain strategies for navigating via Browse if they prove not to fit well with Solr.
RoadMap
Discovery is currently an add-on for DSpace that still requires a significant number of additional configuration files. Planned releases will initially coincide with scheduled DSpace releases. Eventually, once completely stabilized, Discovery may be included in DSpace releases as the out-of-the-box replacement for DSpace Search and Browse.
Issue Management
Subversion Access
http://scm.dspace.org/svn/repo/modules/dspace-discovery
Installation
Under development...
Documentation
- DSpace Discovery HowTo
- Under development...
Examples in Production
- Dryad Data Repository: http://www.datadryad.org/search?query=&rpp=10&group_by=none&sort_by=score&order=DESC&submit=Go
Other Resources
- http://lucene.apache.org/solr/
- AJAX Integration: http://github.com/evolvingweb/ajax-solr http://solrjs.solrstuff.org/
- Integration With DSpace REST project for unified Search and Browse in the REST webapplication as well.
- Access Control Request Handler for Solr: access control for the Solr service, documents, and fields.
- General specifications
- FieldType JAVADOC
- Predefined FieldTypes in SolR
- http://wiki.github.com/evolvingweb/ajax-solr/reuters-tutorial-step-9 Integrate AutoSuggest, Tag Clouds, Google Maps and dynamic Paging into DSpace Search Results.