VIVO and Solr are two distinct web applications that act as one. Solr gives users the ability to search the VIVO data. VIVO also uses Solr for some of its internal data retrieval.

What is Solr?

Solr is an open-source, enterprise-level search platform, available from Apache. It is based on the popular Lucene search engine. VIVO uses a standard instance of Solr, without modification. You can learn more about Solr at the Apache Solr home page.

...

In a typical VIVO installation, Solr is hidden behind VIVO, and the users cannot access it directly. In general, they don't know that Solr exists as an application.

How does VIVO use Solr?

VIVO uses the Solr search engine in two ways:

as a service to the end user,
as a tool within the structure of the application.

Solr for the end user.

Like many web sites, VIVO includes a search box on every page. The person using VIVO can type a search term, and see the results. This search is conducted by Solr, and the results are formatted and displayed by VIVO.
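As a rough sketch of the kind of request that sits behind the search box, the snippet below builds a Solr select URL for a search term. The base URL, the build_search_url helper, and the choice of returned fields are assumptions made for illustration; VIVO issues its queries from its own Java code. The parameters q, rows, fl, and wt are standard Solr query parameters.

```python
from urllib.parse import urlencode

# Assumed location of the Solr application; a real installation will differ.
SOLR_BASE = "http://localhost:8080/vivosolr"


def build_search_url(term, rows=30):
    """Build a Solr select URL that searches the ALLTEXT field for a term.

    Hypothetical helper: it does not escape Solr query syntax in `term`.
    """
    params = {
        "q": f"ALLTEXT:{term}",  # field name taken from the index field chart
        "rows": rows,            # maximum number of results to return
        "fl": "URI,nameRaw",     # fields to include in each result
        "wt": "json",            # ask Solr for a JSON response
    }
    return f"{SOLR_BASE}/select?{urlencode(params)}"


print(build_search_url("Carpenter"))
```

VIVO would then format the returned records into the results page the user sees.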

...

Solr allows for a "faceted" search, and VIVO displays the facets on the right side of the results page. These allow the user to filter the search results, showing only entries for people, or for organizations, etc.
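A minimal sketch of how facets might be requested and read, assuming the facets are computed on the classgroup field and that Solr returns its default JSON layout, in which each facet field is a flat list of alternating values and counts. The helper names and the sample class group values are hypothetical; facet, facet.field, facet.mincount, and fq are standard Solr parameters.

```python
def facet_params(field="classgroup"):
    """Query parameters that ask Solr to count results per value of `field`."""
    return {"facet": "true", "facet.field": field, "facet.mincount": 1}


def filter_param(value, field="classgroup"):
    """Filter query that narrows the results to a single facet value."""
    return {"fq": f'{field}:"{value}"'}


def parse_facet_counts(response, field="classgroup"):
    """Turn Solr's flat [value, count, value, count, ...] list into a dict."""
    flat = response["facet_counts"]["facet_fields"][field]
    return dict(zip(flat[0::2], flat[1::2]))


# Hand-written sample in the shape of a Solr JSON response (not real data).
sample = {
    "facet_counts": {
        "facet_fields": {"classgroup": ["people", 12, "organizations", 3]}
    }
}

print(parse_facet_counts(sample))
print(filter_param("people"))
```

When the user selects a facet, the same search can be repeated with the filter query added, so that only entries in that group come back.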

Solr within VIVO

VIVO is based around an RDF triple-store, which holds all of its data. However, there are some tasks that a search engine can do much more quickly than a triple-store. Some of the fields in the Solr search index were put there specifically to help with these tasks.

...

Record counts on VIVO's index pages are obtained using the same type of Solr query.
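The exact query is not reproduced here, but one common way to get a count from Solr is to ask for zero rows and read the numFound value from the response. The sketch below assumes that approach, a query on the type field, and a hypothetical class URI; it is not taken from VIVO's code.

```python
def count_params(type_uri):
    """Parameters for a query that returns only a count, with no documents."""
    return {"q": f'type:"{type_uri}"', "rows": 0, "wt": "json"}


def record_count(response):
    """Read the total number of matching records from a Solr JSON response."""
    return response["response"]["numFound"]


# Hand-written sample in the shape of a Solr JSON response (not real data).
sample = {"response": {"numFound": 248, "start": 0, "docs": []}}

print(count_params("http://example.org/ontology#FacultyMember"))
print(record_count(sample))
```

Because no documents are returned, a query like this stays cheap even when many records match.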

How is Solr created and configured?

The VIVO distribution includes a copy of the Solr WAR file. When VIVO is installed, the Solr WAR file is deployed to Tomcat as a web application.

...

If you are installing VIVO in a servlet container other than Tomcat, or if you are installing Solr in a separate servlet container, you will need to tell Solr how to find its home directory. See the instructions in Building a VIVO distribution for other servlet containers.

The search index

What is in the index?

The Solr search index contains one record for each Individual in VIVO, unless that individual is explicitly excluded from the index. Exclusions are usually made for individuals that represent "context nodes" in the VIVO data model.

...

The VIVO data model also contains an individual that represents this teaching activity. That individual will be excluded from the index, since users would almost certainly prefer to find the teacher or the course in their search results, rather than the concept that connects the two.
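The exclusion rule can be pictured as a simple filter applied before a record is written to the index. The sketch below is only an illustration: the type URIs and the should_index helper are hypothetical, and VIVO's actual exclusion rules live in its own code and may use other criteria.

```python
# Hypothetical set of types whose individuals are kept out of the index.
EXCLUDED_TYPES = {"http://example.org/ontology#TeachingActivity"}


def should_index(individual_types, excluded=EXCLUDED_TYPES):
    """Return True unless the individual has at least one excluded type."""
    return not (set(individual_types) & excluded)


teacher = ["http://example.org/ontology#FacultyMember"]
course = ["http://example.org/ontology#Course"]
context_node = ["http://example.org/ontology#TeachingActivity"]

print(should_index(teacher), should_index(course), should_index(context_node))
```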

What is in each record?

Each record in the search index contains several fields (see the chart below). The most commonly used field is alltext (listed as ALLTEXT in the chart). In the record for a faculty member, alltext will contain her name, the name of her department, the names of her classes, the names of her papers and grants, etc. So, if you search for "Carpenter", you might see results for people named Carpenter, people in the Carpentry department, and people who have written papers or worked on grants about carpentry. You would also see results for the department itself, for the papers, and for the grants.
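To illustrate why one search term can surface such different results, the sketch below assembles a simplified record whose ALLTEXT field combines an individual's name with the labels of related items. The build_record and matches helpers, the example URIs, and the sample people are all hypothetical. Matching on a word prefix is only a crude stand-in for the tokenizing and stemming that Solr performs.

```python
import re


def build_record(uri, name, related_labels):
    """Assemble a simplified index record with a combined ALLTEXT field."""
    all_text = " ".join([name, *related_labels])
    return {"URI": uri, "nameRaw": name, "ALLTEXT": all_text}


def matches(record, stem):
    """Crude stand-in for stemmed search: any word starting with `stem`."""
    words = re.findall(r"\w+", record["ALLTEXT"].lower())
    return any(word.startswith(stem.lower()) for word in words)


records = [
    build_record("http://example.org/individual/n1", "Carpenter, Jane",
                 ["Department of History"]),
    build_record("http://example.org/individual/n2", "Smith, John",
                 ["Department of Carpentry"]),
    build_record("http://example.org/individual/n3", "Lee, Ann",
                 ["Department of Physics"]),
]

hits = [r["nameRaw"] for r in records if matches(r, "carpent")]
print(hits)
```

The first person matches on her own name and the second on the name of his department, which is the behavior described above.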

Solr index fields, VIVO 1.6

DocId	nameRaw	PREFERRED_TITLE
URI	nameText	siteURL
ALLTEXT	nameLowercase	siteName
ALLTEXTUNSTEMMED	nameLowercaseSingleValued	THUMBNAIL
classgroup	nameUnstemmed	THUMBNAIL_URL
type	nameStemmed	indexedTime
mostSpecificTypeURIs	acNameUntokenized	timestamp
BETA	acNameStemmed	etag
PROHIBITED_FROM_TEXT_RESULTS	NAME_PHONETIC

When is the index updated?

During normal operation

When an individual is added, edited, or deleted through VIVO's user interface, Solr is given the new information and the index is updated.
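As a sketch of what "giving Solr the new information" can look like on the wire, the snippet below builds update messages in Solr's JSON update format. This is an illustration only: VIVO sends its updates from its own Java code, the helper names and example URI are hypothetical, and using the URI as the record identifier is an assumption.

```python
import json


def add_message(doc):
    """JSON message asking Solr to add a record, or replace it if it exists."""
    return json.dumps({"add": {"doc": doc}})


def delete_message(doc_id):
    """JSON message asking Solr to delete the record with this unique key."""
    return json.dumps({"delete": {"id": doc_id}})


uri = "http://example.org/individual/n1"  # hypothetical individual
print(add_message({"DocId": uri, "URI": uri, "nameRaw": "Carpenter, Jane"}))
print(delete_message(uri))
```

An edit is handled the same way as an add: sending a record with an existing unique key replaces the earlier version.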

...

When an individual is added/edited/deleted, Solr is given the new information and updates the index.
Sometimes the index must be rebuilt.
- Most commonly, this is needed after an ingest, since some of the ingest mechanisms bypass the usual VIVO framework.
  - It would be too slow to update the Solr index on each new statement from the ingest.
  - Work is under way on a search-aware ingest method, which Harvester or other tools could use.
- A rebuild is done on the side. When it finishes, the rebuilt index replaces the previous one, and Solr switches to it.
- VIVO sends Solr requests to
  - search
  - add, update, or delete records
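The idea of rebuilding on the side and then switching over can be pictured with a small in-memory model. This is a conceptual sketch, not how Solr or VIVO implement it: the SearchIndex class and its methods are hypothetical, and a dictionary stands in for the real index.

```python
class SearchIndex:
    """Toy index that is rebuilt off to the side and then swapped in."""

    def __init__(self):
        self._live = {}  # the index that searches currently use

    def update(self, uri, text):
        """Incremental update, as done during normal operation."""
        self._live[uri] = text

    def search(self, term):
        """Return the URIs of live records whose text contains `term`."""
        return sorted(u for u, t in self._live.items() if term in t)

    def rebuild(self, all_individuals):
        """Build a fresh index from scratch, then switch to it in one step."""
        fresh = {}
        for uri, text in all_individuals:
            fresh[uri] = text
        # Searches keep using the old index until this single assignment.
        self._live = fresh


index = SearchIndex()
index.update("n1", "old label")
index.rebuild([("n2", "carpentry grant"), ("n3", "physics paper")])
print(index.search("carpentry"), index.search("old"))
```

Because searches keep reading the previous index until the swap, users are not faced with a half-built index while the rebuild runs.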

Customizing the index

Note
In progress

Building the record
Exclusions

How does VIVO contact Solr?

Note
In progress

VIVO needs to be told how to contact Solr.
- Authorization tests, now obsolete
VIVO may start before Solr does, and usually does.
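Since VIVO may come up before Solr is ready, one way to cope is to retry the connection for a while instead of failing on the first attempt. The sketch below shows that pattern under stated assumptions: wait_for_solr and its parameters are hypothetical, the caller supplies a ping function that returns True once Solr answers, and nothing here is taken from VIVO's startup code.

```python
import time


def wait_for_solr(ping, attempts=10, delay_seconds=2.0, sleep=time.sleep):
    """Call `ping` until it succeeds; return the attempt number, or None.

    `ping` is a caller-supplied function that returns True when Solr
    responds and may raise OSError while Solr is still unreachable.
    """
    for attempt in range(1, attempts + 1):
        try:
            if ping():
                return attempt
        except OSError:
            pass  # connection refused or similar: Solr is not up yet
        if attempt < attempts:
            sleep(delay_seconds)
    return None


# Simulated Solr that only answers on the third try; no real waiting.
answers = iter([False, False, True])
print(wait_for_solr(lambda: next(answers), sleep=lambda seconds: None))
```

In a real deployment the ping function would make an HTTP request to Solr, and a None result would be reported to the administrator rather than ignored.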

...
