- harvesting information from several independent installations of VIVO, or of other software that can produce RDF compatible with the VIVO ontology, in one of three (or possibly more) ways:
- responding to linked open data requests in one of several RDF serializations
- note that this may be directly from a VIVO application or from Harvard Profiles
- or from another application configured to return RDF
- e.g., Iowa's Loki software does not store data natively in RDF but can return it in response to linked data requests
- or using D2R (http://d2rq.org)
- or using tools such as John Fereira's semantic services, although these were designed to deliver data from VIVO to other applications not configured to consume RDF directly
- returning an entire file of RDF from a web-accessible directory (a file with only the statements about the URI requested; it may also be possible to return one big file containing that URI)
- responding to SPARQL query requests from a public SPARQL endpoint
- or, if the harvesting tool is provided with credentials, from a private SPARQL endpoint
- indexing the information harvested, including the original URI in the source system and a subset of the content associated with that URI in the source system, to facilitate text-based searching
- providing a simple, Google-like search with options to limit in advance by type of result (e.g., people, organizations, publications, events)
- providing results that have been relevance ranked across the sources being searched, in contrast to federated searches
- providing short snippets of text for each result to aid interpretation
- providing faceted display to aid users in filtering results; the two current facets are source institution and the type of result
- linking back from each result to the source so that the full scope of the result can be seen in its original context
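As a rough illustration of the first and third harvesting routes listed above (linked open data requests and SPARQL endpoint queries), the sketch below builds the two kinds of HTTP request without sending them. The example host, the order of preference among RDF serializations, and the function names are assumptions made for illustration; they do not describe the actual harvester. Credentials for a private endpoint are not covered.

```python
from urllib.parse import urlencode
from urllib.request import Request

# RDF serializations to ask for; the order of preference is an assumption.
RDF_ACCEPT = "text/turtle, application/rdf+xml;q=0.9, application/n-triples;q=0.8"


def linked_data_request(uri: str) -> Request:
    """Build a GET request asking the source system to describe `uri` as RDF."""
    return Request(uri, headers={"Accept": RDF_ACCEPT})


def sparql_request(endpoint: str, uri: str) -> Request:
    """Build a SPARQL Protocol GET request for the statements whose subject is `uri`."""
    query = f"CONSTRUCT {{ <{uri}> ?p ?o }} WHERE {{ <{uri}> ?p ?o }}"
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "text/turtle"})


if __name__ == "__main__":
    # Hypothetical URI and endpoint, used only to show the resulting requests.
    person = "http://vivo.example.edu/individual/n123"
    print(linked_data_request(person).get_header("Accept"))
    print(sparql_request("http://vivo.example.edu/sparql", person).full_url)
```

Passing either request to `urllib.request.urlopen` would perform the fetch; the returned RDF would then go to the indexing step.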
- What features are desired for the search?
- What type of search?
- What is the goal of the search?
- Full text? - yes
- "semantic"? - future – the indexing takes advantage of the semantic structure of the VIVO ontology to include relevant text in the Solr document for each entry, but the search interface does not support queries that depend directly on the semantic relationships (e.g., find all principal investigators of grants investigating cancer who have collaborations with researchers on depression)
- faceted? - yes, though this could benefit from expansion
- Complex queries? - future
- For people? - yes
- For publications, organizations, etc? - yes, but needs further refinement
Build an index that supports the desired types of search, and a web site that helps users query that index. Keep the index up to date.
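A minimal sketch of what one index record and one search request might look like, assuming a Solr index as mentioned in the notes above. The field names (`uri`, `site`, `type`, `name`, `alltext`) and both helper functions are hypothetical; the query parameters (`q`, `df`, `fq`, `rows`, `facet`, `facet.field`, `hl`, `hl.fl`) are standard Solr parameters. Because every source is indexed into one place, a single query can rank results across institutions.

```python
from urllib.parse import urlencode


def build_index_doc(uri, site, result_type, name, text_fragments):
    """Assemble one flat index record for a harvested URI (field names are hypothetical)."""
    return {
        "uri": uri,            # original URI in the source system, used to link back
        "site": site,          # source institution (facet)
        "type": result_type,   # type of result, e.g. "people" (facet)
        "name": name,
        # Subset of the content associated with the URI, for text search and snippets.
        "alltext": " ".join(text_fragments),
    }


def build_search_params(user_query, result_type=None, site=None, rows=10):
    """Build a Solr query string for a faceted keyword search with snippets."""
    params = [
        ("q", user_query),
        ("df", "alltext"),        # default field to search
        ("rows", rows),
        ("facet", "true"),
        ("facet.field", "site"),  # the two current facets
        ("facet.field", "type"),
        ("hl", "true"),           # highlighting supplies the short snippets
        ("hl.fl", "alltext"),
    ]
    if result_type:
        params.append(("fq", f'type:"{result_type}"'))  # limit by type in advance
    if site:
        params.append(("fq", f'site:"{site}"'))
    return urlencode(params)


if __name__ == "__main__":
    print(build_search_params("cancer", result_type="people"))
```

Keeping the index up to date would then amount to re-harvesting each source on some schedule and replacing the record for each `uri`; the schedule itself is left open here.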