...

The following vignettes walk through the steps of a "typical" harvest, focusing primarily on functionality rather than on configuration or execution.

Fetch

The first step of a typical harvest is to get your data from your target source. We call this the Fetch. For example, suppose we have a VIVO installation containing researchers at our university, and we want to harvest information from Pubmed on publications written by those researchers. In this case we would use Harvester's PubmedFetch tool to send a query to Pubmed, which returns the results of that query in its own XML format. Harvester's Fetch package (org.vivoweb.harvester.fetch) contains various tools for retrieving data from external data sources.
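To make the idea of "sending a query off to Pubmed" concrete, here is a minimal sketch of how such a query URL might be built. It is not Harvester's PubmedFetch code: the class PubmedQuerySketch, the method buildSearchUrl, and the sample affiliation term are hypothetical, and the sketch assumes the query goes to NCBI's public E-utilities esearch endpoint.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Harvester: builds the kind of search URL
// a Pubmed fetch could send to NCBI's E-utilities esearch endpoint.
final class PubmedQuerySketch {
    // Assumption: the public esearch endpoint is the query target.
    static final String ESEARCH =
        "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi";

    static String buildSearchUrl(String term, int maxRecords) {
        // URL-encode the term so spaces and brackets survive transport.
        String encoded = URLEncoder.encode(term, StandardCharsets.UTF_8);
        return ESEARCH + "?db=pubmed&term=" + encoded + "&retmax=" + maxRecords;
    }

    public static void main(String[] args) {
        // Illustrative term: publications affiliated with a made-up university.
        System.out.println(buildSearchUrl("Example University[Affiliation]", 100));
    }
}
```

A real fetch would then request this URL and store the returned XML for the translation step; the sketch stops at building the URL so that it runs without network access.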

Translate

The next step of a typical harvest is the translation. The fetched data will be in its source's own format and needs to be converted into VIVO-compatible triples. If the input is in an XML format, this can be done using the XSLTranslator tool and a .xsl file containing XSLT code that converts that specific data format into RDF/XML triples. Harvester includes several pre-written XSLT files for frequently needed formats (including, for example, Pubmed) in the config/datamaps/ directory. Another standard method is to prepare a SPARQL Construct query using the VIVO UI that takes in RDF data and transforms it to conform to the VIVO ontology. You can use the SPARQL Translator to run SPARQL Construct files against target models.
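The XSLT route described above can be sketched with the standard javax.xml.transform API. This is an illustration of the technique only, not the XSLTranslator implementation: the class XslTranslateSketch, the toy stylesheet, the input element names (records, record, title), and the http://example.org/ URIs are all hypothetical, and a real datamap from config/datamaps/ would be far more detailed.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical sketch, not Harvester code: applies an XSLT stylesheet to
// source XML and returns the resulting RDF/XML as a string.
final class XslTranslateSketch {
    // Toy stylesheet: each <record> becomes one rdf:Description with a label.
    static final String TOY_XSL =
        "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
        + " xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'"
        + " xmlns:rdfs='http://www.w3.org/2000/01/rdf-schema#'>"
        + "<xsl:template match='/records'><rdf:RDF>"
        + "<xsl:for-each select='record'>"
        + "<rdf:Description rdf:about='http://example.org/pub/{@id}'>"
        + "<rdfs:label><xsl:value-of select='title'/></rdfs:label>"
        + "</rdf:Description>"
        + "</xsl:for-each>"
        + "</rdf:RDF></xsl:template>"
        + "</xsl:stylesheet>";

    static String translate(String xml, String xsl) {
        try {
            // Compile the stylesheet, then run it over the input document.
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            // Wrap the checked exception to keep the sketch's callers simple.
            throw new IllegalStateException("XSLT translation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<records><record id='42'>"
            + "<title>Sample title</title></record></records>";
        System.out.println(translate(xml, TOY_XSL));
    }
}
```

The stylesheet is where the format-specific knowledge lives, which is why a separate .xsl file is needed for each source format while the translating code stays the same.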

...