...

  • The magic black box doesn't yet exist, although the tools for working with RDF are improving all the time and the VIVO ontology has gained traction as a standard for information exchange in research networking.
  • Furthermore, unless you are starting with an empty VIVO, the process of preparing RDF for VIVO will have to be able to query your VIVO to make sure it is not duplicating data already in VIVO, including people, organizations, and the content of the dataset at hand. When your data comes to you from several sources, alignment based on names alone is prone to errors, including false positive matches and false negatives, leading to duplicate URIs for the same person, organization, or other entity (a minimal duplicate-check sketch follows this list).
  • Finally, the data you add will very likely not remain static. Your data ingest methodology must very quickly also serve as a data updating and data removal methodology.
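
One way to approach the matching problem is to query your VIVO's SPARQL endpoint before minting any new URIs. The sketch below uses Python and the SPARQLWrapper library; the endpoint URL is hypothetical, and the name-based match is deliberately naive, so adapt both (and prefer identifiers such as ORCID iDs) for real use.

    from SPARQLWrapper import SPARQLWrapper, JSON

    ENDPOINT = "http://vivo.example.edu/api/sparqlQuery"  # hypothetical; use your instance's endpoint

    def find_existing_person(label):
        """Return URIs of foaf:Person resources whose label matches."""
        sparql = SPARQLWrapper(ENDPOINT)
        sparql.setQuery("""
            PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
            PREFIX foaf: <http://xmlns.com/foaf/0.1/>
            SELECT ?person WHERE {
                ?person a foaf:Person ;
                        rdfs:label ?label .
                FILTER (LCASE(STR(?label)) = LCASE("%s"))
            }""" % label.replace('"', '\\"'))
        sparql.setReturnFormat(JSON)
        results = sparql.query().convert()
        return [b["person"]["value"] for b in results["results"]["bindings"]]

    # A name-only match is exactly where false positives and false
    # negatives creep in, so treat an empty result as "probably new".
    matches = find_existing_person("Smith, Jane")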

...

  • You can enter sample data into VIVO through its editing interfaces, export the data, and write your own scripts to produce data matching what you see. This sounds like a cop-out on the part of the VIVO community, but some people with a lot of ETL experience prefer to leverage tools they already know to produce a given target format.
    • The VIVO ontology team has developed a number of visual diagrams of the VIVO ontology, at both overview and specific levels, to help you understand VIVO data; these may allow you to bypass or minimize time spent on sample data entry.
    • The Karma tool from USC's Information Sciences Institute has been extended to support the creation of VIVO-compatible RDF.
    • Furthermore, there is a growing number of open-source libraries for writing RDF in PHP, Java, and Python (one of these, Python's rdflib, is sketched after this list).
    • There are also commercially developed and supported tools, including TopBraid Composer.
    • If you know of other open source or commercial tools, please add links to them here.
  • You can use the VIVO Harvester, a framework for gathering data, transforming it, matching it against data already in your VIVO instance, and adding it directly to VIVO, bypassing the data tools in VIVO. There is a learning curve to the Harvester, but a number of VIVO technical teams use it extensively and have benefitted from shared experience in improving the Harvester framework on an ongoing basis.
  • You can write tools to transform your data to RDF using an ontology of your choosing or creation, import that RDF into VIVO, and use tools within VIVO to align data with existing data and transform data to the VIVO ontology, as described below under "Working with semantic rather than scripting tools."
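
If you choose to write your own RDF, the sketch below shows the general shape using Python's rdflib. The class and property choices here (foaf:Person, vivo:Position, vivo:relatedBy) are illustrative assumptions; verify them against RDF exported from your own VIVO, since property names differ between ontology versions.

    from rdflib import Graph, Literal, Namespace, RDF, RDFS, URIRef

    VIVO = Namespace("http://vivoweb.org/ontology/core#")
    FOAF = Namespace("http://xmlns.com/foaf/0.1/")
    BASE = "http://vivo.example.edu/individual/"  # hypothetical URI namespace

    g = Graph()
    g.bind("vivo", VIVO)
    g.bind("foaf", FOAF)

    person = URIRef(BASE + "n1234")  # minting stable URIs is up to your process
    g.add((person, RDF.type, FOAF.Person))
    g.add((person, RDFS.label, Literal("Smith, Jane")))

    position = URIRef(BASE + "n1234-pos")
    g.add((position, RDF.type, VIVO.Position))
    g.add((position, RDFS.label, Literal("Professor")))
    # How a person and a position are linked depends on your VIVO version
    # (e.g., vivo:relatedBy in more recent ontologies); check exported data.
    g.add((person, VIVO.relatedBy, position))

    g.serialize(destination="people.rdf", format="xml")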

The VIVO Harvester

The VIVO Harvester can be configured for a wide variety of tasks.

  • Configuration files can be adjusted to get data from different sources and in different forms.
  • The Harvester is quite modular. Some sites use parts of the Harvester to accomplish parts of their ingest, and use home-brewed tools for the rest.
  • Sometimes a home-brew Perl script can be more easily tailored to your special needs.
  • A wide variety of tools are out there to combine with the Harvester; one small scripted example follows this list.
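
As one example of combining your own scripting with VIVO, the sketch below pushes a handful of triples through VIVO's SPARQL UPDATE API using Python's requests library. The endpoint path, the email/password form fields, and the default graph URI reflect recent VIVO releases, but all three are assumptions to confirm against your own instance's documentation before relying on them.

    import requests

    update = """
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    INSERT DATA {
      GRAPH <http://vitro.mannlib.cornell.edu/default/vitro-kb-2> {
        <http://vivo.example.edu/individual/n1234> rdfs:label "Smith, Jane" .
      }
    }
    """

    resp = requests.post(
        "http://vivo.example.edu/api/sparqlUpdate",  # hypothetical host
        data={"email": "vivo_root@example.edu",      # a VIVO root account
              "password": "secret",
              "update": update},
    )
    resp.raise_for_status()

Small batches like this also point toward an update strategy: the same API accepts DELETE DATA, which you will need when source records change or disappear.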

The Harvester has been extensively documented throughout its lifetime by its original developers at the University of Florida and through the work of other VIVO developers and implementers at other institutions. Please see the Data ingest and the Harvester section for full details.

Working with semantic rather than scripting tools

...