Much of this documentation remains the same across VIVO releases, but some may not have been fully updated to the most recent release. While we will attempt to identify and alert you in such cases, please be aware that your VIVO may look and act slightly differently from what is represented here.
This document provides an overview of the data ingest process for VIVO.
- The early sections describe some typical data sources for VIVO and the challenges often associated with them, such as handling multiple values, representing information that is true only for certain periods of time, and the like.
- Later sections describe different technical and workflow options for loading data into VIVO.
- Some more specific examples are provided, but readers should expect to modify or extend them to reflect their local data needs, the format of their sources, and the depth of technical skills available to them, such as the ability to write or modify XSLT scripts.
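To give a flavor of the kind of transformation an ingest script performs, the sketch below converts a small table of people into RDF triples in N-Triples form, using only the Python standard library. This is a hypothetical illustration, not a VIVO tool: the `http://example.org/individual/` base URI is a placeholder for a local namespace, and `foaf:name` stands in for whichever ontology properties a real ingest would use.

```python
import csv
import io

# Placeholder namespaces -- a real VIVO ingest would use the VIVO and
# FOAF ontologies appropriate to the data being loaded.
BASE = "http://example.org/individual/"
FOAF = "http://xmlns.com/foaf/0.1/"

def csv_to_ntriples(csv_text):
    """Turn CSV rows with 'id' and 'name' columns into N-Triples lines."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"<{BASE}{row['id']}>"
        # One triple per row: subject, predicate, literal object.
        triples.append(f'{subject} <{FOAF}name> "{row["name"]}" .')
    return "\n".join(triples)

sample = "id,name\nn123,Jane Smith\nn456,Wei Chen\n"
print(csv_to_ntriples(sample))
```

Real ingests add steps this sketch omits, such as cleaning values, resolving each row against individuals already in VIVO, and typing the resulting resources, but the core pattern of mapping tabular fields to subject-predicate-object statements is the same.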
Other sources of information
- How VIVO differs from a spreadsheet, where VIVO data typically comes from, cleaning data prior to loading, matching against data already in VIVO, and doing further cleanup once it's in VIVO
- Ingest tools: home brew or off the shelf? — Major options including the Harvester, semantic ingest tools such as Karma, and XSLT
- Typical ingest processes — Alternative approaches to ingest and making ingest repeatable
- Challenges for data ingest — Challenges in the data, in workflow, in working incrementally, in modeling, and in migration
- Monitoring for quality
- Name disambiguation and entity resolution
- Advanced PubMed name matching diagram
- Alternative converters from tabular data to RDF
- Ingest Workflow Language
- VIVO PHP Person Data Library
- 2011 Conference Workshop on Extended Ingest by Example
- 2012 VIVO Conference workshop: Survey of VIVO Data Ingest Methods
- VIVO 1.2 Data Ingest Guide
- A Generalizable, XSLT Based RDF Ingest Example
- Faculty affiliation ingest example