Updates in system sources not yet reflected in profiles
back up to User Stories: Defining features and functionality VIVO needs - September 2011
User types involved
Narrative User Story (for sharing/review/voting)
to be written
Background
A VIVO instance is normally fed from multiple institutional data sources. Some of these reflect changes continuously, while others are updated only nightly or even less frequently.
For some data sources, updates can be handled by wholesale RDF retraction and replacement. This normally applies when users are not permitted to make any changes to the data, such as information from the Registrar on courses offered in a given semester. Other data, such as narrative statements about research or teaching, may have been edited in VIVO, either by a self-editor or by a curator correcting a typo or removing unwanted HTML tags, so a wholesale replacement would overwrite those edits.
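As an illustration of wholesale retraction and replacement, here is a minimal sketch that assumes each data source is loaded into its own named graph, so that a re-harvest can drop and reload that graph without touching data edited by users. The in-memory `store` dictionary, the graph names, and the `replace_source_graph` function are hypothetical stand-ins, not VIVO's actual API.

```python
from typing import Dict, Set, Tuple

Triple = Tuple[str, str, str]


def replace_source_graph(
    store: Dict[str, Set[Triple]], graph: str, fresh: Set[Triple]
) -> Tuple[int, int]:
    """Retract everything previously loaded from one source, then assert the
    fresh harvest. Returns (number retracted, number asserted)."""
    retracted = store.pop(graph, set())
    store[graph] = set(fresh)
    return len(retracted), len(fresh)


# Hypothetical store: one graph per source, plus a graph for user edits.
store = {
    "registrar-courses": {
        ("course:101", "label", "Intro Biology"),
        ("course:101", "term", "2011-Spring"),
    },
    "user-edits": {
        ("person:7", "overview", "Studies soil microbes."),
    },
}

fresh = {
    ("course:101", "label", "Intro Biology"),
    ("course:101", "term", "2011-Fall"),
}

counts = replace_source_graph(store, "registrar-courses", fresh)
```

Only the Registrar graph is replaced here; the graph holding user edits is left alone. This is why the approach suits read-only sources and not narrative text that may have been corrected in VIVO.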
If the data source has appropriately scoped last-modified tags, it may be possible to scan the entire data source and quickly assess the number of changes, on the assumption that every change is flagged by a last-modified date later than the previous harvest. Otherwise a more laborious item-by-item comparison may be needed to determine where changes have occurred, either by comparing new data against what is in VIVO (which, due to data transformations, may not be as simple as it sounds) or by saving a copy of the source data from the most recent previous ingest from that data source.
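The two detection strategies described above might be sketched as follows. The record layout (`id`, `title`, `last_modified`) and the function names are assumptions made for illustration. The second strategy compares a fingerprint of each record against fingerprints saved at the previous ingest, which is one possible way of "saving a copy" without keeping the full source data.

```python
import hashlib
import json
from datetime import datetime


def changed_since(records, previous_harvest):
    """Strategy 1: trust the source's last-modified tags."""
    return [r for r in records if r["last_modified"] > previous_harvest]


def fingerprint(record):
    """Stable hash of a record's content (keys sorted, dates stringified)."""
    payload = json.dumps(record, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def changed_vs_snapshot(records, snapshot):
    """Strategy 2: item-by-item comparison against fingerprints saved at the
    previous ingest. `snapshot` maps record id -> fingerprint."""
    return [r for r in records if snapshot.get(r["id"]) != fingerprint(r)]


# Hypothetical source records.
records = [
    {"id": "p1", "title": "Professor", "last_modified": datetime(2011, 9, 1)},
    {"id": "p2", "title": "Lecturer", "last_modified": datetime(2011, 9, 20)},
]

flagged = changed_since(records, previous_harvest=datetime(2011, 9, 15))

# Save fingerprints at ingest time, then compare a later harvest against them.
snapshot = {r["id"]: fingerprint(r) for r in records}
later = [dict(records[0], title="Associate Professor"), records[1]]
differing = changed_vs_snapshot(later, snapshot)
```

In the second harvest the title of `p1` changed but its `last_modified` value did not, so only the fingerprint comparison catches it. The date-based scan is cheaper but depends entirely on the source maintaining its tags.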
In either case, a delta will indicate the number of changes and the nature of the changes – and this could be a useful report, both for assessing when to add the data to VIVO and for monitoring the nature of the changes.
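A delta report of this kind could be as simple as the sketch below, which assumes the previous and current harvests are each available as a mapping from record id to a dictionary of fields. The `delta_report` function and the sample data are hypothetical; a real report would likely work at the level of RDF statements.

```python
from collections import Counter


def delta_report(previous, current):
    """Summarize the number and nature of changes between two harvests."""
    added = sorted(current.keys() - previous.keys())
    removed = sorted(previous.keys() - current.keys())
    modified = []
    changed_fields = Counter()
    for key in sorted(previous.keys() & current.keys()):
        old, new = previous[key], current[key]
        fields = [
            f for f in sorted(old.keys() | new.keys()) if old.get(f) != new.get(f)
        ]
        if fields:
            modified.append(key)
            changed_fields.update(fields)
    return {
        "added": added,
        "removed": removed,
        "modified": modified,
        "changed_fields": dict(changed_fields),
    }


# Hypothetical harvests keyed by record id.
previous = {
    "p1": {"title": "Professor", "dept": "Biology"},
    "p2": {"title": "Lecturer", "dept": "Physics"},
}
current = {
    "p1": {"title": "Professor Emeritus", "dept": "Biology"},
    "p3": {"title": "Postdoc", "dept": "Chemistry"},
}

report = delta_report(previous, current)
```

The counts of added, removed, and modified records indicate how large the update is, and the per-field tally shows what kind of change dominates. Both could inform the decision about when to load the data into VIVO.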
Wish list for improvement
to be written
Technical considerations
to be written
Priority or staging considerations
Progress will be made on two fronts for VIVO 1.5:
- the "Graph and provenance metadata" component
- the "RDF API" component, which will provide a way to optionally store and retrieve logging/audit information for use in reporting.