
Component description

This feature will support storing metadata about the provenance of the triples in any graph used by the VIVO application, including graphs created while VIVO is running. Unless a VIVO implementation segments data into a large number of tiny graphs, this feature will not support detailed provenance about particular assertions. It will only support provenance for the current state of the collection of assertions in a graph.

If the VIVO application needed, for example, to record the fact that a particular assertion was hand-edited through the interface, it would need to use a particular graph only for manual edits (and not populate this graph with triples by any other means).
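The graph-level granularity described above can be illustrated with a minimal sketch. It is not VIVO code: the `GraphStore` and `GraphProvenance` classes, the `method` field, and the `example.org` graph URIs are all hypothetical names chosen for illustration. The point is only that provenance is attached to a graph, so a triple inherits whatever is recorded for the graph or graphs that contain it.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


@dataclass
class GraphProvenance:
    """Metadata recorded once per graph (hypothetical fields)."""
    source_name: str
    method: str  # e.g. "manual-edit" or "ingest"


class GraphStore:
    """Toy in-memory quad store with one provenance record per graph."""

    def __init__(self) -> None:
        self._graphs: Dict[str, List[Triple]] = {}
        self._provenance: Dict[str, GraphProvenance] = {}

    def register_graph(self, graph_uri: str, provenance: GraphProvenance) -> None:
        # Provenance is stored against the graph URI, never against a triple.
        self._graphs.setdefault(graph_uri, [])
        self._provenance[graph_uri] = provenance

    def add(self, graph_uri: str, triple: Triple) -> None:
        if graph_uri not in self._provenance:
            raise KeyError(f"no provenance registered for {graph_uri}")
        self._graphs[graph_uri].append(triple)

    def provenance_of(self, triple: Triple) -> List[GraphProvenance]:
        # A triple's provenance is that of every graph containing it.
        return [
            self._provenance[uri]
            for uri, triples in self._graphs.items()
            if triple in triples
        ]


store = GraphStore()
store.register_graph(
    "http://example.org/graph/manual-edits",  # hypothetical graph URI
    GraphProvenance("VIVO editing interface", "manual-edit"),
)
store.register_graph(
    "http://example.org/graph/hr-ingest",  # hypothetical graph URI
    GraphProvenance("HR database", "ingest"),
)
name_triple = (
    "http://example.org/person/1",
    "http://www.w3.org/2000/01/rdf-schema#label",
    "Jane Doe",
)
store.add("http://example.org/graph/manual-edits", name_triple)
print([p.method for p in store.provenance_of(name_triple)])
```

In this sketch, the "hand-edited" fact is only trustworthy as long as nothing other than the editing interface writes to the manual-edits graph, which is the constraint the paragraph above describes.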

Scope of the component for v1.5

This component involves giving users access to this metadata in a form that is meaningful to them (end users see something different from database administrators), and exposing provenance metadata to linked data consumers so they can decide how to use the RDF data they harvest.

This feature should be useful both to people who have profiles in VIVO and those who manage ingest of content.

Implementation sites will have control over the way data is segmented into graphs (and thus over the granularity of provenance) as well as over how the provenance data is displayed (e.g. whether there will be a contact option for correcting or updating content at the source).

Design work needed for the component

  • What metadata to record about each graph
      • PROV-O? Some other ontology?
  • Where to store the metadata
  • How to expose provenance metadata
      • PROV-AQ?
  • Application changes
      • Semantic infrastructure changes to support modifying and querying provenance metadata
  • UI design
      • UI for an end user to discover where data on his/her profile came from
          • How does the user access this feature?
          • What does it look like?
          • Appropriate language that is actually meaningful to the user (e.g. not “this came from http://blah.blah/kb-2” but something like “this came from the HR database,” “this came from the main VIVO database,” etc.), possibly with the date/time the source last did an update
      • UI design for the admin interface (to set names for sources, optionally set contact info for sources, etc.)
  • Any new permissions/roles required to support the above features
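The user-facing wording discussed above could be produced from admin-entered source names along the following lines. This is a sketch under assumptions: the `SourceInfo` record, its fields, the `describe_source` function, the date, and the contact address are hypothetical, and the graph URI is the placeholder used in the design notes.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Optional


@dataclass
class SourceInfo:
    """Admin-maintained description of one graph's source (hypothetical)."""
    display_name: str
    contact: Optional[str] = None
    last_updated: Optional[datetime] = None


def describe_source(graph_uri: str, sources: Dict[str, SourceInfo]) -> str:
    """Turn a graph URI into a sentence an end user can understand."""
    info = sources.get(graph_uri)
    if info is None:
        return "The source of this information is not recorded."
    message = f"This came from {info.display_name}"
    if info.last_updated is not None:
        message += f" (last updated {info.last_updated:%Y-%m-%d %H:%M})"
    message += "."
    if info.contact:
        message += f" To request a correction, contact {info.contact}."
    return message


sources = {
    "http://blah.blah/kb-2": SourceInfo(
        display_name="the HR database",
        last_updated=datetime(2012, 3, 1, 9, 30),  # made-up timestamp
    ),
    "http://blah.blah/kb-1": SourceInfo(
        display_name="the main VIVO database",
        contact="vivo-help@example.org",  # made-up address
    ),
}
print(describe_source("http://blah.blah/kb-2", sources))
```

Keeping the display name and contact details in a separate admin-editable record, rather than deriving them from the graph URI, lets each site choose its own wording without renaming graphs.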

Can this component be addressed in stages?

Groundwork could be laid by initially storing metadata without any useful way of displaying it to users (other than writing SPARQL queries, etc.) but this feature would be most useful with its corresponding UI work.
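For the groundwork stage, where metadata is reachable only through SPARQL, a query could look like the one built below. The sketch assumes that PROV-O is the chosen ontology (still an open question in the design notes) and that the metadata is kept in its own named graph; the `METADATA_GRAPH` URI and the `build_provenance_query` helper are hypothetical.

```python
# Hypothetical URI of the graph that would hold per-graph metadata.
METADATA_GRAPH = "http://example.org/graph-metadata"

# Characters that are not allowed inside a SPARQL IRI reference.
_FORBIDDEN = set('<>"{}|^`\\ ')


def build_provenance_query(graph_uri: str,
                           metadata_graph: str = METADATA_GRAPH) -> str:
    """Build a SPARQL SELECT for the metadata recorded about one graph."""
    for uri in (graph_uri, metadata_graph):
        if any(ch in _FORBIDDEN for ch in uri):
            raise ValueError(f"not a safe IRI: {uri!r}")
    return f"""PREFIX prov: <http://www.w3.org/ns/prov#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?label ?generated ?agent
WHERE {{
  GRAPH <{metadata_graph}> {{
    OPTIONAL {{ <{graph_uri}> rdfs:label ?label }}
    OPTIONAL {{ <{graph_uri}> prov:generatedAtTime ?generated }}
    OPTIONAL {{ <{graph_uri}> prov:wasAttributedTo ?agent }}
  }}
}}"""


query = build_provenance_query("http://blah.blah/kb-2")
print(query)
```

The resulting string could be run against whatever SPARQL endpoint or query interface a site has available; how VIVO itself would expose this is not specified here.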

Dependency on any other component

This component will go through the RDF API but has no particular dependencies on other components.

A suggested incremental plan
