Overview

This document contains a general discussion of the overall approach, followed by descriptions of the change management approaches known to be in use at various institutions.


General Discussion on Approaches

Question: Which might be easier...  When data is changed, approaches to identifying changes...  


How can the impact of a change be distinguished?  The answer determines how much information to present, since human users want to focus on the impactful changes.


Best practices questions:


Two audiences of the stream:

Action based workflows...




Change management documents produced by authoritative providers

Library of Congress

LOC currently publishes an Atom feed for each of its datasets.  Examples:

These feeds indicate whether and when resources have been created, updated, or deleted/deprecated.
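
As a rough illustration of how a downstream consumer might read such a feed, the following Python sketch parses a small, made-up Atom document using only the standard library. Everything specific in it is an assumption made for illustration and is not taken from LOC's actual feeds: the sample feed, its example.org URIs, the rule that an entry with equal published and updated timestamps counts as newly created, and the use of Atom tombstones (the deleted-entry element defined in RFC 6721) to signal deletions.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"
TOMBSTONE = "{http://purl.org/atompub/tombstones/1.0}"

# Hypothetical feed content; not copied from any real LOC feed.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:at="http://purl.org/atompub/tombstones/1.0">
  <title>Hypothetical authority feed</title>
  <entry>
    <id>http://example.org/authorities/a1</id>
    <published>2020-01-05T00:00:00Z</published>
    <updated>2020-01-05T00:00:00Z</updated>
  </entry>
  <entry>
    <id>http://example.org/authorities/a2</id>
    <published>2015-03-01T00:00:00Z</published>
    <updated>2020-01-06T00:00:00Z</updated>
  </entry>
  <at:deleted-entry ref="http://example.org/authorities/a3"
                    when="2020-01-07T00:00:00Z"/>
</feed>"""


def summarize_feed(xml_text):
    """Return (action, resource URI, timestamp) tuples in feed order."""
    changes = []
    for element in ET.fromstring(xml_text):
        if element.tag == ATOM + "entry":
            uri = element.findtext(ATOM + "id")
            published = element.findtext(ATOM + "published")
            updated = element.findtext(ATOM + "updated")
            # Assumption: equal timestamps mean the resource is new.
            action = "created" if published == updated else "updated"
            changes.append((action, uri, updated))
        elif element.tag == TOMBSTONE + "deleted-entry":
            # Assumption: deletions are published as RFC 6721 tombstones.
            changes.append(("deleted", element.get("ref"), element.get("when")))
    return changes


for change in summarize_feed(SAMPLE_FEED):
    print(change)
```

A real consumer would also need to page through the feed and remember the last timestamp it processed; both are left out of this sketch.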

LOC is in the process of implementing Activity Streams.  This work will duplicate the information communicated by the current Atom feeds, but it will also permit LOC to offer more specific activity streams, such as one devoted solely to authoritative label changes.  Some issues being considered:
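
To make the idea of a more specific stream concrete, here is a minimal Python sketch that filters one page of an activity stream down to label changes only. The page structure follows the general Activity Streams 2.0 shape (an OrderedCollectionPage whose orderedItems are Create, Update, and Delete activities). The example.org URIs and the "ex:changedProperty" key are hypothetical: Activity Streams 2.0 does not define a standard way to say which property changed, and nothing here describes LOC's actual design.

```python
import json

# Hypothetical page of an activity stream. The "ex:changedProperty" key is
# invented for this sketch; it is not part of Activity Streams 2.0.
SAMPLE_PAGE = json.loads("""
{
  "@context": "https://www.w3.org/ns/activitystreams",
  "type": "OrderedCollectionPage",
  "orderedItems": [
    {"type": "Create", "published": "2020-01-05T00:00:00Z",
     "object": {"id": "http://example.org/authorities/a1"}},
    {"type": "Update", "published": "2020-01-06T00:00:00Z",
     "object": {"id": "http://example.org/authorities/a2"},
     "ex:changedProperty": "authoritativeLabel"},
    {"type": "Update", "published": "2020-01-06T12:00:00Z",
     "object": {"id": "http://example.org/authorities/a4"},
     "ex:changedProperty": "note"},
    {"type": "Delete", "published": "2020-01-07T00:00:00Z",
     "object": {"id": "http://example.org/authorities/a3"}}
  ]
}
""")


def select_activities(page, activity_type=None, changed_property=None):
    """Return the object ids of activities that pass both optional filters."""
    selected = []
    for activity in page["orderedItems"]:
        if activity_type is not None and activity["type"] != activity_type:
            continue
        if (changed_property is not None
                and activity.get("ex:changedProperty") != changed_property):
            continue
        selected.append(activity["object"]["id"])
    return selected


# A "label changes only" view of the general stream.
label_changes = select_activities(SAMPLE_PAGE, "Update", "authoritativeLabel")
print(label_changes)
```

Whether such filtering happens on the provider side (a separate stream) or on the consumer side (as here) is one of the design choices a provider would have to make.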


MeSH


Getty


3rd party vendor notification

When an entity is requested and is not available, some third-party vendors allow the requester to be notified when the authority provider adds the entity to the authority.

Example format:




Change management processing by downstream users

Traditional authority file in an ILS

For many libraries, "authority change management" is a paid service acquired from a third party. Monitoring changes in an authoritative source and determining whether a change is relevant to one's own system is not a simple task, and it requires specialized software that may or may not be available in an ILS. Many libraries have therefore opted to outsource the task to an authorities vendor (e.g. Backstage Library Works). The services that the vendor provides may include the following:

  1. Keep a copy of the library's authority file. This file may be provided by the library at the beginning of the contract, or built by the vendor by matching headings in bibliographic records against one or more authoritative sources.

  2. Keep track of all unmatched headings (i.e. headings used in bib records that had no match when last searched) and partial matches (e.g. only the name portion of a name/title heading has a match, or only the main subject heading of a main plus subdivision(s) heading has a match).

  3. Monitor changes in the authoritative source (i.e. new additions, changes to an existing record, deletion or deprecation of an existing record).

  4. Based on this monitoring, provide an updated record when a record that is in the library's authority file has changed.

  5. At a specified period (e.g. every three months) re-search all previously unmatched headings against the latest version of the authoritative source to see if there is now a match. If so, provide the matching "new" record to the library or have the library send the bib records with that heading for re-processing.

  6. At a specified period (e.g. every three months) re-search all previous partial matches against the latest version of the authoritative source to see if there is now a full match. If so, provide the full match record or have the library send bib records with the partial match for re-processing.
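
The periodic re-searching described in steps 2, 5, and 6 can be sketched in Python as follows. This is only an illustration under simplifying assumptions and does not describe how any vendor actually works: a heading is modeled as a tuple of components (name or main heading first, then a title or subdivisions), the authoritative source is modeled as a set of such tuples, and matching is exact string equality. The headings themselves are made up.

```python
from typing import Dict, Set, Tuple

Heading = Tuple[str, ...]  # (name or main heading, then title/subdivisions)
RANK = {"unmatched": 0, "partial": 1, "full": 2}


def classify(heading: Heading, authority: Set[Heading]) -> str:
    """Classify a heading as a full, partial, or missing match."""
    if heading in authority:
        return "full"
    # Partial match: only the leading component has an authority record.
    if len(heading) > 1 and heading[:1] in authority:
        return "partial"
    return "unmatched"


def recheck(pending: Dict[Heading, str],
            authority: Set[Heading]) -> Dict[Heading, str]:
    """Re-search pending headings; return those whose status improved."""
    improved = {}
    for heading, old_status in pending.items():
        new_status = classify(heading, authority)
        if RANK[new_status] > RANK[old_status]:
            improved[heading] = new_status
    return improved


# Hypothetical data: the source as first searched, then three months later.
old_source = {("Example, Author",), ("Cooking",)}
bib_headings = [
    ("Example, Author", "Example title"),
    ("Cooking", "History"),
    ("Unknown, Person",),
]
pending = {h: classify(h, old_source) for h in bib_headings}

new_source = old_source | {("Cooking", "History"), ("Unknown, Person",)}
print(recheck(pending, new_source))
```

Only headings whose status improved are returned, which corresponds to the records the vendor would supply or the bib records it would ask to have re-processed.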

By using such a service, authority change management for the library is reduced to loading the records provided by the vendor. The complicated task of monitoring the change stream of the authoritative source and determining what is relevant is performed by the vendor. Because the boundary of a MARC authority record is unambiguous (it begins with the first character of the leader and ends with the record terminator), a changed record can be maintained simply by swapping out the entire record. Knowing exactly what has changed may be unnecessary, or it may already be covered by the profile set up with the vendor, which determines what kinds of changed records should be supplied.
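
The whole-record swap can be illustrated with a short Python sketch that works directly on the MARC transmission format. It relies on the standard record layout (a 24-byte leader, a directory of 12-byte entries, field terminator 0x1E, record terminator 0x1D) and assumes that records are matched on the control number in field 001. That matching rule, and the function names, are assumptions for this sketch; an ILS or vendor may match records differently.

```python
RECORD_TERMINATOR = b"\x1d"
FIELD_TERMINATOR = b"\x1e"


def split_records(data):
    """Split a byte string of concatenated MARC records at the terminators."""
    return [chunk + RECORD_TERMINATOR
            for chunk in data.split(RECORD_TERMINATOR) if chunk]


def control_number(record):
    """Read field 001 by way of the leader and directory of one record."""
    base = int(record[12:17])        # leader 12-16: base address of data
    directory = record[24:base - 1]  # 12-byte entries following the leader
    for i in range(0, len(directory), 12):
        tag = directory[i:i + 3]
        length = int(directory[i + 3:i + 7])
        start = int(directory[i + 7:i + 12])
        if tag == b"001":
            field = record[base + start:base + start + length]
            return field.rstrip(FIELD_TERMINATOR).decode("utf-8")
    raise ValueError("record has no 001 field")


def apply_updates(authority_file, vendor_file):
    """Swap in whole records by control number; append records that are new."""
    records = {control_number(r): r for r in split_records(authority_file)}
    for record in split_records(vendor_file):
        # No field-level comparison: the old record is replaced entirely.
        records[control_number(record)] = record
    return b"".join(records.values())
```

Note that apply_updates never inspects what changed inside a record, which is the point made above: with an unambiguous record boundary, replacement is enough.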

This same model could be applied, with some variations, to future entity-based cataloging (e.g. records created in Sinopia). In fact, Stanford is currently talking with Share-VDE about establishing such a service. However, there are some major differences between the MARC environment and the RDF environment, namely:



Change document specification approaches

rdflib

Reference: https://rdrr.io/cran/rdflib/man/rdflib-package.html

rdflib is an R package for manipulating RDF triples.  It is worth looking at how it specifies changes.

rdf_add

Reference: https://rdrr.io/cran/rdflib/man/rdf_add.html

rdf_add could be used to specify adding a triple.

Example:

library(rdflib)

## create an empty RDF graph to add triples to
rdf <- rdf()
rdf_add(rdf, 
    subject="http://www.dajobe.org/",
    predicate="http://purl.org/dc/elements/1.1/language",
    object="en")
    
## non-URI string in subject indicates a blank subject
## (prefixes to "_:b0")
rdf_add(rdf, "b0", "http://schema.org/jobTitle", "Professor") 

## identically a blank subject.  
## Note rdf is unchanged when we add the same triple twice.
rdf_add(rdf, "b0", "http://schema.org/jobTitle", "Professor", 
        subjectType = "blank") 
        
## blank node with empty string creates a default blank node id
rdf_add(rdf, "", "http://schema.org/jobTitle", "Professor")   
                    

## Subject and Object both recognized as URI resources:
rdf_add(rdf, 
        "https://orcid.org/0000-0002-1642-628X",
        "http://schema.org/homepage", 
        "http://carlboettiger.info")  

## Force object to be literal, not URI resource
rdf_add(rdf, 
        "https://orcid.org/0000-0002-1642-628X",
        "http://schema.org/homepage", 
        "http://carlboettiger.info",
        objectType = "literal")  
        


SPARQL

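Triples can be added and deleted with SPARQL 1.1 Update.  The first request below inserts a triple (the empty WHERE clause matches exactly once, so the triple is inserted unconditionally); the second deletes the same triple with DELETE DATA.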

PREFIX dc: <http://purl.org/dc/elements/1.1/>
INSERT { <http://example/egbook> dc:title  "This is an example title" } WHERE {}


PREFIX dc: <http://purl.org/dc/elements/1.1/>
DELETE DATA { <http://example/egbook> dc:title  "This is an example title" }
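
If SPARQL Update were used as the change document format, a change such as a revised title could be expressed as a DELETE DATA of the old triple followed by an INSERT DATA of the new one. The Python sketch below builds such a request as a string. The function names and the revised title are hypothetical, the literal escaping covers only backslashes, quotes, and line breaks, and nothing is sent to a SPARQL endpoint.

```python
def literal(value):
    """Quote a plain string literal, escaping the characters that need it."""
    escaped = (value.replace("\\", "\\\\")
                    .replace('"', '\\"')
                    .replace("\n", "\\n")
                    .replace("\r", "\\r"))
    return '"' + escaped + '"'


def triple(subject, predicate, obj, obj_is_uri=False):
    """Format one triple; subject and predicate are always URIs here."""
    obj_term = "<" + obj + ">" if obj_is_uri else literal(obj)
    return "<{}> <{}> {}".format(subject, predicate, obj_term)


def change_to_sparql(removed, added):
    """Build one SPARQL Update request from removed and added triples."""
    operations = []
    if removed:
        operations.append("DELETE DATA { " + " . ".join(removed) + " }")
    if added:
        operations.append("INSERT DATA { " + " . ".join(added) + " }")
    # Operations within a single update request are separated by ";".
    return " ;\n".join(operations)


TITLE = "http://purl.org/dc/elements/1.1/title"
request = change_to_sparql(
    removed=[triple("http://example/egbook", TITLE,
                    "This is an example title")],
    added=[triple("http://example/egbook", TITLE,
                  "A revised example title")],  # hypothetical new value
)
print(request)
```

Putting the delete before the insert keeps the request readable as "old value, then new value"; with DELETE DATA and INSERT DATA on distinct triples the end state is the same either way.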