Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Change management documents produced by authoritative providers

Library of Congress

LOC currently implements Activity Stream and has types.ATOM feeds for each of its datasets.  Examples:

This provides information about whether and when resources have been created, updated, or deleted/deprecated.

LOC is in the process of implementing Activity Stream.  This work will duplicate the information communicated with the current ATOM feeds, but will also permit LOC to offer more specific activity streams, such as one devoted solely to authoritative label changes.  Some issues being considered:

  •  URI URI belongs to a scheme when active (e.g. subject scheme).  This is identified by a triple.  The scheme triple is removed when it is deprecated.  Point is: Deprecation alters the resource in fundamental ways.
  • Sometimes a term may move from one authority (e.g. subject) to another (e.g. genre).  Adds a triple identifying the new term (e.g. use_instead predicate).  Complicated when the change is a split.  Which term's URI should be used for use_instead?

Activity Stream is under development and is likely going to be a change management feed using JSON-LD.  Currently, LOC is using an Atom Feed to provide some basic change management info.

MeSH

  • documents deletions

Getty

  • uses Activity Streams

...

  • . This makes redirection or advertisement of the 'replacement' URI easy.  But sometimes a single term may be split into two or more.  No longer a one-to-one replacement, it is unclear what a suitable replacement would be for the old URI.


MeSH

  • documents deletions


Getty


3rd party vendor notification

When an entity is requested and is not available, some 3rd party vendor's allow the requestor to be notified when the authority provider adds the entity to the authority.

Example format:



...

Change management processing by downstream users

Traditional authority file in an ILS

For many libraries, "authority change management" is a paid service acquired from a third party. Because monitoring changes in an authoritative source and determining whether the change is relevant to one's own system is not a simple task, and requires specialized software that may or may not be available in an ILS, many libraries have opted to outsource the task to an authorities vendor (e.g. Backstage Library Works). Services that the vendor provides may include the following:

  1. Keep a copy of the library's authority file. This file may be provided by the library at the beginning of the contract, or built by the vendor by matching headings in bibliographic records against one or more authoritative sources.

  2. Keep track of all unmatched headings (i.e. headings used in bib records that had no match when last searched) and partial matches (e.g. only the name portion of a name/title heading has a match, or only the main subject heading of a main plus subdivision(s) heading has a match)

  3. Monitor changes in the authoritative source (i.e. new additions, change of an existing record, deletion or deprecation of an existing record)

  4. Based on this monitoring, provide an updated record when a record that is in the library's authority file has changed.

  5. At a specified period (e.g. every three months) re-search all previously unmatched headings against the latest version of the authoritative source to see if there is now a match. If so, provide the matching "new" record to the library or have the library send the bib records with that heading for re-processing.

  6. At a specified period (e.g. every three months) re-search all previous partial matches against the latest version of the authoritative source to see if there is now a full match. If so, provide the full match record or have the library send bib records with the partial match for re-processing

By using such a service, authority change management for the library is reduced to loading records provided by the vendor. The complicated task of monitoring the change stream of the authoritative source and determining  what is relevant is performed by the vendor. And because the boundary of a MARC authority record is unambiguous (it begins with the first character of the leader and ends with the record terminator), maintaining a changed record can be done simply by swapping out the entire record. Knowing what has changed may be unnecessary or is already specified in the profile set up with the vendor that determines what kind of changed records should be supplied.

This same model could be applied, with some variations, to future entity-based cataloging (e.g. records created in Sinopia). In fact, Stanford is talking with Share-VDE right now to establish such a service. However, there are some major differences between the MARC environment and the RDF environment, namely:

  • In MARC, what constitutes an authority record is clear and unambiguous. In RDF, you have an entity description in the form of a graph. As has already been pointed out elsewhere in this document, the boundary of such an authority "record" is not as clear. For example:

    • Should the "record" be all the triples that have that entity as the subject?

    • If a statement points to another node (either with URI or without), should that node be included? How to determine whether it should be or not?

    • Should all nodes that are not empty be included, and all nodes that are empty be considered the boundary of that "record"?

    • What if you have an authoritative entity that is the logical combination of multiple entities, e.g. a series could be construed as the combination of a bf:Work and a bf:Instance, in that case do you include two graphs as one authority "record"?

    • When would it be appropriate to unambiguously identify a graph by naming it, i.e. add a fourth element to the triple to tie a bunch of related triples together. The advantage of that is you can swap out the old version and replace it with the new without having to know what has changed.
       
  • In RDF, you don't necessarily cache the whole thing locally as you do a MARC authority record. If you only cache the label and the URI, does it mean that you don't care if a variant label is added to the description? Perhaps this last point could be addressed by setting up a profile with the vendor, so that you only get what is relevant to you, leaving the rest at the source. This would require the more granular monitoring of the "change stream" that we have discussed in our meetings


...

Change document specification approaches

...