...

Change management processing by downstream users

Traditional authority file in an ILS

For many libraries, "authority change management" is a paid service acquired from a third party. Monitoring changes in an authoritative source and determining whether each change is relevant to one's own system is not a simple task, and it requires specialized software that may or may not be available in an ILS. Many libraries have therefore opted to outsource the task to an authorities vendor (e.g. Backstage Library Works). Services that the vendor provides may include the following:

  1. Keep a copy of the library's authority file. This file may be provided by the library at the beginning of the contract, or built by the vendor by matching headings in bibliographic records against one or more authoritative sources.

  2. Keep track of all unmatched headings (i.e. headings used in bib records that had no match when last searched) and partial matches (e.g. only the name portion of a name/title heading has a match, or only the main subject heading of a heading with one or more subdivisions has a match).

  3. Monitor changes in the authoritative source (i.e. new additions, changes to an existing record, deletion or deprecation of an existing record).

  4. Based on this monitoring, provide an updated record when a record that is in the library's authority file has changed.

  5. At a specified interval (e.g. every three months) re-search all previously unmatched headings against the latest version of the authoritative source to see if there is now a match. If so, provide the matching "new" record to the library or have the library send bib records with that heading for re-processing.

  6. At a specified interval (e.g. every three months) re-search all previous partial matches against the latest version of the authoritative source to see if there is now a full match. If so, provide the full match record or have the library send bib records with the partial match for re-processing.
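Steps 3 through 5 can be sketched roughly in Python as follows. This is a minimal illustration under stated assumptions, not a description of how any vendor actually works: the AuthorityRecord class, the run_cycle function, and the integer version field are all hypothetical, and headings are matched by exact string equality, whereas real matching would involve normalization and the partial-match handling of step 6.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuthorityRecord:
    """Hypothetical stand-in for a record in the authoritative source."""
    record_id: str
    heading: str
    version: int  # assumed to increase whenever the source record changes


def run_cycle(library_file, unmatched, source):
    """One monitoring cycle.

    library_file: dict of record_id -> version the library currently holds
    unmatched:    set of headings that had no match when last searched
    source:       list of AuthorityRecord, the latest authoritative source
    """
    by_id = {rec.record_id: rec for rec in source}
    by_heading = {rec.heading: rec for rec in source}

    # Step 4: records the library holds whose source version now differs.
    changed = [
        by_id[rid]
        for rid, held_version in library_file.items()
        if rid in by_id and by_id[rid].version != held_version
    ]
    # Step 3: records the library holds that are gone from the source.
    deleted = [rid for rid in library_file if rid not in by_id]
    # Step 5: previously unmatched headings that now have a match.
    new_matches = [by_heading[h] for h in sorted(unmatched) if h in by_heading]
    return changed, deleted, new_matches
```

The three return values correspond to what the library would receive: updated records to load, identifiers to review for deletion, and newly matched records.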

By using such a service, authority change management for the library is reduced to loading records provided by the vendor. The complicated task of monitoring the change stream of the authoritative source and determining what is relevant is performed by the vendor. And because the boundary of a MARC authority record is unambiguous (it begins with the first character of the leader and ends with the record terminator), maintaining a changed record can be done simply by swapping out the entire record. Knowing exactly what has changed may be unnecessary, or the kinds of changed records to be supplied may already be specified in the profile set up with the vendor.
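As a rough illustration of the whole-record swap, the following Python sketch splits a vendor-supplied file of binary MARC (ISO 2709) records at the record terminator and replaces local records keyed by control number (field 001). The function names and the dict-based local file are assumptions made for illustration; an actual load would normally go through the ILS or an established MARC library rather than hand-parsing bytes.

```python
RECORD_TERMINATOR = b"\x1d"  # ends every MARC record
FIELD_TERMINATOR = b"\x1e"   # ends the directory and every field


def split_records(data: bytes) -> list[bytes]:
    """Split a byte stream into whole records, keeping each terminator."""
    return [
        chunk + RECORD_TERMINATOR
        for chunk in data.split(RECORD_TERMINATOR)
        if chunk.strip()
    ]


def control_number(record: bytes) -> str:
    """Read field 001 by walking the directory that follows the leader."""
    base = int(record[12:17])          # leader 12-16: base address of data
    directory = record[24:base - 1]    # 12-byte entries, minus terminator
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12]
        tag, length, start = entry[0:3], int(entry[3:7]), int(entry[7:12])
        if tag == b"001":
            field = record[base + start:base + start + length]
            return field.rstrip(FIELD_TERMINATOR).decode("utf-8")
    raise ValueError("record has no 001 field")


def load_vendor_file(local_file: dict[str, bytes], vendor_data: bytes) -> dict[str, bytes]:
    """Return a copy of the local file with vendor records swapped in whole.

    No field-level comparison is made: the incoming record simply
    replaces whatever is stored under the same control number.
    """
    updated = dict(local_file)
    for record in split_records(vendor_data):
        updated[control_number(record)] = record
    return updated
```

Nothing in this sketch needs to know which field changed, which is the point made above: the record boundary alone is enough to perform the update.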

This same model could be applied, with some variations, to future entity-based cataloging (e.g. records created in Sinopia). In fact, Stanford is talking with Share-VDE right now to establish such a service. However, there are some major differences between the MARC environment and the RDF environment, namely:

  • In MARC, what constitutes an authority record is clear and unambiguous. In RDF, you have an entity description in the form of a graph. As has already been pointed out elsewhere in this document, the boundary of such an authority "record" is not as clear. For example:

    • Should the "record" be all the triples that have that entity as the subject?

    • If a statement points to another node (either with a URI or without), should that node be included? How do we determine whether it should be?

    • Should all nodes that are not empty be included, and all nodes that are empty be considered the boundary of that "record"?

    • What if you have an authoritative entity that is the logical combination of multiple entities? For example, a series could be construed as the combination of a bf:Work and a bf:Instance. In that case, do you include two graphs as one authority "record"?

    • When would it be appropriate to unambiguously identify a graph by naming it, i.e. to add a fourth element to each triple to tie a set of related triples together? The advantage is that you can swap out the old version and replace it with the new one without having to know what has changed.
       
  • In RDF, you don't necessarily cache the whole description locally as you do with a MARC authority record. If you only cache the label and the URI, does it mean that you don't care if a variant label is added to the description? Perhaps this last point could be addressed by setting up a profile with the vendor, so that you only get what is relevant to you, leaving the rest at the source. This would require the more granular monitoring of the "change stream" that we have discussed in our meetings.
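To make these questions concrete, here is a small Python sketch of one possible set of answers. It is only an assumption-laden illustration: triples are plain tuples of strings, blank nodes are written with a "_:" prefix, and the function names are hypothetical. The boundary rule shown (all triples about the entity, following blank nodes but stopping at URIs) is similar in spirit to the W3C "Concise Bounded Description" submission; other rules are equally defensible.

```python
Triple = tuple[str, str, str]  # (subject, predicate, object) as strings


def is_blank(node: str) -> bool:
    """Assumed convention: blank node identifiers start with '_:'."""
    return node.startswith("_:")


def bounded_description(triples: set[Triple], entity: str) -> set[Triple]:
    """One candidate 'record' boundary: every triple with the entity as
    subject, plus the triples of any blank node reachable from it.
    Nodes identified by a URI are treated as the edge of the record."""
    record: set[Triple] = set()
    pending, seen = [entity], set()
    while pending:
        subject = pending.pop()
        if subject in seen:
            continue
        seen.add(subject)
        for triple in triples:
            if triple[0] == subject:
                record.add(triple)
                if is_blank(triple[2]):
                    pending.append(triple[2])
    return record


def replace_named_graph(store: dict[str, set[Triple]], graph_name: str,
                        new_triples: set[Triple]) -> None:
    """Named-graph swap: drop the old graph and store the new one whole,
    without needing to know which triples changed."""
    store[graph_name] = set(new_triples)


def is_relevant(changed_predicates: set[str], cached_predicates: set[str]) -> bool:
    """Profile check: a change matters to a downstream user only if it
    touches a predicate that user actually caches."""
    return bool(changed_predicates & cached_predicates)
```

Under this sketch, a library that caches only the preferred label would see is_relevant return False for a change that adds a variant label, which is the kind of profile-based filtering suggested above.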


...

Change document specification approaches

...