

This page briefly outlines the specifications for validating a Fedora 3 to OCFL migration. The validation tool should begin with coarse validations and then progress to finer-grained ones.

TBD: should the validations to be performed be specified as a runtime parameter (list)?

TBD: validation of external and redirect datastreams?

Objects

Validate: number of objects

Valid: number of objects in the OCFL repository is equal to the number of objects in the Fedora 3 repository.

Validate: object IDs

Valid: every object in the OCFL repository has the same ID as its corresponding object in the Fedora 3 repository.
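
The two object-level checks above can be sketched as a single set comparison. This is a minimal illustration, assuming both ID lists have already been collected by some means not specified here; the function name `compare_object_ids` and the sample IDs are hypothetical.

```python
def compare_object_ids(fedora3_ids, ocfl_ids):
    """Compare the object IDs found in each repository.

    Both arguments are iterables of ID strings; how they are gathered
    is outside the scope of this sketch.
    """
    f3, ocfl = set(fedora3_ids), set(ocfl_ids)
    return {
        "count_matches": len(f3) == len(ocfl),
        "missing_in_ocfl": sorted(f3 - ocfl),
        "unexpected_in_ocfl": sorted(ocfl - f3),
    }


# Equal counts do not imply equal IDs, so both checks are reported.
result = compare_object_ids(
    ["demo:1", "demo:2", "demo:3"],
    ["demo:1", "demo:3", "demo:4"],
)
print(result)
```

Reporting the count separately keeps the coarse check cheap, while the set differences identify exactly which objects need attention.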

Object Content

Validate: object metadata

Valid: the HEAD version of the OCFL object.nt metadata (HEAD determined from the object's top-level inventory.json manifest) matches the current version of the Fedora 3 object metadata for the following properties:

  • lastModifiedDate
  • createdDate
  • ownerId
  • label
  • state

Note that Fedora 3 content models will be verified as part of the examination of the RELS-EXT datastream (lower-level validation). 
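
The object metadata comparison can be sketched as a property-by-property check. This sketch assumes both sides have already been parsed into plain dictionaries keyed by the property names listed above; the actual predicates used in object.nt are not specified on this page, and `compare_properties` is a hypothetical helper.

```python
OBJECT_PROPERTIES = ("lastModifiedDate", "createdDate", "ownerId", "label", "state")


def compare_properties(fedora3_props, ocfl_props, names=OBJECT_PROPERTIES):
    """Return {property: (fedora3_value, ocfl_value)} for every mismatch.

    A property missing on one side is reported with None for that side.
    """
    mismatches = {}
    for name in names:
        expected, actual = fedora3_props.get(name), ocfl_props.get(name)
        if expected != actual:
            mismatches[name] = (expected, actual)
    return mismatches


# Illustrative values only.
fedora3 = {"ownerId": "fedoraAdmin", "label": "Demo object", "state": "A"}
ocfl = {"ownerId": "fedoraAdmin", "label": "Demo object (edited)", "state": "A"}
mismatches = compare_properties(fedora3, ocfl)
print(mismatches)
```

The same helper could be reused for datastream metadata by passing a different tuple of property names.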

Validate: list of datastreams

Valid: the set of datastreams listed in the HEAD version of the object's top-level inventory.json manifest matches the set of current (HEAD) datastreams for the object in the Fedora 3 repository.
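
As a rough sketch of how the HEAD datastream list might be read from an OCFL inventory: the `head`, `versions`, and `state` keys are part of the OCFL inventory structure, but the way logical paths map to datastream IDs is an assumption here (each datastream is taken to have a `<DSID>.nt` file, with `object.nt` excluded). The inventory below is trimmed to the fields the sketch needs, and its digests and paths are made up.

```python
import json


def head_logical_paths(inventory):
    """Return the set of logical paths in the HEAD version's state."""
    head = inventory["head"]
    state = inventory["versions"][head]["state"]
    return {path for paths in state.values() for path in paths}


def head_datastream_ids(inventory):
    """Derive datastream IDs from '<DSID>.nt' logical paths (assumed layout)."""
    ids = set()
    for path in head_logical_paths(inventory):
        name = path.rsplit("/", 1)[-1]
        if name.endswith(".nt") and name != "object.nt":
            ids.add(name[: -len(".nt")])
    return ids


# Trimmed, illustrative inventory; a real one has more required fields.
inventory = json.loads("""
{
  "head": "v2",
  "versions": {
    "v1": {"state": {"aaa": ["object.nt"], "bbb": ["OLD.nt"]}},
    "v2": {"state": {"aaa": ["object.nt"], "ccc": ["DC.nt"], "ddd": ["RELS-EXT.nt"]}}
  }
}
""")

fedora3_dsids = {"DC", "RELS-EXT", "OBJ"}
missing_in_ocfl = sorted(fedora3_dsids - head_datastream_ids(inventory))
print(missing_in_ocfl)
```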

Datastream Content

Validate: datastream metadata

Valid: the HEAD version of the OCFL <DSID>.nt metadata (HEAD determined from the object's top-level inventory.json manifest) matches the current version of the Fedora 3 <DSID> metadata for the following properties:

  • messageDigest
  • size
  • mimeType
  • state
  • title
  • identifier (DSID)
  • lastModified
  • created

Validate: datastream size

Valid: the size of the HEAD version of the datastream in OCFL matches the size of the datastream on disk in the Fedora 3 repository

Valid: the size of the HEAD version of the datastream in OCFL recorded in <DSID>.nt matches the size of the OCFL file on disk

Validate: datastream checksum

Valid: the algorithm type and checksum value of the datastream recorded in the HEAD version of the OCFL <DSID>.nt metadata file match the type and value of the checksum for the datastream in the Fedora 3 repository

Valid: the checksum of the HEAD version of the datastream file in OCFL matches the checksum value recorded in the corresponding <DSID>.nt metadata file, when calculated using the algorithm recorded in the metadata file
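
The on-disk size and checksum checks can be sketched with the Python standard library. This is an illustration under assumptions: the recorded size, algorithm, and digest are taken to have been read from the `<DSID>.nt` file already, algorithm names are assumed to look like "MD5" or "SHA-256" (normalized below for `hashlib`), and `check_fixity` is a hypothetical helper name.

```python
import hashlib
import os
import tempfile


def file_digest(path, algorithm, chunk_size=1 << 20):
    """Compute a hex digest of a file, reading it in chunks."""
    name = algorithm.replace("-", "").lower()  # e.g. "SHA-256" -> "sha256"
    digest = hashlib.new(name)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_fixity(path, recorded_size, recorded_algorithm, recorded_digest):
    """Return a list of error strings; an empty list means the file is valid."""
    errors = []
    actual_size = os.path.getsize(path)
    if actual_size != recorded_size:
        errors.append(f"size: recorded {recorded_size}, on disk {actual_size}")
    actual_digest = file_digest(path, recorded_algorithm)
    if actual_digest.lower() != recorded_digest.lower():
        errors.append(
            f"{recorded_algorithm}: recorded {recorded_digest}, computed {actual_digest}"
        )
    return errors


# Demonstration with a temporary file standing in for a datastream file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
path = tmp.name
expected = hashlib.sha256(b"hello").hexdigest()
ok = check_fixity(path, 5, "SHA-256", expected)
bad = check_fixity(path, 6, "SHA-256", expected)
os.remove(path)
print(ok, bad)
```

Using the algorithm recorded in the metadata, rather than a fixed one, follows the rule that the checksum is calculated with whatever algorithm the metadata file names.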

Versions

Validate: object versions

To the best of my knowledge, Fedora 3 does not version changes to object metadata.  Suggestions for validating changes to object metadata, as opposed to datastream versions, are welcome.

Validate: number of datastream versions

Valid: number of datastream versions for a datastream in an object in OCFL matches the number of versions of the same datastream in the same object in the Fedora 3 repository

Validate: datastream versions

Valid: each datastream version in the OCFL object has a corresponding version in the Fedora 3 repository, as determined by matching the datastream creationDate recorded in the OCFL version <DSID>.nt file to a datastream version creationDate in the Fedora 3 repository
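
Matching versions by creation date can be sketched as follows. The sketch assumes creation dates are ISO 8601 UTC timestamps on both sides and parses them before comparing, so that differences in formatting (for example, with or without milliseconds) do not cause false mismatches. The function names and sample timestamps are hypothetical.

```python
from datetime import datetime


def parse_timestamp(value):
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' for UTC."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def match_versions(fedora3_dates, ocfl_dates):
    """Compare datastream version creation dates from the two repositories."""
    f3 = {parse_timestamp(d) for d in fedora3_dates}
    ocfl = {parse_timestamp(d) for d in ocfl_dates}
    return {
        "count_matches": len(fedora3_dates) == len(ocfl_dates),
        "unmatched_ocfl": sorted(d.isoformat() for d in ocfl - f3),
        "unmatched_fedora3": sorted(d.isoformat() for d in f3 - ocfl),
    }


versions = match_versions(
    ["2019-05-01T12:00:00.000Z", "2020-02-10T08:30:00.000Z"],
    ["2019-05-01T12:00:00Z"],
)
print(versions)
```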

Validate: datastream version content

Perform the validations listed above under "Datastream Content" on each version of a datastream in OCFL.


Other Requirements

Report

A validation run should produce a report that gives the user an overall summary of the results, counts of errors by type, the list of objects for each error type, a summary of errors for each object, and detailed validation logs for each object.
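
One possible shape for the report data is sketched below. It assumes each validation failure is recorded as an `(object_id, error_type, message)` tuple; that record format, the error type names, and the `summarize` function are all hypothetical.

```python
from collections import Counter, defaultdict


def summarize(validated_ids, errors):
    """Aggregate validation errors into the views the report needs.

    errors is a list of (object_id, error_type, message) tuples.
    """
    by_type = defaultdict(set)
    by_object = defaultdict(list)
    for object_id, error_type, message in errors:
        by_type[error_type].add(object_id)
        by_object[object_id].append((error_type, message))
    return {
        "objects_validated": len(validated_ids),
        "objects_with_errors": len(by_object),
        "error_counts": dict(Counter(error_type for _, error_type, _ in errors)),
        "objects_by_error_type": {t: sorted(ids) for t, ids in by_type.items()},
        "errors_by_object": dict(by_object),
    }


report = summarize(
    ["demo:1", "demo:2", "demo:3"],
    [
        ("demo:2", "checksum", "DC: digest mismatch"),
        ("demo:2", "size", "DC: recorded 10, on disk 12"),
        ("demo:3", "checksum", "OBJ: digest mismatch"),
    ],
)
print(report["error_counts"])
```

The detailed per-object logs would be written separately; this structure covers only the summary views.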

Resume from object

The tool should allow the user to start a validation run from a specified object ID, for example to resume an interrupted run.

Validate objects in list

The user should be able to provide a list of object IDs to include in a validation routine.

Validate all objects

The user should be able to instruct the validator to validate all objects in the repository.
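
The three selection modes above (resume from an object, validate a list, validate all) can be sketched as one filter. This assumes objects are processed in a deterministic order, here sorted by ID, so that "resume from" is well defined; the `select_objects` function and its parameter names are hypothetical.

```python
from itertools import dropwhile


def select_objects(all_ids, resume_from=None, only=None):
    """Choose which object IDs to validate.

    With no options, every object is selected. 'only' restricts the run to
    the given IDs; 'resume_from' skips everything before that ID in sorted
    order. If resume_from is not present, nothing is selected.
    """
    ids = sorted(all_ids)
    if only is not None:
        wanted = set(only)
        ids = [object_id for object_id in ids if object_id in wanted]
    if resume_from is not None:
        ids = list(dropwhile(lambda object_id: object_id != resume_from, ids))
    return ids


repository_ids = ["demo:3", "demo:1", "demo:2"]
everything = select_objects(repository_ids)
resumed = select_objects(repository_ids, resume_from="demo:2")
listed = select_objects(repository_ids, only=["demo:3", "demo:1"])
print(everything, resumed, listed)
```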






