Attendees
- Andrew Woods
- Danny Bernstein
- David Wilcox
- Scott Prater
Agenda
- Specifying the validation tool
- Developing the validation tool
- Testing the validation tool
- Wrap-up and next steps
Notes
Specifying the validation tool
- Specification is based on going from coarse to fine-grained validation
- Doesn't deal with disseminators, just standard Fedora 3 objects
- Fedora 6 does generate from specific files so we'll need to check if any of the migrated information ends up there
- Making some assumptions about versioning, checksums, etc.
- Could we also validate relationships?
- Fedora 3 doesn't do this so this might not be appropriate
- Possible issues
- XML attributes can appear in a different order post-migration; how could we handle this?
- Size would be the same but checksum would be different
- Could run the files through sort and generate checksums to compare
- Is there a 1:1 mapping between a Fedora 3 object and a Fedora 6 object?
- Yes, F3 objects are created as F6 objects in archival groups
- Output
- How should the results be laid out?
- A high-level report could be generated that says "I looked at 10,000 objects and they look ok"
- A second level could indicate how many objects have problems and what those problems are
- Need to also report on the number of objects that were examined to ensure nothing was skipped
- Levels of validation
- Coarse to fine-grained
- Objects, object content, datastream content, versions
- Select level(s) to validate each run
- Validation based on a list of IDs rather than the entire repository
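One possible way to handle the attribute-ordering issue raised above is to compute the checksum over a normalized form of the XML instead of the raw bytes. The sketch below is an assumption, not something agreed in the meeting: rather than running the file through sort, it parses the XML, sorts each element's attributes by name, and hashes the result. The class name `AttributeOrderInsensitiveChecksum` is hypothetical, and the normalized string is used only as digest input (values are not escaped, so it is not valid XML output).

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;

// Hypothetical helper: a checksum that ignores XML attribute order.
public class AttributeOrderInsensitiveChecksum {

    // Returns a hex-encoded SHA-256 over a serialization with attributes sorted by name.
    public static String checksum(byte[] xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Element root = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml))
                .getDocumentElement();

        StringBuilder normalized = new StringBuilder();
        serialize(root, normalized);

        byte[] digest = MessageDigest.getInstance("SHA-256")
                .digest(normalized.toString().getBytes(StandardCharsets.UTF_8));
        StringBuilder hex = new StringBuilder();
        for (byte b : digest) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    // Walks the DOM depth-first; element order and text are preserved as-is.
    private static void serialize(Node node, StringBuilder out) {
        short type = node.getNodeType();
        if (type == Node.ELEMENT_NODE) {
            out.append('<').append(node.getNodeName());
            // TreeMap iterates in key order, which removes any dependence on source order.
            TreeMap<String, String> sorted = new TreeMap<>();
            NamedNodeMap attrs = node.getAttributes();
            for (int i = 0; i < attrs.getLength(); i++) {
                sorted.put(attrs.item(i).getNodeName(), attrs.item(i).getNodeValue());
            }
            sorted.forEach((name, value) ->
                    out.append(' ').append(name).append("=\"").append(value).append('"'));
            out.append('>');
            for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
                serialize(child, out);
            }
            out.append("</").append(node.getNodeName()).append('>');
        } else if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
            out.append(node.getNodeValue());
        }
        // Comments and processing instructions are deliberately ignored in this sketch.
    }
}
```

With this approach the recorded Fedora 3 checksum could not be compared directly; both the F3 and F6 copies would need to be run through the same normalization. Whitespace differences are still treated as significant here.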
Developing the tool
- Is this a command line tool?
- migration-utils is a command line tool so this should be fine
- Can it be multi-threaded?
- This would help performance for large repositories
- What programming language would be most appropriate?
- This will be file-system to file-system validation
- We already have good Java tools based on migration-utils
- Use F3 libraries to read F3 and OCFL libraries to read F6 content
- Danny and Andrew will do most of the development
- Tool should be delivered by December or January
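As a rough illustration of how the multi-threading and reporting points could fit together, the sketch below runs a per-object check over a list of IDs on a fixed thread pool and tracks both the number of objects examined and the problems found. Everything named here (`ParallelValidationSketch`, `Report`, the `check` function) is hypothetical; the actual check would read F3 and OCFL content, which is not shown.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Function;

// Hypothetical sketch: validate a list of object IDs in parallel and summarize the results.
public class ParallelValidationSketch {

    public static final class Report {
        public final long examined;                 // how many objects were actually checked
        public final Map<String, String> problems;  // object ID -> description of the problem

        Report(long examined, Map<String, String> problems) {
            this.examined = examined;
            this.problems = problems;
        }

        // High-level line; the problems map provides the second, more detailed level.
        public String summary() {
            return "Examined " + examined + " objects, " + problems.size() + " with problems";
        }
    }

    // check returns an empty Optional when the object looks OK, or a problem description.
    public static Report validate(List<String> ids,
                                  Function<String, Optional<String>> check,
                                  int threads) throws InterruptedException {
        AtomicLong examined = new AtomicLong();
        Map<String, String> problems = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);

        for (String id : ids) {
            pool.execute(() -> {
                try {
                    check.apply(id).ifPresent(problem -> problems.put(id, problem));
                } catch (RuntimeException e) {
                    // A failed check is reported as a problem rather than silently skipped.
                    problems.put(id, "validation failed: " + e);
                } finally {
                    examined.incrementAndGet();
                }
            });
        }

        pool.shutdown();
        // The one-hour limit is an arbitrary placeholder for this sketch.
        if (!pool.awaitTermination(1, TimeUnit.HOURS)) {
            throw new IllegalStateException("validation did not finish in time");
        }
        return new Report(examined.get(), problems);
    }
}
```

Comparing `examined` against the size of the input ID list gives a simple way to confirm that nothing was skipped.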
Testing the tool
- Need a set of test fixtures that contain errors we want the tool to detect
- Need to validate against objects in a variety of repositories
- Also need to validate exported F3 objects
Next Steps
- Come up with an initial design
- Set up basic infrastructure
- Basic validation to start - number of objects
- Create JIRA tickets for the work
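The initial object-count validation could be sketched as a comparison of two sets of object IDs, one listed from the Fedora 3 source and one from the Fedora 6 target. How those ID sets are obtained is left open here, and the class and method names are hypothetical.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of the coarsest validation level: do the same objects exist on both sides?
public class ObjectCountCheck {

    // True when both sides contain the same number of objects.
    public static boolean countsMatch(Set<String> sourceIds, Set<String> targetIds) {
        return sourceIds.size() == targetIds.size();
    }

    // IDs present in the source but absent from the target, in sorted order for stable reports.
    public static Set<String> missingFromTarget(Set<String> sourceIds, Set<String> targetIds) {
        Set<String> missing = new TreeSet<>(sourceIds);
        missing.removeAll(targetIds);
        return missing;
    }
}
```

Reporting the missing IDs as well as the counts would show which objects to investigate when the numbers differ.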