Why?

By default, Fedora automatically creates a new version of any resource changed within a transaction. While this auto-versioning feature provides robust protection against information loss, it can also cause OCFL objects to become bloated as they accumulate more and more versions. Some users may wish to remove old versions to make better use of limited or costly storage space. Manual versioning (which is currently supported) gives users enhanced control over what goes into a version. Squashing, the ability to combine several versions into a single version, gives users the ability to manipulate the contents of a version retrospectively.

Proposed Implementation

  1. Create a staging OCFL repository for constructing squashed objects
  2. Create the object in the staging repository
  3. Copy over each version of the original object into the staged object up until the beginning of the squash by replaying the changes that were made in each version
  4. Diff the squashed changes. For example, if you're squashing versions v4 through v8, then you'd create a diff between v3 and v8. This diff represents the changes that should be applied to the object to create the squashed v4
  5. Create the squashed version by applying the diff
  6. Proceed to copy over any remaining versions in the same manner as in step 3
  7. Validate the newly formed object
  8. Purge the original object
  9. Import the staged object
  10. Reindex
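
The core of steps 3 through 6 can be sketched on in-memory data. This is only an illustration under stated assumptions, not the proposed implementation: each version is modelled as a plain mapping from logical path to content digest, and the names `diff_states`, `apply_diff` and `squash` are hypothetical helpers, not part of Fedora or ocfl-java. Staging, validation, purge, import and reindexing (steps 1, 2 and 7 through 10) are left out.

```python
from typing import Dict, List, Optional

# Assumption for this sketch: a version's state is a map of
# logical path -> content digest.
State = Dict[str, str]
# A diff maps a path to its new digest, or to None if the path was removed.
Diff = Dict[str, Optional[str]]


def diff_states(old: State, new: State) -> Diff:
    """Return the changes that turn `old` into `new`."""
    changes: Diff = {}
    for path, digest in new.items():
        if old.get(path) != digest:
            changes[path] = digest      # added or modified
    for path in old:
        if path not in new:
            changes[path] = None        # removed
    return changes


def apply_diff(state: State, changes: Diff) -> State:
    """Return a new state with `changes` applied to `state`."""
    result = dict(state)
    for path, digest in changes.items():
        if digest is None:
            result.pop(path, None)
        else:
            result[path] = digest
    return result


def squash(versions: List[State], first: int, last: int) -> List[State]:
    """Squash versions first..last (1-based, inclusive) into one version."""
    if not 1 <= first <= last <= len(versions):
        raise ValueError("invalid squash range")

    staged: List[State] = []
    current: State = {}

    # Step 3: replay every version that precedes the squash range.
    for state in versions[: first - 1]:
        current = apply_diff(current, diff_states(current, state))
        staged.append(current)

    # Steps 4 and 5: diff the version before the range against the last
    # version in the range, then apply that diff as one squashed version.
    current = apply_diff(current, diff_states(current, versions[last - 1]))
    staged.append(current)

    # Step 6: replay the versions that follow the squash range.
    for state in versions[last:]:
        current = apply_diff(current, diff_states(current, state))
        staged.append(current)

    return staged
```

For example, squashing v2 through v4 of a five-version object yields three versions: the original v1, one squashed version whose state equals the old v4, and the old v5 renumbered as v3.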

Notes

The only feature ocfl-java currently lacks to support this is the ability to generate a diff between versions, but this would not be hard to add.

The advantage of building the squashed object in a separate repository is that you can reuse the same object id and do not need to rewrite the inventories later.
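
To give a sense of how small the missing version diff could be, here is a sketch that works directly on the shape of an OCFL inventory version state, which maps each content digest to a list of logical paths. It is not ocfl-java code: the function names and the added/removed/modified grouping are assumptions made for illustration.

```python
from typing import Dict, List

# Shape of an OCFL inventory version "state": digest -> logical paths.
InventoryState = Dict[str, List[str]]


def paths_to_digests(state: InventoryState) -> Dict[str, str]:
    """Invert a version state into logical path -> digest."""
    return {path: digest for digest, paths in state.items() for path in paths}


def diff_versions(old: InventoryState, new: InventoryState) -> Dict[str, List[str]]:
    """Group logical paths by how they changed between two version states."""
    old_paths = paths_to_digests(old)
    new_paths = paths_to_digests(new)
    return {
        "added": sorted(p for p in new_paths if p not in old_paths),
        "removed": sorted(p for p in old_paths if p not in new_paths),
        "modified": sorted(
            p for p in new_paths
            if p in old_paths and old_paths[p] != new_paths[p]
        ),
    }
```

Because the comparison uses digests only, it never has to read file content; a rename shows up as one removed path and one added path that share a digest.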