Info: AIP Backup & Restore Documentation

The official AIP Backup & Restore Documentation is available at: AIP Backup and Restore. This page is only for discussing the existing feature and how to improve it.

AIP Backup & Restore Discussion

This page serves as a Discussion page for the AIP Backup and Restore feature, first released in DSpace 1.7.0.

Below you'll find notes on the decisions made while developing this feature.

Feel free to add your own notes!

Notes on Development

What code has really changed (as of DSpace 1.7)?

The majority of the code changes are in two main areas:

  1. org.dspace.content.packager.* - Packager classes
    • PackageIngester interface - Now ingests 'java.io.File' objects instead of InputStreams (to better support recursive imports of Communities/Collections)
    • PackageDisseminator interface - Now exports 'java.io.File' objects instead of OutputStreams (to better support recursive exports of Communities/Collections)
    • DSpaceAIPDisseminator - Disseminates/Exports AIP(s)
    • DSpaceAIPIngester - Ingests exported AIP(s)
    • Changes were also made to refactor / enhance the AbstractMETSDisseminator, AbstractMETSIngester, and METSManifest classes
  2. org.dspace.content.crosswalk.*
    • AIPDIMCrosswalk - Crosswalks DIM metadata for AIPs
    • AIPTechMDCrosswalk - Crosswalks METS TechMD sections for AIPs
    • There were also changes to the MODSDisseminationCrosswalk and XSLTDisseminationCrosswalk to support creating "Site" AIPs
Note: For More Information

For a full list of code changes (including patches) see: AipCoreAPIChanges

Warning For Developers

Because of the changes to the PackageIngester and PackageDisseminator interfaces, if you've created any local Packagers at your institution, those will need to be refactored.
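
One possible way to approach that refactoring is to wrap existing stream-based logic in a thin File-based adapter. The sketch below is only an illustration: StreamIngester, FileIngester and adapt are hypothetical stand-ins invented for this example and are NOT the real PackageIngester / PackageDisseminator signatures, so check the actual interfaces in org.dspace.content.packager before adapting your code.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Hypothetical sketch: none of these types are the real DSpace interfaces.
class LegacyPackagerAdapterSketch {

    /** Stand-in for an older local packager whose logic reads from a stream. */
    interface StreamIngester {
        String ingest(InputStream in) throws IOException;
    }

    /** Stand-in for the newer style of contract, which receives a java.io.File. */
    interface FileIngester {
        String ingest(File pkgFile) throws IOException;
    }

    /** Wraps stream-based logic so that it can be called with a File. */
    static FileIngester adapt(StreamIngester legacy) {
        return pkgFile -> {
            // Open the package file and hand the stream to the existing logic;
            // try-with-resources closes the stream even if ingest throws.
            try (InputStream in = new FileInputStream(pkgFile)) {
                return legacy.ingest(in);
            }
        };
    }

    public static void main(String[] args) throws IOException {
        // Toy "legacy" logic: return the package contents as a String.
        StreamIngester legacy =
                in -> new String(in.readAllBytes(), StandardCharsets.UTF_8);

        // Write a small temporary file to act as the package.
        File pkg = File.createTempFile("aip-sketch", ".txt");
        pkg.deleteOnExit();
        Files.write(pkg.toPath(), "example package".getBytes(StandardCharsets.UTF_8));

        System.out.println(adapt(legacy).ingest(pkg));
    }
}
```

An adapter like this keeps the existing parsing code untouched while the entry point moves to File objects. A full refactoring would presumably go further, since the File-based interfaces exist to support recursive imports/exports of Communities and Collections.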

To-Do List – What remains to be done!

Testing Special Cases during Restore/Replace

The special cases below need further testing, especially when performing a "Restore" or "Replace". These are mostly notes for Tim (and other developers) to ensure that each of these "edge" cases is restored properly, or is deliberately left unrestored if the decision is made that it does not need to be restored.

As each special case is implemented, we can check off the item in the below list. Special cases which have been fully tested & implemented are marked with a (tick). Feel free to add more special cases to this listing, if we missed anything.

Anything not marked with a (tick) either is not working or has not yet been fully tested! If you test it and it works, let us know, so we can check it off the list!

Item Restoration/Replacement

Special Cases

  • (tick) Restore existing Deposit License from AIP – i.e. do not add a new license (or change the license) during restore/replace
  • (tick) Restore existing CC License(s)
  • (tick) Restore item mappings to multiple collections (for items which are mapped to several collections)
  • (tick) Restore withdrawal state
  • (tick) Restore embargo state
  • (tick) Restore permissions & roles (user/group permissions) on Items, Bundles & Bitstreams
  • (tick) Restoring metadata in a custom Metadata Field (i.e. non-default "dc" field)
  • (tick) Restoring metadata in a custom Metadata Schema (i.e. not "dc").
    • Note: Schema must be created manually, but after that, the fields will be auto-created and auto-restored.
  • (tick) Restore item having no bitstreams and/or no bundles.
  • Options to restore just metadata or just particular bitstreams/bundles?
    • (tick) Exists on export, but not yet on import.
  • Items which have not reached the "archived" state will not be restored. At this time, there are no plans to restore items which are still in an approval workflow (WorkflowItems) or which are unfinished submissions (WorkspaceItems). WorkspaceItems and WorkflowItems are never exported as AIPs.

Collection Restoration/Replacement

Special Cases

  • (tick) Restore permissions & roles (user/group permissions) on Collections
    • (tick) Restore Workflow approval groups
  • (tick) Restore Collection-specific license
  • (tick) Restore Collection's Item Template?
  • Restore Collection's content source info? (e.g. OAI-Harvesting Collections versus normal Collections)

Community Restoration/Replacement

Special Cases

  • (tick) Restore permissions & roles (user/group permissions) on Communities

Admin UI work

As part of the CurationTaskProposal (led by Richard Rodgers & MIT), a new Curation Framework is in the works. This Curation Framework will initially have a command-line interface. However, the goal for 1.7 is to also have Administrative UI tools that can kick off various "curation tools". Among these curation tools will be the ability to export/import AIPs via the Admin UI.

Notes on AIP ingest speed & improving it

Some very basic ingestion speed tests were performed on a set of 26 AIPs (which represented a Community containing a Collection containing 24 Items). These tests found that, by default, the parsing/ingest settings are currently not optimized for speed.

Here are the basic (non-scientific) results

Discussion / Use Cases

Please add your own potential use cases or discussion topics.

  • MIT Use Cases - Notes on defining common operations in a replication system.

Questions / Comments?

Questions or comments? Either add them inline above, or contact Tim Donohue.