DuraCloud Backup & Restore Prototype for DSpace 1.6

Background & Overview

This work comes out of the requirements for DSpace integration with DuraCloud (http://www.duraspace.org/duracloud.php). One of these requirements is the ability to "back up" local DSpace content into the cloud (as a type of offsite backup) and to "restore" that content at a later time.

...

The current plan is to build on the subset of the AipPrototype (essentially the packagers, crosswalks and related changes) that begins to allow this round-tripping of Communities and Collections.

Makeup and Definition of AIPs

AIPs are Archival Information Packages.

  • An AIP is a package describing one archival object.
    • An archival object may be an Item, Collection, or Community. Bitstreams are included in an Item's AIP.
    • Each AIP is logically self-contained and can be restored without the rest of the archive. (So you could restore a single Item, Collection or Community.)
    • AIP profile favors completeness and accuracy rather than presenting the semantics of an object in a standard format. It conforms to the quirks of DSpace's internal object model rather than attempting to produce a universally understandable representation of the object.
    • An AIP can serve as a DIP (Dissemination Information Package) or SIP (Submission Information Package), especially when transferring custody of objects to another DSpace implementation.
  • In contrast to a SIP or DIP, the AIP should include all available DSpace structural and administrative metadata, and basic provenance information.
  • Restoration of an archive from AIPs is not perfectly complete at this time; it is intended to recover from catastrophic loss of content and metadata, not restore the exact same archive as before. Currently, some information (e.g. access controls, people, groups) would be lost, as they are not stored in the AIPs.

AIPs Structure

Generally speaking, an AIP is a Zip file containing a METS manifest and all related content bitstreams.
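To make the "Zip file with a manifest plus bitstreams" layout concrete, here is a minimal Python sketch that builds a toy AIP in memory and separates the manifest from the content entries. The manifest name "mets.xml" and the bitstream file names are assumptions made for illustration only; they are not taken from this page, and the prototype's actual naming may differ.

```python
import io
import zipfile

# Assumption: the METS manifest is stored under this entry name.
MANIFEST_NAME = "mets.xml"


def split_aip(aip_bytes):
    """Return (manifest_name_or_None, other_entry_names) for an AIP Zip."""
    with zipfile.ZipFile(io.BytesIO(aip_bytes)) as zf:
        names = zf.namelist()
    manifest = MANIFEST_NAME if MANIFEST_NAME in names else None
    bitstreams = [n for n in names if n != MANIFEST_NAME]
    return manifest, bitstreams


def build_toy_aip():
    """Build a tiny stand-in AIP (hypothetical contents) entirely in memory."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(MANIFEST_NAME, "<mets/>")          # placeholder manifest
        zf.writestr("bitstream_1.pdf", b"%PDF-toy")    # placeholder content
        zf.writestr("bitstream_2.txt", b"hello")       # placeholder content
    return buf.getvalue()


if __name__ == "__main__":
    manifest, bitstreams = split_aip(build_toy_aip())
    print("manifest:", manifest)
    print("bitstreams:", bitstreams)
```

The same `split_aip` helper could be pointed at the bytes of a real exported AIP to check quickly whether a manifest is present, provided the assumed manifest name matches.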

...

  • DSpace Groups, EPeople and Policies (access rights) are currently not described in AIPs. However, there is hope to include them in a future version.
  • DSpace Site configurations ([dspace]/config/ directory) or customizations are not described in AIPs
  • DSpace Database model (or customizations therein) is not described in AIPs

Where to get the Code

There is an SVN sandbox area for this work (so that others can help out, if it interests them). If anyone has comments, suggestions or feedback on this idea, or would like to be involved in this project, definitely let me know (or add comments to this issue).

 svn co http://scm.dspace.org/svn/repo/sandbox/aip-external-1_6-prototype/ 

Running the Code

Here's how to get up and running relatively quickly!

Install Prototype

  1. Download the code from the SVN Sandbox (see above).
  2. Build & Install the prototype. This is just a modified version of DSpace 1.6.0 – so, follow the normal DSpace 1.6.0 Installation procedure.
    • If you have a DSpace 1.6.0 instance already running, you can just build the code and point it at your existing DSpace 1.6.0 database & assetstore.

You'll want to have some content (Communities, Collections & Items) to test with!

Exporting AIPs

There are two main "modes" you can run the AIP packager in:

  • Single AIP (default) - Exports just an AIP describing a single DSpace object. So, if you ran it in this default mode for a Collection, you'd just end up with a single Collection AIP (which would not include AIPs for all its child Items)
  • Hierarchy (including child objects) - Exports the requested AIP describing an object, plus the AIP for all child objects. Some examples follow:
    • For a Site - this would export all Communities, Collections & Items within the site into AIP files (in a provided directory)
    • For a Community - this would export that Community and all SubCommunities, Collections and Items into AIP files (in a provided directory)
    • For a Collection - this would export that Collection and all contained Items into AIP files (in a provided directory)
    • For an Item – this just exports the Item into an AIP as normal (as it already contains its Bitstreams/Bundles by default)
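The difference between the two modes can be sketched as a small tree walk. This is only a conceptual model in Python, not the packager's implementation: the `DSpaceObject` class, the `objects_to_export` function and the handles are all hypothetical names invented for this example. It also assumes the requested object itself is always exported; whether that holds for a Site is not stated above.

```python
from dataclasses import dataclass, field


@dataclass
class DSpaceObject:
    """Toy stand-in for a Community, Collection or Item (hypothetical)."""
    kind: str
    handle: str
    children: list = field(default_factory=list)


def objects_to_export(obj, hierarchy=False):
    """List the handles that would each get their own AIP.

    Single mode (default): only the requested object.
    Hierarchy mode: the requested object plus every descendant.
    """
    handles = [obj.handle]
    if hierarchy:
        for child in obj.children:
            handles.extend(objects_to_export(child, hierarchy=True))
    return handles


if __name__ == "__main__":
    # Hypothetical handles; Items have no children here because their
    # Bitstreams are already contained in the Item's own AIP.
    item_a = DSpaceObject("Item", "123456789/3")
    item_b = DSpaceObject("Item", "123456789/4")
    collection = DSpaceObject("Collection", "123456789/2", [item_a, item_b])
    community = DSpaceObject("Community", "123456789/1", [collection])

    print(objects_to_export(collection))                  # single mode
    print(objects_to_export(community, hierarchy=True))   # hierarchy mode
```

In this model an Item yields the same result in both modes, which mirrors the last bullet above.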

Exporting just a single AIP

To export in single AIP mode (default), use this 'packager' command template:

...

The above code will export the object of the given handle (1721.1/4567) into an AIP file named "aip4567.zip". This will not include any child objects for Communities or Collections.

Exporting AIP Hierarchy