...

Replication Task Used:

Estimate Storage Space for AIP(s) (Task ID: estaipsize)

In order to budget for replication storage, she needs to know the 'size' of the collection. When she asks her sysadmin, he replies that it is easy to give her figures for the whole asset store, but since collections aren't stored separately, she would have to add up each item's bitstreams in the collection, a rather tedious process. Thus the first task: a reporting tool which operates on natural DSpace objects, rather than storage volumes.
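The manual procedure the sysadmin describes amounts to a sum over the object hierarchy. The following is a minimal sketch of that arithmetic, assuming a hypothetical in-memory model: the `Bitstream`, `Item`, and `Collection` classes and the sample sizes are illustrative only, and are neither the DSpace API nor the implementation of the estaipsize task.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Bitstream:
    name: str
    size_bytes: int


@dataclass
class Item:
    handle: str
    bitstreams: List[Bitstream] = field(default_factory=list)


@dataclass
class Collection:
    name: str
    items: List[Item] = field(default_factory=list)


def collection_size(collection: Collection) -> int:
    """Sum the sizes of every bitstream of every item in the collection."""
    return sum(b.size_bytes for item in collection.items for b in item.bitstreams)


# Hypothetical sample data: two items with three bitstreams in total.
amazing_images = Collection(
    name="Amazing Images",
    items=[
        Item("123456789/1", [Bitstream("a.tif", 1000), Bitstream("a.jpg", 200)]),
        Item("123456789/2", [Bitstream("b.tif", 3000)]),
    ],
)
total = collection_size(amazing_images)
```

The sketch counts raw bitstream bytes only. An actual AIP also carries metadata and packaging, so its size may differ from this figure.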

...

Replication Task Used:

Transmit AIP(s) to Storage (Task ID: transmitaip)

Having secured approval to replicate the 'Amazing Images' collection, our curator needs a task to generate the AIP representation of each item in the collection and transmit these archive files to the replication storage site (which may be service-backed, local, in the cloud, etc., as will be explored below). Adding this task is just like the previous step: add its configuration properties to curate.cfg. (We won't repeat a description of this process each time, but note that you may always add a task yet elect not to display it in the administrative UI.) This task is 'org.dspace.ctask.replicate.TransmitAIP'.

...

Replication Task Used:

Verify AIP(s) exist in Storage (Task ID: verifyaip)

While the transmitAIP task will report on whether or not it was successful in generating and transmitting AIP(s) to the replication service, our data curator wants the ability (within DSpace, not by using the replication service tools or UIs) to check whenever she likes that the AIP(s) which were transmitted are still there. A simple task 'org.dspace.ctask.replicate.VerifyAIP' can perform this function.

...

Replication Task Used:

Audit against AIP(s) (Task ID: auditaip)

The 'Amazing Images' collection is comparatively static, meaning that few new items are likely to be added, and most of the metadata in each item is not routinely changed. However, over longer periods of time, cataloging errors are discovered and corrected, and perhaps formats become obsolete and new bitstreams are added. If the curator is fastidious about each change, and performs the 'transmitaip' task on each item that has changed, then in general the set of AIP replicas will always be 'in sync' with the repository. However, it is useful to have the means to ensure that the replicas agree with the repository without having to create and transmit entirely new ones. Thus the task: 'org.dspace.ctask.replicate.CompareWithAIP', which can also be thought of as a simple audit task. When performed on an Item, the task does the following:

...

Replication Tasks Used:

Restore Missing Object(s) from AIP(s) (Task ID: restorefromaip)

Replace Existing Object(s) with AIP(s) (Task ID: replacewithaip)

Restore Missing Object(s) but Keep Existing Objects (Task ID: restorekeepexisting) (*METS-AIP)

Restore Single Object from AIP (Task ID: restoresinglefromaip) (*METS-AIP)

Replace Single Object with AIP (Task ID: replacesinglewithaip) (*METS-AIP)

NOTE: Those tasks marked (*METS-AIP) are only supported

...

when using METS-based AIPs

...

The AIPs in the replica store represent an insurance policy, and when 'claims' against that policy are filed, they can cover two situations: either the repository object is completely missing and we want to restore it, or it is damaged and we want to repair the damage with data from the replica store AIP. A set of replication tasks perform these functions: 'org.dspace.ctask.replicate.RecoverFromAIP'

Restoring Object(s)

The "Restore" (restorefromaip) task will do the following:

  • fetch the replica store AIP for the identifier given to the task
  • decompress it and create a new DSpace object
  • install the object into the repository, including restoring its state (withdrawn, embargoed, etc.)
  • if the object is a collection or community, all child objects (e.g. items) will also have their AIP fetched, decompressed and restored

This task will fail if there is already an object in the repository bearing the identifier given.

If you are using METS-based AIPs, two additional restoration tasks are available:

  1. Restore Single Object from AIP (restoresinglefromaip)
    • This task acts the same as the default "restorefromaip" task, but it does NOT restore any child objects. So, if it is run on a collection, just the collection itself will be restored (items in that collection will not be restored).
  2. Restore Missing Object(s) but Keep Existing Objects (restorekeepexisting)
    • This task acts similarly to the default "restorefromaip" task, but it attempts to skip over any objects which already exist in the repository. In other words, an error is not thrown if an object already exists; rather, that entire object (and all its child objects) is skipped over during processing and left unchanged. This mode is identical to the "Keep Existing" mode of the DSpace AIP Backup and Restore tool.
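The differences between the default restore and its two METS-only variants can be sketched as two switches on one recursive routine. This is an illustrative model only: the `restore` function, its `recurse` and `keep_existing` parameters, the dictionary-based replica store and repository, and the sample identifiers are all assumptions made for the example, not the replication task suite's code.

```python
from typing import Dict, List


def restore(identifier: str, store: Dict[str, dict], repo: Dict[str, dict],
            recurse: bool = True, keep_existing: bool = False) -> List[str]:
    """Restore an object from its AIP; return the identifiers actually restored."""
    if identifier in repo:
        if keep_existing:
            # "Keep existing": skip this object and its whole subtree.
            return []
        raise ValueError(f"object {identifier} already exists in the repository")
    aip = store[identifier]  # stands in for fetching and decompressing the AIP
    # Install the object, including its recorded state.
    repo[identifier] = {"type": aip["type"], "state": aip["state"]}
    restored = [identifier]
    if recurse:
        for child in aip.get("children", []):
            restored.extend(restore(child, store, repo, recurse, keep_existing))
    return restored


# Hypothetical replica store: one collection with two items.
store = {
    "123/10": {"type": "collection", "state": "archived",
               "children": ["123/11", "123/12"]},
    "123/11": {"type": "item", "state": "withdrawn"},
    "123/12": {"type": "item", "state": "archived"},
}

# Behaves like restorefromaip: the collection and all of its items.
repo_full: Dict[str, dict] = {}
full = restore("123/10", store, repo_full)

# Behaves like restoresinglefromaip: the collection only.
repo_single: Dict[str, dict] = {}
single = restore("123/10", store, repo_single, recurse=False)

# Behaves like restorekeepexisting: item 123/11 is already present and is left alone.
repo_keep = {"123/11": {"type": "item", "state": "archived"}}
kept = restore("123/10", store, repo_keep, keep_existing=True)
```

In this model, calling `restore` with the defaults on an identifier that is already in the repository raises an error, mirroring the failure described for the default task.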

Replacing Object(s)

By contrast, the task 'org.dspace.ctask.replicate.ReplaceWithAIP' (the 'repair' task) expects an existing repository object, and will fail if it does not find one. This task simply 'overlays' the metadata and bitstreams of the AIP version onto the existing record.
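The 'overlay' behaviour can be illustrated with a small sketch. Everything here is hypothetical: the `replace_with_aip` function, the dictionary layout, and the `internal_id` field are assumptions chosen to show the idea, not the behaviour of the actual ReplaceWithAIP class.

```python
from typing import Dict


def replace_with_aip(identifier: str, store: Dict[str, dict],
                     repo: Dict[str, dict]) -> dict:
    """Overlay the AIP's metadata and bitstreams onto an existing object."""
    if identifier not in repo:
        raise LookupError(f"no existing object {identifier} to replace")
    aip = store[identifier]
    existing = repo[identifier]
    # Only metadata and bitstreams are overwritten; other fields are untouched.
    existing["metadata"] = dict(aip["metadata"])
    existing["bitstreams"] = list(aip["bitstreams"])
    return existing


# Hypothetical data: the repository copy has a damaged title and a lost bitstream.
store = {
    "123/11": {"metadata": {"title": "Sunrise"}, "bitstreams": ["sunrise.tif"]},
}
repo = {
    "123/11": {"internal_id": 42, "metadata": {"title": "Sunrlse"}, "bitstreams": []},
}
repaired = replace_with_aip("123/11", store, repo)
```

The record keeps its place in the repository (here, the illustrative `internal_id`), which is the contrast with a restore that creates a new object.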

...