DuraCloud Backup & Restore Prototype for DSpace 1.6

Background & Overview

This comes out of a requirement for DSpace integration with DuraCloud (http://www.duracloud.org). One of these requirements is to be able to essentially "backup" local DSpace contents into the cloud (as a type of offsite backup), and "restore" those contents at a later time.

Essentially, we'd like a way to be able to export the entire hierarchy (i.e. bitstreams, metadata and relationships between Communities/Collections/Items) into a relatively standard format (e.g. METS or similar structured packaging format). This entire hierarchy should also be able to be re-imported into DSpace in the same format, to allow for "roundtripping" of that content (essentially a restore of that content in the same or different DSpace installation).

Perceived benefits to DSpace community:

  • Would allow folks to more easily move entire Communities or Collections between DSpace instances.
  • Would allow for a potentially more consistent backup of this hierarchy (e.g. to DuraCloud, or just to your own local backup system), rather than relying on synchronizing a backup of your DB (metadata/relationships) and assetstore (bitstreams).
  • Would provide a way for people to more easily get their data out of DSpace (whatever the purpose may be).
  • Would provide a relatively standard format for people to migrate entire hierarchies (Communities/Collections) into DSpace (from another system).

Known Issues:

  • Exporting/Importing the Community/Collection/Item hierarchy technically doesn't cover all the "content" held in DSpace. There are also Groups, EPeople and permissions/rights (which would get you closer to a full export/import of all DSpace content). However, concentrating on just the hierarchy of Community/Collection/Item seems like a good first step.

This is related to (and a partial subset of) MIT's AipPrototype: http://jira.dspace.org/jira/browse/DS-465

However, the AIP prototype currently does not make it very easy to re-import the exported AIPs for Communities or Collections. So, this feature would extend the AIP prototype's current packagers/crosswalks to allow for a full export and import of an entire DSpace hierarchy, or just a set of Communities, Collections or Items.

The current plan is to build off of the subset of the AipPrototype (essentially the packagers, crosswalks and related changes) which begins to allow for this roundtripping of Communities and Collections.

How does this work help DSpace interact with DuraCloud?

In this initial prototype, this work is entirely about exporting DSpace content objects to a location on a local filesystem. So, this work is not tied solely to DuraCloud, and could be used by any backup storage system to back up your DSpace contents.

In the initial DuraCloud work, the DuraCloud team is working on a way to "synchronize" DuraCloud with a local file folder. So, DuraCloud can be configured to "watch" a given folder and automatically replicate its contents into the cloud.

Therefore, moving content from DSpace to DuraCloud would currently be a two-step process:

  1. First, export AIPs describing that content from DSpace to a filesystem folder.
  2. Second, enable DuraCloud to watch that same filesystem folder and replicate it into the cloud.

Similarly, moving content from DuraCloud back into DSpace would also be a two-step process:

  1. First, you'd tell DuraCloud to replicate the AIPs from the cloud to a folder on your file system.
  2. Second, you'd ingest those AIPs back into DSpace.

(These backup/restore processes may change as we go forward and investigate more use cases. This is just the initial plan.)

Makeup and Definition of AIPs

AIPs are Archival Information Packages.

  • AIP is a package describing one archival object.
    • Archival object may be Item, Collection, or Community. Bitstreams are included in an Item's AIP.
    • Each AIP is logically self-contained and can be restored without the rest of the archive. (So you could restore a single Item, Collection or Community.)
    • AIP profile favors completeness and accuracy rather than presenting the semantics of an object in a standard format. It conforms to the quirks of DSpace's internal object model rather than attempting to produce a universally understandable representation of the object.
    • An AIP can serve as a DIP (Dissemination Information Package) or SIP (Submission Information Package), especially when transferring custody of objects to another DSpace implementation.
  • In contrast to SIP or DIP, the AIP should include all available DSpace structural and administrative metadata, and basic provenance information.
  • Restoration of an archive from AIPs is not perfectly complete at this time; it is intended to recover from catastrophic loss of content and metadata, not restore the exact same archive as before. Currently, some information (e.g. access controls, people, groups) would be lost, as they are not stored in the AIPs.

AIPs Structure

Generally speaking, an AIP is a Zip file containing a METS manifest and all related content bitstreams.

Some examples include:

  • Site AIP (Sample: aip0-site.zip)
    • METS contains basic metadata about DSpace Site and persistent IDs referencing all Top Level Communities
  • Community AIP (Sample: COLLECTION@123456789-2.zip)
    • METS contains all metadata for Community and persistent IDs referencing all members (SubCommunities or Collections). Package may also include a Logo file, if one exists.
  • Collection AIP (Sample: COLLECTION@123456789-2.zip)
    • METS contains all metadata for Collection and persistent IDs referencing all members (Items). Package may also include a Logo file, if one exists.
  • Item AIP (Sample: ITEM@123456789-8.zip)
    • METS contains all metadata for Item and references to all Bundles and Bitstreams. Package also includes all Bitstream files.

Notes:

  • Bitstreams and Bundles are second-class archival objects; they are recorded in the context of an Item.
  • BitstreamFormats are not even second-class; they are described implicitly within Item technical metadata, and reconstructed from that during restoration

What is NOT in AIPs

  • DSpace Groups, EPeople and Policies (access rights) are currently not described in AIPs. However, there is hope to include them in a future version.
  • DSpace Site configurations ([dspace]/config/ directory) or customizations are not described in AIPs.
  • DSpace Database model (or customizations therein) is not described in AIPs.

Where to get the Code

There is an SVN sandbox area for this work (so that others can help out, if it interests them). If anyone has comments, suggestions or feedback on this idea, or would like to be involved in this project, definitely let me know (or add comments to this wiki page).

{code}
svn co http://scm.dspace.org/svn/repo/sandbox/aip-external-1_6-prototype/
{code}

What code has really changed?

The majority of the code changes are in two main areas:

  1. org.dspace.content.packager.* - Packager classes
    • DSpaceAIPDisseminator - Disseminates/Exports AIP(s)
    • DSpaceAIPIngester - Ingests exported AIP(s)
    • Changes were also made to refactor / enhance the AbstractMETSDisseminator, AbstractMETSIngester, and METSManifest classes
  2. org.dspace.content.crosswalk.*
    • AIPDIMCrosswalk - Crosswalks DIM metadata for AIPs
    • AIPTechMDCrosswalk - Crosswalks METS TechMD sections for AIPs
    • There were also changes to the MODSDisseminationCrosswalk and XSLTDisseminationCrosswalk to support creating "Site" AIPs

For a full list of code changes see: AipCoreAPIChanges

Various AIP Processing "Modes"

A short list of definitions of the various modes in which you can interact (export or import) with an AIP or set of AIPs:

  1. Exporting Modes
    • Disseminate Mode (-d option) – disseminate/export AIP(s) based on objects in DSpace
  2. Importing Modes
    • Submit/Ingest Mode (-s option, default) – submit AIP(s) to DSpace in order to create a new object(s).
    • Restore Mode (-r option) – restore pre-existing object(s) in DSpace based on AIP(s). This also attempts to restore all handles and relationships (parent/child objects). This is a specialized type of "submit", where the object is created with a known Handle and known relationships.
    • Replace Mode (-r -f option) – replace existing object(s) in DSpace based on AIP(s). This also attempts to restore all handles and relationships (parent/child objects). This is a specialized type of "restore" where existing object(s) are removed and replaced by the contents in the AIP(s). By default, if a "restore" encounters an existing object, it will back out (i.e. rollback all changes) and report which object already exists.
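
The flag combinations listed above can be collected into a small lookup helper. This is only a sketch: the function name `packager_mode_flags` and the mode keywords (`disseminate`, `submit`, `restore`, `replace`) are hypothetical labels chosen for illustration; only the flags themselves come from the list above.

```shell
# Hypothetical helper: print the packager flag(s) for a named AIP processing mode.
# The mode keywords are illustrative; the flags are the ones listed above.
packager_mode_flags() {
    case "$1" in
        disseminate) printf '%s\n' '-d' ;;      # export AIP(s) from objects in DSpace
        submit)      printf '%s\n' '-s' ;;      # ingest AIP(s) as new object(s)
        restore)     printf '%s\n' '-r' ;;      # restore with known Handle and relationships
        replace)     printf '%s\n' '-r -f' ;;   # restore, removing existing object(s) first
        *)
            printf 'unknown mode: %s\n' "$1" >&2
            return 1
            ;;
    esac
}
```

When building a command line from it, the expansion would be left unquoted (e.g. `$(packager_mode_flags replace)`) so that `-r -f` is split into two separate arguments.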

Running the Code

Here's how to get up and running relatively quickly!

Install Prototype

  1. Download the code from the SVN Sandbox (see above).
  2. Build & Install the prototype. This is just a modified version of DSpace 1.6.0 – so, follow the normal DSpace 1.6.0 Installation procedure.
    • If you have a DSpace 1.6.0 instance already running, you can just build the code and point it at your existing DSpace 1.6.0 database & assetstore.

You'll want to have some content (Communities, Collections & Items) to test with!

Exporting AIPs

There are two main "modes" you can run the AIP packager in:

  • Single AIP (default) - Exports just an AIP describing a single DSpace object. So, if you ran it in this default mode for a Collection, you'd just end up with a single Collection AIP (which would not include AIPs for all its child Items)
  • Hierarchy (including child objects) - Exports the requested AIP describing an object, plus the AIP for all child objects. Some examples follow:
    • For a Site - this would export all Communities, Collections & Items within the site into AIP files (in a provided directory)
    • For a Community - this would export that Community and all SubCommunities, Collections and Items into AIP files (in a provided directory)
    • For a Collection - this would export that Collection and all contained Items into AIP files (in a provided directory)
    • For an Item – this just exports the Item into an AIP as normal (as it already contains its Bitstreams/Bundles by default)

Exporting just a single AIP

To export in single AIP mode (default), use this 'packager' command template:

{code}
/dspace/bin/dspace packager -d -t AIP -e <eperson> -i <handle> <file-path>
{code}

For example:

{code}
/dspace/bin/dspace packager -d -t AIP -e admin@myu.edu -i 4321/4567 aip4567.zip
{code}

The above code will export the object of the given handle (4321/4567) into an AIP file named "aip4567.zip". This will not include any child objects for Communities or Collections.

Exporting AIP Hierarchy

To export an AIP hierarchy, use the -c (or --includeChildren) package parameter.

For example, use this 'packager' command template:

{code}
/dspace/bin/dspace packager -d -t AIP -e <eperson> -i <handle> \
                             -c <child-dir-path> <file-path>
{code}

For example:

{code}
/dspace/bin/dspace packager -d -t AIP -e admin@myu.edu -i 4321/4567 \
                             -c /path/to/children-aips/ aip4567.zip
{code}

The above code will export the object of the given handle (4321/4567) into an AIP file named "aip4567.zip". In addition, it would export all child objects to a directory at the path "/path/to/children-aips/". The child AIPs are all named using the following format:

  • File Name Format: <Obj-Type>@<Handle-with-dashes>.zip
    • e.g. COMMUNITY@123456789-1.zip, COLLECTION@123456789-2.zip, ITEM@123456789-200.zip
    • This general file naming convention ensures that you can easily locate an object to restore by its name (assuming you know its Object Type and Handle).
  • Alternatively, if an object doesn't have a Handle, it uses this File Name Format: <Obj-Type>@internal-id-<DSpace-ID>.zip (e.g. ITEM@internal-id-234.zip)
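
The naming convention above can be sketched as a pair of shell functions, which may help when looking for a particular child AIP in the export directory. The function names are hypothetical, and the sketch assumes that "Handle-with-dashes" simply means every "/" in the Handle is replaced by "-", which is what the examples above suggest.

```shell
# Hypothetical helper: derive the expected child AIP file name from an
# object type and a Handle, following <Obj-Type>@<Handle-with-dashes>.zip.
aip_filename() {
    obj_type=$1
    handle=$2
    # Assumption: each "/" in the Handle becomes "-" (123456789/200 -> 123456789-200).
    printf '%s@%s.zip\n' "$obj_type" "$(printf '%s' "$handle" | tr '/' '-')"
}

# Variant for objects without a Handle: <Obj-Type>@internal-id-<DSpace-ID>.zip
aip_filename_by_id() {
    printf '%s@internal-id-%s.zip\n' "$1" "$2"
}
```

For example, `ls "/path/to/children-aips/$(aip_filename ITEM 123456789/200)"` would check whether the AIP for that Item was exported.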

Exporting Entire Site

To export an entire DSpace Site, pass the packager the Handle <site-handle-prefix>/0. For example, if your site prefix is "4321", you'd run a command similar to the following:

{code}
/dspace/bin/dspace packager -d -t AIP -e admin@myu.edu -i 4321/0 \
                             -c /path/to/children-aips/ sitewide-aip.zip
{code}

Again, this would export the DSpace Site AIP into the file "sitewide-aip.zip", and export AIPs for all Communities, Collections and Items into the "/path/to/children-aips" directory.
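
For a recurring backup, the site-wide export could be wrapped in a small function that derives the Site Handle from the prefix. The following is a rough sketch only: `site_export_cmd` and the `DSPACE_BIN` variable are hypothetical names, and the function prints the command rather than running it so that it can be reviewed first. The child directory could be the same folder that DuraCloud (or any other backup tool) is configured to watch.

```shell
# Hypothetical wrapper: print (not run) the packager command that exports a
# whole site. DSPACE_BIN is an assumed variable; adjust it to your install.
DSPACE_BIN=${DSPACE_BIN:-/dspace/bin/dspace}

site_export_cmd() {
    prefix=$1      # site Handle prefix, e.g. 4321
    eperson=$2     # e-mail address of the EPerson running the export
    child_dir=$3   # directory that receives the child AIPs
    site_aip=$4    # file name for the Site AIP itself
    # The Site Handle is <site-handle-prefix>/0, as described above.
    printf '%s packager -d -t AIP -e %s -i %s/0 -c %s %s\n' \
        "$DSPACE_BIN" "$eperson" "$prefix" "$child_dir" "$site_aip"
}
```

Calling `site_export_cmd 4321 admin@myu.edu /path/to/children-aips/ sitewide-aip.zip` prints the same command as the example above.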

Ingesting / Restoring AIPs

Again, like export, there are two main "modes" you can run the AIP packager in:

  • Single AIP (default) - Ingests just an AIP describing a single DSpace object. So, if you ran it in this default mode for a Collection AIP, you'd just create a DSpace Collection from the AIP (but not ingest any of its child objects)
  • Hierarchy (including child objects) - Ingests the requested AIP describing an object, plus the AIP for all child objects. Some examples follow:
    • For a Site - this would create all Communities, Collections & Items based on the located AIP files
    • For a Community - this would create that Community and all SubCommunities, Collections and Items based on the located AIP files
    • For a Collection - this would create that Collection and all contained Items based on the located AIP files
    • For an Item – this just creates the Item (including all Bitstreams & Bundles) based on the AIP file.

Ingesting just a Single AIP

To ingest a single AIP and create a new DSpace object under a parent of your choice, add the ignoreParent and ignoreHandle package parameters to the command. Also, note that you are running the packager in -s (submit) mode.

NOTE: This only ingests the single AIP specified. It does not ingest all children objects.

{code}
/dspace/bin/dspace packager -s -t AIP -e <eperson> -p <parent-handle> -o ignoreParent=true -o ignoreHandle=true <file-path>
{code}

If you leave out these package-parameter options, the AIP package ingester will attempt to install the AIP under the parent handle it had before, and give it back its original Handle. After all, the point of AIPs was to reproduce the exact object that was exported. When you are effectively using the AIP as a SIP, however, you may not want it back under the same parent or handle, so there is a way to override these features.
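
The difference between the two behaviours comes down to whether the two -o options are present. A minimal sketch, assuming a hypothetical function `ingest_cmd` with the illustrative keywords `new` and `original` (neither is part of DSpace), and assuming -p is passed in both cases because submit mode expects a parent:

```shell
# Hypothetical helper: print the packager command for ingesting one AIP in
# submit (-s) mode. With "new", the parent and Handle recorded in the AIP are
# ignored; with "original", the -o options are left out so the ingester tries
# to reuse the recorded parent and Handle.
ingest_cmd() {
    placement=$1   # "new" or "original" (illustrative keywords)
    eperson=$2     # e-mail address of the EPerson running the ingest
    parent=$3      # parent Handle passed to -p
    aip_file=$4    # AIP file to ingest
    opts=''
    if [ "$placement" = new ]; then
        opts=' -o ignoreParent=true -o ignoreHandle=true'
    fi
    printf '/dspace/bin/dspace packager -s -t AIP -e %s -p %s%s %s\n' \
        "$eperson" "$parent" "$opts" "$aip_file"
}
```

The command is only printed, not executed, so it can be checked before anything is ingested.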

Ingesting an AIP Hierarchy

To ingest an AIP hierarchy from a directory of AIPs, use the -c (or --includeChildren) package parameter. In addition, as this is not a restore, you'd want to specify the -o ignoreParent=true parameter (ignores Parent Object information contained in the package) and the -o ignoreHandle=true parameter (ignores handle in package, and a new handle is assigned on ingest).

For example, use this 'packager' command template:

{code}
/dspace/bin/dspace packager -s -t AIP -e <eperson> -p <parent-handle> -o ignoreParent=true -o ignoreHandle=true \
                             -c <child-dir-path> <file-path>
{code}

For example:

{code}
/dspace/bin/dspace packager -s -t AIP -e admin@myu.edu -p 4321/12 -o ignoreParent=true -o ignoreHandle=true \
                             -c /path/to/children-aips/ aip4567.zip
{code}

The above command will ingest the package named "aip4567.zip" as a child of the specified Parent Object (handle="4321/12"). The resulting object is assigned a new Handle (-o ignoreHandle=true). In addition, any child AIPs referenced by "aip4567.zip" in the folder "/path/to/children-aips" are also recursively ingested (a new Handle is also assigned for each child AIP).

Another example: Ingesting a Top-Level Community (by using the Site Handle, <site-handle-prefix>/0):

{code}
/dspace/bin/dspace packager -s -t AIP -e admin@myu.edu -p 4321/0 -o ignoreParent=true -o ignoreHandle=true \
                             -c /path/to/children-aips/ community-aip.zip
{code}

The above command will ingest the package named "community-aip.zip" as a top-level community (i.e. the specified parent is "4321/0" which is a Site Handle). Again, the resulting object is assigned a new Handle (-o ignoreHandle=true). In addition, any child AIPs referenced by "community-aip.zip" in the folder "/path/to/children-aips" are also recursively ingested (a new Handle is also assigned for each child AIP).

Restoring an AIP Hierarchy

Restoring is slightly different than just re-ingesting. When restoring, we want to retain the old Handles and Parent Objects within the Hierarchy. There are currently three restore modes:

  1. Default Restore Mode (-r) = Attempt to restore object (and optionally children). Roll back all changes if any object is found to already exist.
  2. Restore, Keep Existing Mode (-r -k) = Attempt to restore object (and optionally children). If an object is found to already exist, skip over it (and all child objects), and continue to restore all other non-existing objects.
  3. Force Replace Mode (-r -f) = Restore an object (and optionally children) and overwrite any existing objects in DSpace. Therefore, if an object is found to already exist in DSpace, it is removed and then replaced by the contents of the AIP. WARNING: This mode is potentially dangerous as it will destroy existing contents in DSpace. You should always back up first!

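
Because Force Replace Mode is destructive, a wrapper script might refuse to build that command unless it is explicitly confirmed. The sketch below is one possible shape, not part of the prototype: the function name `restore_cmd`, the keywords `default`/`keep`/`force`, and the `CONFIRM_REPLACE` variable are all hypothetical, and it assumes the keep and force commands take the same remaining arguments as the default restore command.

```shell
# Hypothetical helper: print the packager command for one of the three
# restore modes. Nothing here is executed against DSpace.
restore_cmd() {
    mode=$1; eperson=$2; child_dir=$3; aip_file=$4
    case "$mode" in
        default) flags='-r' ;;       # roll back if any object already exists
        keep)    flags='-r -k' ;;    # skip objects that already exist
        force)
            # Force Replace removes existing objects, so require an explicit opt-in.
            if [ "${CONFIRM_REPLACE:-no}" != yes ]; then
                printf 'refusing force replace: set CONFIRM_REPLACE=yes after backing up\n' >&2
                return 1
            fi
            flags='-r -f'
            ;;
        *)
            printf 'unknown restore mode: %s\n' "$mode" >&2
            return 1
            ;;
    esac
    printf '/dspace/bin/dspace packager %s -t AIP -e %s -c %s %s\n' \
        "$flags" "$eperson" "$child_dir" "$aip_file"
}
```

With this guard, `restore_cmd force ...` fails unless it is called as `CONFIRM_REPLACE=yes restore_cmd force ...`, which gives a natural place to remind whoever runs it to take a backup first.
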
Default Restore Mode

By default, the restore mode (-r option) will roll back all changes if any object is found to already exist. The user will be informed which object already exists within their DSpace installation.

Use this 'packager' command template:

Code Block
 in DSpace -- you should always backup first!_


h5. Default Restore Mode

By default, the restore mode ({{-r}} option) will rollback all changes if any object is found to already exist.  The user will be informed if which object already exists within their DSpace installation.

Use this 'packager' command template:
{code} /dspace/bin/dspace packager -r -t AIP -e <eperson> \
                             -c <child-dir-path> <file-path>
{code}

For example:

{code}
/dspace/bin/dspace packager -r -t AIP -e admin@myu.edu \
                            -c /path/to/children-aips/ aip4567.zip
{code}

Notice that unlike the -s option (for submission/ingesting), the -r option does not require the Parent Object (-p option) to be specified if it can be determined from the package itself.

In the above example, the package "aip4567.zip" is restored to the DSpace installation with the Handle provided within the package itself (and added as a child of the parent object specified within the package itself). In addition, any child AIPs referenced by "aip4567.zip" in the folder "/path/to/children-aips" are also recursively ingested. They are also restored with the Handles & Parent Objects provided with their package. If any object is found to already exist, all changes are rolled back (i.e. nothing is restored to DSpace).
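The all-or-nothing outcome of the default mode can be sketched with a toy model. This is not DSpace code: the handles are made up, the function name `plan_default_restore` is hypothetical, and the check is done up front rather than by restoring and then rolling back, so it only illustrates the end result.

```shell
#!/bin/sh
# Toy model of the default restore (-r) outcome; NOT DSpace code.
packages="123/1 123/2 123/3"   # made-up handles found in the AIPs
existing="123/2"               # made-up handle already in the repository

plan_default_restore() {
    # If any packaged object already exists, nothing at all is restored.
    for handle in $packages; do
        case " $existing " in
            *" $handle "*)
                echo "rollback: $handle already exists, nothing restored"
                return 1 ;;
        esac
    done
    # No conflicts found: every object is restored.
    for handle in $packages; do
        echo "restore $handle"
    done
}

plan_default_restore || true
```

With the sample data, a single existing object is enough to stop the whole restore.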

Restore, Keep Existing Mode

When the "Keep Existing" flag (-k option) is specified, the restore will attempt to skip over any objects found to already exist. It will report to the user that the object was found to exist (and was not modified or changed). It will then continue to restore all objects which do not already exist.

One special case to note: If a Collection or Community is found to already exist, its child objects are also skipped over. So, this mode will not auto-restore items to an existing Collection.

Use this 'packager' command template:

{code}
/dspace/bin/dspace packager -r -k -t AIP -e <eperson> \
                            -c <child-dir-path> <file-path>
{code}

For example:

{code}
/dspace/bin/dspace packager -r -k -t AIP -e admin@myu.edu \
                            -c /path/to/children-aips/ aip4567.zip
{code}

In the above example, the package "aip4567.zip" is restored to the DSpace installation with the Handle provided within the package itself (and added as a child of the parent object specified within the package itself). In addition, any child AIPs referenced by "aip4567.zip" in the folder "/path/to/children-aips" are also recursively ingested. They are also restored with the Handles & Parent Objects provided with their package. If any object is found to already exist, it is skipped over (child objects are also skipped). All non-existing objects are restored.
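The skip rule, including the special case where the children of an existing container are skipped too, can be sketched as a toy model. This is not DSpace code: the handles, the "handle parent" list format, and the function name `plan_keep_existing` are all assumptions made for illustration.

```shell
#!/bin/sh
# Toy model of "Keep Existing" (-r -k) semantics; NOT DSpace code.
# Each line is "<handle> <parent-handle>", with parents listed first.
packages="123/1 root
123/2 123/1
123/3 123/2
123/4 123/1"
existing="123/2"   # made-up handle already present in the repository

plan_keep_existing() {
    skipped=" "
    echo "$packages" | while read -r handle parent; do
        hit=0
        # Skip an object that already exists...
        case " $existing " in *" $handle "*) hit=1 ;; esac
        # ...and any object whose parent was skipped.
        case "$skipped" in *" $parent "*) hit=1 ;; esac
        if [ "$hit" -eq 1 ]; then
            skipped="$skipped$handle "
            echo "skip $handle"
        else
            echo "restore $handle"
        fi
    done
}

plan_keep_existing
```

Here 123/3 is skipped only because its parent 123/2 already exists, which mirrors why this mode will not auto-restore items into an existing Collection.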

Force Replace Mode

When the "Force Replace" flag (-f option) is specified, the restore will overwrite any objects found to already exist in DSpace. In other words, existing content is deleted and then replaced by the contents of the AIP(s).

{panel}
WARNING: As this mode actually *destroys* existing content in DSpace, it is potentially dangerous and may result in content loss! You should always perform a full backup (assetstore files & database) before attempting a larger-scale replace.
{panel}

{panel}
SECOND WARNING: This doesn't 100% work yet for Communities or an entire Site! You've been warned!!! - Tim
{panel}

Use this 'packager' command template:

{code}
/dspace/bin/dspace packager -r -f -t AIP -e <eperson> \
                            -c <child-dir-path> <file-path>
{code}

For example:

{code}
/dspace/bin/dspace packager -r -f -t AIP -e admin@myu.edu \
                            -c /path/to/children-aips/ aip4567.zip
{code}

In the above example, the package "aip4567.zip" is restored to the DSpace installation with the Handle provided within the package itself (and added as a child of the parent object specified within the package itself). In addition, any child AIPs referenced by "aip4567.zip" in the folder "/path/to/children-aips" are also recursively ingested. They are also restored with the Handles & Parent Objects provided with their package. If any object is found to already exist, it is removed from DSpace and replaced by the contents of the appropriate AIP.

If any error occurs, the script attempts to roll back the entire replacement. However, no guarantees can be made that the entire replacement can always be rolled back, so there is a potential for accidental data loss!
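Since the three restore modes differ only in their flags, a small dry-run wrapper can make the choice explicit. The sketch below is a hypothetical helper, not something shipped with DSpace: the function name `build_restore_cmd` and the mode names `default`, `keep` and `force` are assumptions, and it only prints the command rather than running it. The flags themselves are the ones shown in the templates above.

```shell
#!/bin/sh
# Hypothetical dry-run helper (not part of DSpace): prints the packager
# command for one of the three restore modes instead of executing it.
build_restore_cmd() {
    mode="$1"; eperson="$2"; child_dir="$3"; package="$4"
    case "$mode" in
        default) flags="-r" ;;       # roll back if anything already exists
        keep)    flags="-r -k" ;;    # skip objects that already exist
        force)   flags="-r -f"       # overwrite objects that already exist
                 echo "WARNING: force replace destroys existing content" >&2 ;;
        *)       echo "unknown mode: $mode" >&2
                 return 1 ;;
    esac
    echo "/dspace/bin/dspace packager $flags -t AIP -e $eperson -c $child_dir $package"
}

build_restore_cmd force admin@myu.edu /path/to/children-aips/ aip4567.zip
```

Printing the command first gives a chance to review it, which matters most for the force mode.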

Restoring Entire Site

Details Coming Soon! In all likelihood, it will take the same parameters as "Exporting entire Site", except that you'll be running the packager in -r (restore) mode.

Testing Special Cases during Restore/Replace

The below special cases need further testing, especially when performing a "Restore" or "Replace". Mostly, these are just notes for Tim (and other developers), to ensure that all these various "edge" cases can be restored properly (or perhaps not restored properly, if the decision is made that it needs not be restored). As each special case is implemented, we can check off the item in the below list.

Item Restoration/Replacement Special Cases

  • Restore existing License from AIP, i.e. do not add a new license (or change the license) during restore/replace
  • Restore item mappings to multiple collections (for items which are mapped to several collections)
  • Restore withdrawal state
  • Restore embargo state
  • Restore permissions & roles (user/group permissions), if possible
  • Options to restore just metadata or just particular bitstreams/bundles?
  • Will not restore items which have not made it into the "archived" state. In other words, at this time, there are no plans to restore WorkspaceItems or WorkflowItems.

Collection Restoration/Replacement Special Cases

  • Restore permissions & roles (user/group permissions), if possible
  • Restore Collection-specific license
  • Restore Collection's Item Template?
  • Restore Collection's content source info? (e.g. OAI-Harvesting Collections versus normal Collections)

Community Restoration/Replacement Special Cases

  • Restore permissions & roles (user/group permissions), if possible

Discussion / Use Cases

Please add your own potential use cases or discussion topics

  • MITUseCases - Notes on defining common operations in a replication system.

Questions / Comments?

Questions or comments – either add them inline above, or contact Tim Donohue