{toc:outline=true|style=none}

{note:title=Will be released in 1.7.0}This code is now available on the current DSpace SVN Trunk (http://scm.dspace.org/svn/repo/dspace/trunk/). It will be officially released as part of DSpace 1.7.0.{note}

{warning:title=Warning For Developers}This code changes the current {{org.dspace.content.packager.PackageIngester}} and {{org.dspace.content.packager.PackageDisseminator}} interfaces. If you've written any local, custom Packagers at your institution, they will need to be refactored to utilize these updated interfaces.{warning}

h1. AIP Backup & Restore for DSpace 1.7

h2. Background & Overview

{note}Additional background information is available in the OR10 Presentation entitled [Improving DSpace Backups, Restores & Migrations|http://www.slideshare.net/tdonohue/improving-dspace-backups-restores-migrations]{note}

This work comes out of a requirement for DSpace integration with DuraCloud ([http://www.duracloud.org]). One of these requirements is to be able to essentially "backup" local DSpace contents into the cloud (as a type of offsite backup), and "restore" those contents at a later time.

Essentially, we need a way to be able to export the entire hierarchy (i.e. bitstreams, metadata and relationships between Communities/Collections/Items) into a relatively standard format (e.g. METS or similar structured packaging format). This entire hierarchy should also be able to be re-imported into DSpace in the same format, to allow for "round-tripping" of that content (essentially a restore of that content in the same or different DSpace installation).

Perceived benefits to DSpace community:
* Would allow folks to more easily move entire Communities or Collections between DSpace instances.
* Would allow for a potentially more consistent backup of this hierarchy (e.g. to DuraCloud, or just to your own local backup system), rather than relying on synchronizing a backup of your DB (metadata/relationships) and assetstore (bitstreams).
* Would provide a way for people to more easily get their data out of DSpace (whatever the purpose may be).
* Would provide a relatively standard format for people to migrate entire hierarchies (Communities/Collections) into DSpace (from another system).

This is related to (and a partial subset of) MIT's [AipPrototype]. However, the original AIP prototype did not make it very easy to re-import the exported AIPs for Communities or Collections. So, this AIP Backup/Restore feature extends the old AIP prototype's packagers/crosswalks to allow for a full export and import of an entire DSpace hierarchy, or just a set of Communities, Collections or Items.

h3. How does this work help DSpace interact with DuraCloud?

This work is entirely about *exporting* DSpace content objects to a location on a local filesystem.  So, this work is not tied solely to DuraCloud; any backup storage system could use it to back up your DSpace contents.

In the initial DuraCloud work, the DuraCloud team is working on a way to "synchronize" DuraCloud with a local file folder.  So, DuraCloud can be configured to "watch" a given folder and automatically replicate its contents into the cloud.

Therefore, moving content from DSpace to DuraCloud would currently be a two-step process:
# First, export AIPs describing that content from DSpace to a filesystem folder
# Second, enable DuraCloud to watch that same filesystem folder and replicate it into the cloud.

Similarly, moving content from DuraCloud back into DSpace would also be a two-step process:
# First, you'd tell DuraCloud to replicate the AIPs from the cloud to a folder on your file system
# Second, you'd ingest those AIPs back into DSpace

(These backup/restore processes may change as we go forward and investigate more use cases.  This is just the initial plan.)
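
As a rough sketch of step one of the export process above, the following shell snippet prints a {{packager}} command that writes AIPs into the folder DuraCloud would watch. The folder path, output file name and function name are hypothetical, and the command is only printed rather than run, so it can be reviewed first.

```shell
#!/bin/sh
# Sketch only: the watched folder and the output file name are assumptions,
# not values defined by DSpace or DuraCloud.
DSPACE_BIN=/dspace/bin/dspace
WATCHED_DIR=/backups/dspace-aips   # folder DuraCloud is configured to watch (assumed)

# Print the hierarchy export command for one object.
# Arguments: <eperson> <handle>
export_command() {
    eperson=$1
    handle=$2
    printf '%s\n' "$DSPACE_BIN packager -d -a -t AIP -e $eperson -i $handle $WATCHED_DIR/aip-export.zip"
}

export_command admin@myu.edu 4321/4567
```

Step two (replication into the cloud) would then be handled on the DuraCloud side, once it is watching the same folder.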

h2. Makeup and Definition of AIPs

h3. AIPs are Archival Information Packages.

* An AIP is a package describing one archival object.
** An archival object may be an *Item*, *Collection*, or *Community*. Bitstreams are included in an Item's AIP.
** Each AIP is logically self-contained and can be restored without the rest of the archive. (So you could restore a single Item, Collection or Community)
** AIP profile favors completeness and accuracy rather than presenting the semantics of an object in a standard format.  It conforms to the quirks of DSpace's internal object model rather than attempting to produce a universally understandable representation of the object.
** An AIP _can_ serve as a DIP (Dissemination Information Package) or SIP (Submission Information Package), especially when transferring custody of objects to another DSpace implementation.
* In contrast to SIP or DIP, the AIP should include all available DSpace structural and administrative metadata, and basic provenance information.
* Restoration of an archive from AIPs is not perfectly complete at this time; it is intended to recover from catastrophic loss of content and metadata, _not_ restore the exact same archive as before.  Currently, some information (e.g. access controls, people, groups) would be lost, as they are not stored in the AIPs.

h3. AIP Structure / Format

Generally speaking, an AIP is a Zip file containing a METS manifest and all related content bitstreams.

For more specific details of AIP format / structure, along with examples, please see [DSpaceAIPFormat]

h2. Where to get the Code

The latest code is available on DSpace Trunk (and will be released in DSpace 1.7.0)

{code} svn co http://scm.dspace.org/svn/repo/dspace/trunk/ {code}

h3. What code has really changed?

The majority of the code changes are in two main areas:

# {{org.dspace.content.packager.\*}} \- Packager classes
#* {{PackageIngester}} interface \- Now ingests 'java.io.File' objects instead of InputStreams (to better support recursive imports of Communities/Collections)
#* {{PackageDisseminator}} interface \- Now exports 'java.io.File' objects instead of OutputStreams (to better support recursive exports of Communities/Collections)
#* {{DSpaceAIPDisseminator}} \- Disseminates/Exports AIP(s)
#* {{DSpaceAIPIngester}} \- Ingests exported AIP(s)
#* Changes were also made to refactor / enhance the {{AbstractMETSDisseminator}}, {{AbstractMETSIngester}}, and {{METSManifest}} classes
# [org.dspace.content.crosswalk.\*|http://fisheye3.atlassian.com/browse/dspace/trunk/dspace-api/src/main/java/org/dspace/content/crosswalk]
#* {{AIPDIMCrosswalk}} \- Crosswalks DIM metadata for AIPs
#* {{AIPTechMDCrosswalk}} \- Crosswalks METS TechMD sections for AIPs
#* There were also changes to the {{MODSDisseminationCrosswalk}} and {{XSLTDisseminationCrosswalk}} to support creating "Site" AIPs


{note:title=For More Information}For a full list of code changes (including patches) see: [AipCoreAPIChanges]{note}

{warning:title=Warning For Developers}Because of the changes to the {{PackageIngester}} and {{PackageDisseminator}} interfaces, if you've created any local Packagers at your institution, those will need to be refactored.{warning}


h2. Running the Code

h3. Exporting AIPs

h4. Export Modes & Options

All AIP Exports are done by using the Dissemination Mode ({{\-d}} option) of the {{packager}} command.

There are two types of AIP Dissemination you can perform:
* *Single AIP* (default, using {{\-d}} option) - Exports just an AIP describing a single DSpace object.  So, if you ran it in this default mode for a Collection, you'd just end up with a single Collection AIP (which would not include AIPs for all its child Items)
* *Hierarchy of AIPs* (using the {{\-d \-\-all}} or {{\-d \-a}} option) - Exports the requested AIP describing an object, plus the AIP for all child objects.  Some examples follow:
** For a Site - this would export *all* Communities, Collections & Items within the site into AIP files (in a provided directory)
** For a Community - this would export that Community and all SubCommunities, Collections and Items into AIP files (in a provided directory)
** For a Collection - this would export that Collection and all contained Items into AIP files (in a provided directory)
** For an Item -- this just exports the Item into an AIP as normal (as it already contains its Bitstreams/Bundles by default)


h4. Exporting just a single AIP

To export in single AIP mode (default), use this 'packager' command template:

{code} /dspace/bin/dspace packager -d -t AIP -e <eperson> -i <handle> <file-path>
{code}

For example:

{code} /dspace/bin/dspace packager -d -t AIP -e admin@myu.edu -i 4321/4567 aip4567.zip
{code}

The above code will export the object of the given handle (4321/4567) into an AIP file named "aip4567.zip". This will not include any child objects for Communities or Collections.

h4. Exporting AIP Hierarchy

To export an AIP hierarchy, use the {{\-a}} (or {{\--all}}) package parameter.

For example, use this 'packager' command template:

{code} /dspace/bin/dspace packager -d -a -t AIP -e <eperson> -i <handle> <file-path>
{code}

For example:

{code} /dspace/bin/dspace packager -d -a -t AIP -e admin@myu.edu -i 4321/4567 aip4567.zip
{code}

The above code will export the object of the given handle (4321/4567) into an AIP file named "aip4567.zip". In addition, it will export all child objects to the same directory as the "aip4567.zip" file. The child AIP files are all named using the following format:
* File Name Format: {{<Obj-Type>@<Handle-with-dashes>.zip}}
** e.g. {{COMMUNITY@123456789-1.zip}}, {{COLLECTION@123456789-2.zip}}, {{ITEM@123456789-200.zip}}
** This general file naming convention ensures that you can easily locate an object to restore by its name (assuming you know its Object Type and Handle).
* Alternatively, if an object doesn't have a Handle, it uses this File Name Format: {{<Obj-Type>@internal-id-<DSpace-ID>.zip}} (e.g. {{ITEM@internal-id-234.zip}})
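
The file naming convention above can be reproduced with a couple of small shell helpers, which may be handy when locating the AIP for a given object before a restore. This is a minimal sketch; the function names are made up for illustration and are not part of DSpace.

```shell
#!/bin/sh
# Build the expected child AIP file name for an object that has a Handle.
# Arguments: <obj-type> <handle>, e.g. ITEM 123456789/200
aip_filename() {
    obj_type=$1
    handle=$2
    # "Handle-with-dashes": replace the "/" in the Handle with "-"
    printf '%s@%s.zip\n' "$obj_type" "$(printf '%s' "$handle" | tr '/' '-')"
}

# Variant for objects without a Handle: uses the internal DSpace ID.
# Arguments: <obj-type> <dspace-id>, e.g. ITEM 234
aip_filename_internal() {
    printf '%s@internal-id-%s.zip\n' "$1" "$2"
}

aip_filename ITEM 123456789/200      # prints ITEM@123456789-200.zip
aip_filename_internal ITEM 234       # prints ITEM@internal-id-234.zip
```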

h4. Exporting Entire Site

To export an entire DSpace Site, pass the packager the Handle {{<site-handle-prefix>/0}}. For example, if your site prefix is "4321", you'd run a command similar to the following:

{code} /dspace/bin/dspace packager -d -a -t AIP -e admin@myu.edu -i 4321/0 sitewide-aip.zip
{code}

Again, this would export the DSpace Site AIP into the file "sitewide-aip.zip", and export AIPs for all Communities, Collections and Items into the same directory as the Site AIP.
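
After a site-wide export, one quick sanity check is to count the AIP files of each object type in the export directory, relying on the {{<Obj-Type>@...}} naming of child AIPs. The sketch below is illustrative only: the helper name is hypothetical, the demo file names are made up, and it assumes a {{find}} that supports {{-maxdepth}} (GNU and BSD do).

```shell
#!/bin/sh
# Count AIP files of one object type directly inside an export directory.
# Arguments: <directory> <obj-type>  (COMMUNITY, COLLECTION or ITEM)
count_aips() {
    dir=$1
    obj_type=$2
    find "$dir" -maxdepth 1 -type f -name "${obj_type}@*.zip" | wc -l | tr -d ' '
}

# Demo on a throwaway directory populated with made-up file names
demo_dir=$(mktemp -d)
touch "$demo_dir/sitewide-aip.zip" \
      "$demo_dir/COMMUNITY@4321-1.zip" \
      "$demo_dir/COLLECTION@4321-2.zip" \
      "$demo_dir/ITEM@4321-10.zip" "$demo_dir/ITEM@4321-11.zip"
for t in COMMUNITY COLLECTION ITEM; do
    echo "$t: $(count_aips "$demo_dir" "$t")"
done
rm -rf "$demo_dir"
```

Comparing these counts with the number of objects in the repository gives a rough indication of whether the export completed.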

h3. Ingesting / Restoring AIPs

h4. Ingestion Modes & Options

Ingestion of AIPs is a bit more complex than Dissemination, as there are several different "modes" available:
# *Submit/Ingest Mode* ({{\-s}} option, default) \- submit AIP(s) to DSpace in order to create new object(s) (i.e. the AIP is treated like a SIP \- Submission Information Package)
# *Restore Mode* ({{\-r}} option) \- restore pre-existing object(s) in DSpace based on AIP(s). This also attempts to restore all handles and relationships (parent/child objects). This is a specialized type of "submit", where the object is created with a known Handle and known relationships.
# *Replace Mode* ({{\-r \-f}} option) \- replace existing object(s) in DSpace based on AIP(s). This also attempts to restore all handles and relationships (parent/child objects). This is a specialized type of "restore" where the contents of existing object(s) are replaced by the contents in the AIP(s). By default, if a normal "restore" finds the object already exists, it will back out (i.e. rollback all changes) and report which object already exists.

Again, like export, there are two types of AIP Ingestion you can perform (using any of the above modes):
* *Single AIP* (default) \- Ingests just an AIP describing a single DSpace object. So, if you ran it in this default mode for a Collection AIP, you'd just create a DSpace Collection from the AIP (but not ingest any of its child objects)
* *Hierarchy of AIPs* (by including the {{\-\-all}} or {{\-a}} option after the mode) \- Ingests the requested AIP describing an object, plus the AIP for all child objects. Some examples follow:
** For a Site \- this would ingest all Communities, Collections & Items based on the located AIP files
** For a Community \- this would ingest that Community and all SubCommunities, Collections and Items based on the located AIP files
** For a Collection \- this would ingest that Collection and all contained Items based on the located AIP files
** For an Item \- this just ingests the Item (including all Bitstreams & Bundles) based on the AIP file.
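
The three modes and the optional hierarchy flag described above can be summarised as a small lookup helper. This is only a sketch: the mode names {{submit}}, {{restore}} and {{replace}} and the function name are labels chosen here for illustration, while the flags themselves are the ones listed above.

```shell
#!/bin/sh
# Print the packager flags for an ingestion mode.
# Arguments: <mode> [all]   where <mode> is submit, restore or replace;
# passing "all" appends -a (ingest the whole hierarchy) after the mode.
ingest_flags() {
    case $1 in
        submit)  flags="-s" ;;
        restore) flags="-r" ;;
        replace) flags="-r -f" ;;
        *) echo "unknown mode: $1" >&2; return 1 ;;
    esac
    if [ "${2:-}" = "all" ]; then
        flags="$flags -a"
    fi
    printf '%s\n' "$flags"
}

ingest_flags submit         # prints -s
ingest_flags replace all    # prints -r -f -a
```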

h4. The difference between "Submit" and "Restore/Replace" modes

It's worth understanding the primary differences between a Submission (specified by the {{\-s}} parameter) and a Restore (specified by the {{\-r}} parameter).
* *Submission Mode* ({{\-s}}) \- creates a new object (AIP is treated like a SIP)
** By default, a new Handle is always assigned
*** However, you can force it to use the handle specified in the AIP by specifying {{\-o ignoreHandle=false}} as one of your parameters
** By default, a new Parent object must be specified (using the {{\-p}} parameter). This is the location where the new object will be created.
*** However, you can force it to use the parent object specified in the AIP by specifying {{\-o ignoreParent=false}} as one of your parameters
** By default, will respect a Collection's Workflow process when you submit an Item to a Collection
*** However, you can specifically skip any workflow approval processes by specifying the {{\-w}} parameter.
** *Always* adds a new Deposit License to Items
** *Always* adds new DSpace System metadata to Items (includes new 'dc.date.accessioned', 'dc.date.available', 'dc.date.issued' and 'dc.description.provenance' entries)
* *Restore / Replace Mode* \- restores an object (as if from a backup)
** By default, the Handle specified in the AIP is restored
*** However, for restores, you can force a new handle to be generated by specifying {{\-o ignoreHandle=true}} as one of your parameters. (NOTE: Doesn't work for replace mode, as the new object always retains the handle of the replaced object)
** By default, the object is restored under the Parent specified in the AIP
*** However, for restores, you can force it to restore under a different parent object by using the {{\-p}} parameter. (NOTE: Doesn't work for replace mode, as the new object always retains the parent of the replaced object)
** *Always* skips any Collection workflow approval processes when restoring/replacing an Item in a Collection
** *Never* adds a new Deposit License to Items (rather it restores the previous deposit license, as long as it is stored in the AIP)
** *Never* adds new DSpace System metadata to Items (rather it just restores the metadata as specified in the AIP)

h4. Submitting AIP(s) to create a new object

h5. Submitting a Single AIP

{note:title=AIPs treated as SIPs}This option allows you to essentially use an AIP as a SIP (Submission Information Package).  The default settings will create a new DSpace object (with a new handle and a new parent object, if specified) from your AIP.{note}

To ingest a single AIP and create a new DSpace object under a parent of your choice, specify the {{\-p}} (or {{\--parent}}) package parameter to the command.  Also, note that you are running the {{packager}} in {{\-s}} (submit) mode.

_NOTE:_ This only ingests the single AIP specified.  It does *not* ingest any child objects.

{code}
/dspace/bin/dspace packager -s -t AIP -e <eperson> -p <parent-handle> <file-path>
{code}

If you leave out the -p parameter, the AIP package ingester will attempt to install the AIP under the same parent it had before. As you are also specifying the -s (submit) parameter, the packager will assume you want a new Handle to be assigned (as you are effectively specifying that you are submitting a new object). If you want the object to retain the Handle specified in the AIP, you can specify the -o ignoreHandle=false option to force the packager to not ignore the Handle specified in the AIP.
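
As a minimal sketch of the case just described (no -p, Handle retained), the helper below only prints the command rather than running it. The function name is hypothetical, the e-person and file name are the sample values used elsewhere on this page, and placing -o ignoreHandle=false before the file path is an assumption about option ordering.

```shell
#!/bin/sh
# Dry-run sketch: prints the packager command instead of executing it.
# submit_keep_handle_cmd is a hypothetical helper name, not part of DSpace.

# submit_keep_handle_cmd <eperson> <file-path>
# Builds a single-AIP submit (-s) command without -p, asking the packager
# to keep the Handle stored in the AIP (-o ignoreHandle=false).
submit_keep_handle_cmd() {
    printf '/dspace/bin/dspace packager -s -t AIP -e %s -o ignoreHandle=false %s\n' \
        "$1" "$2"
}

submit_keep_handle_cmd admin@myu.edu aip4567.zip
```

Review the printed command before running it by hand against your own installation.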

Submitting an AIP Hierarchy
Note: AIPs treated as SIPs

This option allows you to essentially use a set of AIPs as SIPs (Submission Information Packages). The default settings will create a new DSpace object (with a new handle and a new parent object, if specified) from each AIP.

To ingest an AIP hierarchy from a directory of AIPs, use the -a (or --all) package parameter.

For example, use this 'packager' command template:

{code}
/dspace/bin/dspace packager -s -a -t AIP -e <eperson> -p <parent-handle> <file-path>
{code}

For example:

{code}
/dspace/bin/dspace packager -s -a -t AIP -e admin@myu.edu -p 4321/12 aip4567.zip
{code}

The above command will ingest the package named "aip4567.zip" as a child of the specified Parent Object (handle="4321/12"). The resulting object is assigned a new Handle (since -s is specified). In addition, any child AIPs referenced by "aip4567.zip" are also recursively ingested (a new Handle is also assigned for each child AIP).

Another example, ingesting a Top-Level Community (by using the Site Handle, <site-handle-prefix>/0):

{code}
/dspace/bin/dspace packager -s -a -t AIP -e admin@myu.edu -p 4321/0 community-aip.zip
{code}

The above command will ingest the package named "community-aip.zip" as a top-level community (i.e. the specified parent is "4321/0" which is a Site Handle). Again, the resulting object is assigned a new Handle. In addition, any child AIPs referenced by "community-aip.zip" are also recursively ingested (a new Handle is also assigned for each child AIP).
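
If several top-level Community AIPs need to be submitted under the Site Handle, the hierarchy template can be repeated once per package. The sketch below is a dry run that only prints one command per file. The function name and the file names community-a.zip and community-b.zip are hypothetical. Only the top-level packages are listed, on the assumption that -a picks up the child AIPs they reference.

```shell
#!/bin/sh
# Dry-run sketch: prints one hierarchy-submit command per top-level AIP.
# ingest_top_level_cmds is a hypothetical helper name, not part of DSpace.

# ingest_top_level_cmds <eperson> <site-handle> <community-aip>...
ingest_top_level_cmds() {
    eperson=$1
    site_handle=$2
    shift 2
    for aip in "$@"; do
        # -s -a: submit this AIP and, recursively, the child AIPs it references
        printf '/dspace/bin/dspace packager -s -a -t AIP -e %s -p %s %s\n' \
            "$eperson" "$site_handle" "$aip"
    done
}

ingest_top_level_cmds admin@myu.edu 4321/0 community-a.zip community-b.zip
```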

Restoring/Replacing using AIP(s)

Restoring is slightly different than just submitting. When restoring, we make every attempt to restore the object as it used to be (including its handle, parent object, etc.). There are currently three restore modes:

  1. Default Restore Mode (-r) = Attempt to restore object (and optionally children). Rollback all changes if any object is found to already exist.
  2. Restore, Keep Existing Mode (-r -k) = Attempt to restore object (and optionally children). If an object is found to already exist, skip over it (and all children objects), and continue to restore all other non-existing objects.
  3. Force Replace Mode (-r -f) = Restore an object (and optionally children) and overwrite any existing objects in DSpace. Therefore, if an object is found to already exist in DSpace, its contents are replaced by the contents of the AIP. WARNING: This mode is potentially dangerous as it will permanently destroy any object contents that do not currently exist in the AIP. You may want to perform a secondary backup, unless you are sure you know what you are doing!
Info: Restoring a Single AIP

All of the below examples show how to restore an entire hierarchy of objects (using -a option). To restore a single object, you can use the same commands, but remove the -a option.
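
The three restore modes differ only in one extra flag, which the following dry-run sketch makes explicit. The helper name restore_cmd and the mode keywords default, keep and force are hypothetical labels chosen for illustration. The flags themselves (-r, -a, -k, -f) are the ones described on this page.

```shell
#!/bin/sh
# Dry-run sketch: maps a restore mode to its packager flags and prints the command.
# restore_cmd and the mode keywords are hypothetical, not part of DSpace.

# restore_cmd <mode> <eperson> <file-path>
#   mode: default (-r), keep (-r -k) or force (-r -f)
restore_cmd() {
    case "$1" in
        default) extra="" ;;
        keep)    extra=" -k" ;;
        force)   extra=" -f" ;;
        *)       echo "unknown restore mode: $1" >&2; return 1 ;;
    esac
    # -a restores the whole hierarchy; remove it to restore a single AIP
    printf '/dspace/bin/dspace packager -r -a%s -t AIP -e %s %s\n' "$extra" "$2" "$3"
}

restore_cmd keep admin@myu.edu aip4567.zip
```

Printing rather than executing keeps the sketch safe to try, which matters most for the force mode.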

Default Restore Mode

By default, the restore mode (-r option) will roll back all changes if any object is found to already exist. The user will be informed which object already exists within their DSpace installation.

Use this 'packager' command template:

{code}
/dspace/bin/dspace packager -r -a -t AIP -e <eperson> <file-path>
{code}

For example:

{code}
/dspace/bin/dspace packager -r -a -t AIP -e admin@myu.edu aip4567.zip
{code}

Notice that, unlike the -s option (for submission/ingesting), the -r option does not require the Parent Object (-p option) to be specified if it can be determined from the package itself.

In the above example, the package "aip4567.zip" is restored to the DSpace installation with the Handle provided within the package itself (and added as a child of the parent object specified within the package itself). In addition, any child AIPs referenced by "aip4567.zip" are also recursively ingested (the -a option specifies to also restore all child AIPs). They are also restored with the Handles & Parent Objects provided with their package. If any object is found to already exist, all changes are rolled back (i.e. nothing is restored to DSpace).

Restore, Keep Existing Mode

When the "Keep Existing" flag (-k option) is specified, the restore will attempt to skip over any objects found to already exist. It will report to the user that the object was found to exist (and was not modified or changed). It will then continue to restore all objects which do not already exist.

One special case to note: If a Collection or Community is found to already exist, its child objects are also skipped over. So, this mode will not auto-restore items to an existing Collection.

Use this 'packager' command template:

{code}
/dspace/bin/dspace packager -r -a -k -t AIP -e <eperson> <file-path>
{code}

For example:

{code}
/dspace/bin/dspace packager -r -a -k -t AIP -e admin@myu.edu aip4567.zip
{code}

In the above example, the package "aip4567.zip" is restored to the DSpace installation with the Handle provided within the package itself (and added as a child of the parent object specified within the package itself). In addition, any child AIPs referenced by "aip4567.zip" are also recursively restored (the -a option specifies to also restore all child AIPs). They are also restored with the Handles & Parent Objects provided with their package. If any object is found to already exist, it is skipped over (child objects are also skipped). All non-existing objects are restored.

Force Replace Mode

When the "Force Replace" flag (-f option) is specified, the restore will overwrite any objects found to already exist in DSpace. In other words, existing content is deleted and then replaced by the contents of the AIP(s).

Warning: Potential for Data Loss

Because this mode actually destroys existing content in DSpace, it is potentially dangerous and may result in data loss! It is recommended to always perform a secondary full backup (assetstore files & database) before attempting to replace any existing object(s) in DSpace.

Warning: Full Site Replace Not Recommended

This doesn't 100% work yet for an entire Site! You've been warned!!! - Tim

Use this 'packager' command template:

{code}
/dspace/bin/dspace packager -r -a -f -t AIP -e <eperson> <file-path>
{code}

For example:

{code}
/dspace/bin/dspace packager -r -a -f -t AIP -e admin@myu.edu aip4567.zip
{code}

In the above example, the package "aip4567.zip" is restored to the DSpace installation with the Handle provided within the package itself (and added as a child of the parent object specified within the package itself). In addition, any child AIPs referenced by "aip4567.zip" are also recursively ingested. They are also restored with the Handles & Parent Objects provided with their package. If any object is found to already exist, its contents are replaced by the contents of the appropriate AIP.

If any error occurs, the script attempts to rollback the entire replacement process.
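
Because Force Replace destroys existing content, one cautious pattern is to wrap the command in a guard that refuses to proceed until a backup has been confirmed. The sketch below is only an illustration of that idea: the helper name and the BACKUP_DONE variable are hypothetical conventions, not DSpace features, and the helper prints the command instead of running it.

```shell
#!/bin/sh
# Dry-run sketch: only prints the Force Replace command once a backup is confirmed.
# force_replace_cmd and BACKUP_DONE are hypothetical, not part of DSpace.

# force_replace_cmd <eperson> <file-path>
force_replace_cmd() {
    if [ "${BACKUP_DONE:-no}" != "yes" ]; then
        echo "Refusing: take a full backup (assetstore files & database) first, then set BACKUP_DONE=yes" >&2
        return 1
    fi
    printf '/dspace/bin/dspace packager -r -a -f -t AIP -e %s %s\n' "$1" "$2"
}

# Usage, after the backup has been taken:
#   BACKUP_DONE=yes
#   force_replace_cmd admin@myu.edu aip4567.zip
```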

Restoring Entire Site

Details Coming Soon! In all likelihood it will take the same parameters as the "Exporting entire Site", except that you'll be running the packager in -r (restore) mode.

Configuration in 'dspace.cfg'

The following new configurations relate to AIPs:

AIP Metadata Dissemination Configurations

The following configurations allow you to specify what metadata is stored within each METS-based AIP. In 'dspace.cfg', the general format for each of these settings is:

  • aip.disseminate.<setting> = <mdType>:<DSpace-crosswalk-name> [, ...]
    • <setting> is the setting name (see below for the full list of valid settings)
    • <mdType> is optional. It allows you to specify the value of the @MDTYPE or @OTHERMDTYPE attribute in the corresponding METS element.
    • <DSpace-crosswalk-name> is required. It specifies the name of the DSpace Crosswalk which should be used to generate this metadata.
    • Zero or more <label-for-METS>:<DSpace-crosswalk-name> may be specified for each setting
Info: AIP Metadata Recommendations

It is recommended to minimally use the default settings when generating AIPs. DSpace can only restore information that is included within an AIP. Therefore, if you choose to no longer include some information in an AIP, DSpace will no longer be able to restore that information from an AIP backup.

The default settings in 'dspace.cfg' are:

  • aip.disseminate.techMD - Lists the DSpace Crosswalks (by name) which should be called to populate the <techMD> section of the METS file within the AIP (Default: PREMIS)
    • The PREMIS Crosswalk generates PREMIS metadata for the object specified by the AIP
  • aip.disseminate.sourceMD - Lists the DSpace Crosswalks (by name) which should be called to populate the <sourceMD> section of the METS file within the AIP (Default: AIP-TECHMD)
    • The AIP-TECHMD Crosswalk generates technical metadata (in DIM format) for the object specified by the AIP
  • aip.disseminate.digiprovMD - Lists the DSpace Crosswalks (by name) which should be called to populate the <digiprovMD> section of the METS file within the AIP (Default: None)
  • aip.disseminate.rightsMD - Lists the DSpace Crosswalks (by name) which should be called to populate the <rightsMD> section of the METS file within the AIP (Default: DSpaceDepositLicense:DSPACE_DEPLICENSE, CreativeCommonsRDF:DSPACE_CCRDF, CreativeCommonsText:DSPACE_CCTEXT)
    • The DSPACE_DEPLICENSE crosswalk ensures the DSpace Deposit License is referenced/stored in AIP
    • The DSPACE_CCRDF crosswalk ensures any Creative Commons RDF Licenses are referenced/stored in AIP
    • The DSPACE_CCTEXT crosswalk ensures any Creative Commons Textual Licenses are referenced/stored in AIP
  • aip.disseminate.dmd - Lists the DSpace Crosswalks (by name) which should be called to populate the <dmdSec> section of the METS file within the AIP (Default: MODS, DIM)
    • The MODS crosswalk translates the DSpace descriptive metadata (for this object) into MODS. As MODS is a relatively "standard" metadata schema, it may be useful to include a copy of MODS metadata in your AIPs if you should ever want to import them into another (non-DSpace) system.
    • The DIM crosswalk just translates the DSpace internal descriptive metadata into an XML format. This XML format is proprietary to DSpace, but stores the metadata in a format similar to Qualified Dublin Core.
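
To make the general format concrete, the sketch below prints the defaults listed above written out as 'dspace.cfg' lines, and adds a small helper for listing the dissemination settings that are active in a given configuration file. Both function names are hypothetical, and the exact spacing used in the shipped 'dspace.cfg' is an assumption. The digiprovMD setting is omitted because its default is None.

```shell
#!/bin/sh
# Sketch only: both helper names are hypothetical, not part of DSpace.

# Prints the default dissemination settings in
# "aip.disseminate.<setting> = <mdType>:<DSpace-crosswalk-name> [, ...]" form.
default_aip_disseminate_settings() {
    cat <<'EOF'
aip.disseminate.techMD = PREMIS
aip.disseminate.sourceMD = AIP-TECHMD
aip.disseminate.rightsMD = DSpaceDepositLicense:DSPACE_DEPLICENSE, CreativeCommonsRDF:DSPACE_CCRDF, CreativeCommonsText:DSPACE_CCTEXT
aip.disseminate.dmd = MODS, DIM
EOF
}

# active_aip_disseminate_settings <path-to-dspace.cfg>
# Lists the uncommented aip.disseminate.* lines in the given file.
active_aip_disseminate_settings() {
    grep -E '^[[:space:]]*aip\.disseminate\.' "$1"
}

default_aip_disseminate_settings
```

Comparing the output of the second helper against the defaults is a quick way to see whether an installation has dropped any metadata from its AIPs.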

AIP Ingestion Metadata Crosswalk Configurations

The following configurations allow you to specify what DSpace Crosswalks are used during the ingestion/restoration of AIPs. These configurations also allow you to ignore areas of the METS file (in the AIP) if you do not want that area to be restored.

In dspace.cfg, the general format for each of these settings is:

  • mets.dspaceAIP.ingest.crosswalk.<mdType> = <DSpace-crosswalk-name>
    • <mdType> is the type of metadata as specified in the METS file. This corresponds to the value of the @MDTYPE attribute (of that metadata section in the METS). When the @MDTYPE attribute is "OTHER", then the <mdType> corresponds to the @OTHERMDTYPE attribute value.
    • <DSpace-crosswalk-name> specifies the name of the DSpace Crosswalk which should be used to ingest this metadata into DSpace. You can specify the "NULLSTREAM" crosswalk if you specifically want this metadata to be ignored (and skipped over during ingestion).

By default, the settings in dspace.cfg are:

{code}
mets.dspaceAIP.ingest.crosswalk.DSpaceDepositLicense = NULLSTREAM
mets.dspaceAIP.ingest.crosswalk.CreativeCommonsRDF = NULLSTREAM
mets.dspaceAIP.ingest.crosswalk.CreativeCommonsText = NULLSTREAM
{code}

The above settings tell the ingester to ignore any metadata sections which reference DSpace Deposit Licenses or Creative Commons Licenses. These metadata sections can be safely ignored as long as the "LICENSE" and "CC_LICENSE" bundles are included in AIPs (which is the default setting). As the Licenses are included in those Bundles, they will already be restored when restoring the bundle contents.
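
When reviewing an installation, it can help to list which metadata types are currently mapped to NULLSTREAM (and so skipped on ingest). The helper below is a minimal sketch that reads a configuration file and prints those <mdType> values. The function name is hypothetical and the file path is supplied by the caller.

```shell
#!/bin/sh
# Sketch only: ignored_mdtypes is a hypothetical helper, not part of DSpace.

# ignored_mdtypes <path-to-dspace.cfg>
# Prints the <mdType> of every active setting of the form
#   mets.dspaceAIP.ingest.crosswalk.<mdType> = NULLSTREAM
ignored_mdtypes() {
    sed -n -E \
        's/^mets\.dspaceAIP\.ingest\.crosswalk\.([^ =]+)[[:space:]]*=[[:space:]]*NULLSTREAM[[:space:]]*$/\1/p' \
        "$1"
}

# Usage (path is an assumption about your install directory):
#   ignored_mdtypes /dspace/config/dspace.cfg
```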

Info: More Info on Default Crosswalks used

If unspecified in the above settings, the AIP ingester will automatically use the Crosswalk which is named the same as the @MDTYPE or @OTHERMDTYPE attribute for the metadata section. For example, a metadata section with an @MDTYPE="PREMIS" will be processed by the DSpace Crosswalk named "PREMIS".

AIP Ingestion EPerson Configurations

The following setting determines whether the AIP Ingester should create an EPerson (if necessary) when attempting to restore or ingest an Item whose Submitter cannot be located in the system. By default it is set to "false".

  • mets.dspaceAIP.ingest.createSubmitter = false

AIP Configurations To Improve Ingestion Speed while Validating

It is recommended to validate all AIPs on ingestion (when possible). But validation can be extremely slow, because each validation request must first download all referenced Schema documents from various locations on the web (as many as 10 schemas may need to be downloaded to validate a single METS file).

In order to perform validations in a speedy fashion, you can pull down a local copy of all schemas. Validation will then use this local cache, which can sometimes increase the speed up to 10X.

To use a local cache of XML schemas when validating, use the following settings in 'dspace.cfg'. The general format is:

  • mets.xsd.<abbreviation> = <namespace> <local-file-name>
    • <abbreviation> is a unique abbreviation (of your choice) for this schema
    • <namespace> is the Schema namespace
    • <local-file-name> is the full name of the cached schema file (which should reside in your [dspace]/config/schemas/ directory)

The default settings are all commented out. But, they provide a full listing of all schemas currently used during validation of AIPs. In order to utilize them, uncomment the settings, download the appropriate schema file, and save it to your [dspace]/config/schemas/ directory using the specified file name:

{code}
#mets.xsd.mets = http://www.loc.gov/METS/ mets.xsd
#mets.xsd.xlink = http://www.w3.org/1999/xlink xlink.xsd
#mets.xsd.mods = http://www.loc.gov/mods/v3 mods.xsd
#mets.xsd.xml = http://www.w3.org/XML/1998/namespace xml.xsd
#mets.xsd.dc = http://purl.org/dc/elements/1.1/ dc.xsd
#mets.xsd.dcterms = http://purl.org/dc/terms/ dcterms.xsd
#mets.xsd.premis = http://www.loc.gov/standards/premis PREMIS.xsd
#mets.xsd.premisObject = http://www.loc.gov/standards/premis PREMIS-Object.xsd
#mets.xsd.premisEvent = http://www.loc.gov/standards/premis PREMIS-Event.xsd
#mets.xsd.premisAgent = http://www.loc.gov/standards/premis PREMIS-Agent.xsd
#mets.xsd.premisRights = http://www.loc.gov/standards/premis PREMIS-Rights.xsd
{code}
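
The uncomment-and-download steps can be partly checked with a small script. The sketch below does two things, under stated assumptions: it prints a copy of a configuration file with the mets.xsd.* lines uncommented, and it reports which cached schema files named by the active settings are still missing from the schemas directory. Both function names are hypothetical, and downloading the schema files themselves is left to the reader because this page lists namespaces rather than download locations.

```shell
#!/bin/sh
# Sketch only: both helper names are hypothetical, not part of DSpace.

# enable_schema_cache <path-to-dspace.cfg>
# Prints the file with any "#mets.xsd.*" lines uncommented (the input is not modified).
enable_schema_cache() {
    sed -E 's/^#[[:space:]]*(mets\.xsd\.)/\1/' "$1"
}

# missing_schemas <path-to-dspace.cfg> <schemas-dir>
# For each active "mets.xsd.<abbreviation> = <namespace> <local-file-name>" line,
# prints <local-file-name> if that file is absent from <schemas-dir>.
missing_schemas() {
    grep -E '^mets\.xsd\.' "$1" | awk '{print $NF}' | while read -r f; do
        [ -f "$2/$f" ] || echo "$f"
    done
}

# Usage (paths are assumptions about your install directory):
#   enable_schema_cache /dspace/config/dspace.cfg > /tmp/dspace.cfg.new
#   missing_schemas /tmp/dspace.cfg.new /dspace/config/schemas
```

An empty result from the second helper suggests every schema named in the settings has a local copy in place.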

To-Do List

What remains to be done!

Testing Special Cases during Restore/Replace

The below special cases need further testing, especially when performing a "Restore" or "Replace". Mostly, these are just notes for Tim (and other developers), to ensure that all these various "edge" cases can be restored properly (or perhaps not restored properly, if the decision is made that it need not be restored). As each special case is implemented, we can check off the item in the below list. Special cases which have been fully tested & implemented are marked with a (tick). Feel free to add more special cases to this listing, if we missed anything.

Item Restoration/Replacement

Special Cases

  • (tick) Restore existing Deposit License from AIP – i.e. do not add a new license (or change the license) during restore/replace
  • (tick) Restore existing CC License(s)
  • Restore item mappings to multiple collections (for items which are mapped to several collections)
  • (tick) Restore withdrawal state
  • Restore embargo state
  • Restore permissions & roles (user/group permissions), if possible
  • Options to restore just metadata or just particular bitstreams/bundles?
  • Will not restore items which have not made it into the "archived" state. In other words, at this time, there are no plans to restore items which are still in an approval workflow (WorkflowItems) or items which are unfinished submissions (WorkspaceItems). WorkspaceItems and WorkflowItems are never exported as AIPs.

Collection Restoration/Replacement

Special Cases

  • Restore permissions & roles (user/group permissions), if possible
    • Restore Workflow approval groups
  • (tick) Restore Collection-specific license
  • Restore Collection's Item Template?
  • Restore Collection's content source info? (e.g. OAI-Harvesting Collections versus normal Collections)

Community Restoration/Replacement

Special Cases

  • Restore permissions & roles (user/group permissions), if possible

Admin UI work

As part of the CurationTaskProposal (led by Richard Rodgers & MIT), a new Curation Framework is in the works. This Curation Framework will have a Command Line interface initially. However, the goal for 1.7, is to also have Administrative UI tools which are able to kick off various "curation tools". Among these curation tools will be the ability to export/import AIPs via the Admin UI.

Notes on AIP ingest speed & improving it

Some very basic ingestion speed tests were performed on a set of 26 AIPs (which represented a Community containing a Collection containing 24 Items). These tests found that, by default, the parsing/ingest settings are currently not optimized for speed.

Here are the basic (non-scientific) results

  • Default Settings (validates all METS files using external Schemas): took about 1 minute, 12 seconds to ingest all 26 AIPs
  • Locally cached all schemas (with validation turned on): took about 12 seconds to ingest all 26 AIPs
    • You can locally cache all schemas by using the mets.xsd.* settings in dspace.cfg
  • No validation (-o validate=false flag): took about 11 seconds to ingest all 26 AIPs
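
For reference, the no-validation case can be written as a command in the same style as the other examples. The sketch below prints such a command without running it. The helper name is hypothetical, and combining -o validate=false with the restore template (-r -a) is an assumption, since the timings above do not say which mode was used. Given that the locally cached run took about 12 seconds against about 11 seconds with validation off, caching the schemas recovers almost all of the speed while keeping validation.

```shell
#!/bin/sh
# Dry-run sketch: prints a restore command with validation switched off.
# restore_novalidate_cmd is a hypothetical helper name, not part of DSpace.

# restore_novalidate_cmd <eperson> <file-path>
restore_novalidate_cmd() {
    printf '/dspace/bin/dspace packager -r -a -t AIP -e %s -o validate=false %s\n' \
        "$1" "$2"
}

restore_novalidate_cmd admin@myu.edu aip4567.zip
```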

Discussion / Use Cases

Please add your own potential use cases or discussion topics

  • MIT Use Cases - Notes on defining common operations in a replication system.

Questions / Comments?

Questions or comments – either add them inline above, or contact Tim Donohue