Panelnote |
---|
This code is now available on the current DSpace SVN Trunk (http://scm.dspace.org/svn/repo/dspace/trunk/) |
...
Note |
---|
Additional background information is available in the OR10 Presentation entitled Improving DSpace Backups, Restores & Migrations |
This work comes out of a requirement for DSpace integration with DuraCloud (http://www.duracloud.org). One of these requirements is to be able to essentially "backup" local DSpace contents into the cloud (as a type of offsite backup), and "restore" those contents at a later time.
Essentially, we need a way to export the entire hierarchy (i.e. bitstreams, metadata and relationships between Communities/Collections/Items) into a relatively standard format (e.g. METS or a similar structured packaging format). It should also be possible to re-import this entire hierarchy into DSpace from the same format, to allow for "round-tripping" of that content (essentially a restore of that content in the same or a different DSpace installation).
...
- Would allow folks to more easily move entire Communities or Collections between DSpace instances.
- Would allow for a potentially more consistent backup of this hierarchy (e.g. to DuraCloud, or just to your own local backup system), rather than relying on synchronizing a backup of your DB (metadata/relationships) and assetstore (bitstreams).
- Would provide a way for people to more easily get their data out of DSpace (whatever the purpose may be).
- Would provide a relatively standard format for people to migrate entire hierarchies (Communities/Collections) into DSpace (from another system).
Known Issues:
- Exporting/Importing the Community/Collection/Item hierarchy technically doesn't cover all the "content" held in DSpace. There are also Groups, EPeople and permissions/rights (which would get you closer to a full export/import of all DSpace content). However, concentrating on just the hierarchy of Community/Collection/Item seems like a good first step.
This is related to (and a partial subset of) MIT's AipPrototype (http://jira.dspace.org/jira/browse/DS-465). However, the original AIP prototype did not make it very easy to re-import the exported AIPs for Communities or Collections. So, this prototype extends the old AIP prototype's packagers/crosswalks to allow for a full export and import of an entire DSpace hierarchy, or just a set of Communities, Collections or Items.
How does this work help DSpace interact with DuraCloud?
This work is entirely about exporting DSpace content objects to a location on a local filesystem. So, this work isn't tied solely to DuraCloud, and could be used with any backup storage system to back up your DSpace contents.
...
For more specific details of AIP format / structure, along with examples, please see DSpaceAIPFormat
Where to get the Code
There is an SVN sandbox area for this work (so that others can help out, if it interests them). If anyone has comments, suggestions or feedback on this idea, or would like to be involved in this project, definitely let me know (or add comments to this wiki page). The latest code is available on DSpace Trunk (and will be released in DSpace 1.7.0).
Code Block |
---|
svn co http://scm.dspace.org/svn/repo/dspace/trunk/ |
What code has really changed?
...
- org.dspace.content.packager.* - Packager classes
  - PackageIngester interface - Now ingests 'java.io.File' objects instead of InputStreams (to better support recursive imports of Communities/Collections)
  - PackageDisseminator interface - Now exports 'java.io.File' objects instead of OutputStreams (to better support recursive exports of Communities/Collections)
  - DSpaceAIPDisseminator - Disseminates/Exports AIP(s)
  - DSpaceAIPIngester - Ingests exported AIP(s)
  - Changes were also made to refactor / enhance the AbstractMETSDisseminator, AbstractMETSIngester, and METSManifest classes
- org.dspace.content.crosswalk.*
  - AIPDIMCrosswalk - Crosswalks DIM metadata for AIPs
  - AIPTechMDCrosswalk - Crosswalks METS TechMD sections for AIPs
  - There were also changes to the MODSDisseminationCrosswalk and XSLTDisseminationCrosswalk to support creating "Site" AIPs
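To illustrate why moving the packager interfaces from streams to 'java.io.File' helps with recursive exports, here is a minimal sketch. It is not the actual DSpace packager API: the class AipExportSketch, the Node type, the exportAll method and the plain-text "manifest" format are all hypothetical stand-ins. The idea it tries to show is that a single OutputStream can only hold one package, whereas a File target lets each child object be written to its own sibling file.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch, NOT the DSpace packager API: all names and the file format are illustrative.
class AipExportSketch {

    /** Stand-in for a Community, Collection or Item: an identifier plus child objects. */
    static final class Node {
        final String id;
        final List<Node> children;

        Node(String id, List<Node> children) {
            this.id = id;
            this.children = children;
        }
    }

    /**
     * Writes one package file per object (parent first) and returns every file created.
     * Because the target is a directory of Files rather than one OutputStream,
     * each child can be written to its own file next to its parent.
     */
    static List<File> exportAll(Node node, File dir) throws IOException {
        List<File> written = new ArrayList<>();
        File pkg = new File(dir, node.id + ".txt");

        // The parent's package only references its children; their content lives in their own files.
        StringBuilder manifest = new StringBuilder("object: " + node.id + "\n");
        for (Node child : node.children) {
            manifest.append("child: ").append(child.id).append("\n");
        }
        Files.write(pkg.toPath(), manifest.toString().getBytes(StandardCharsets.UTF_8));
        written.add(pkg);

        // Recurse into the hierarchy, collecting the files written for each descendant.
        for (Node child : node.children) {
            written.addAll(exportAll(child, dir));
        }
        return written;
    }

    public static void main(String[] args) throws IOException {
        Node item = new Node("ITEM-1", Collections.emptyList());
        Node collection = new Node("COLLECTION-1", Collections.singletonList(item));
        Node community = new Node("COMMUNITY-1", Collections.singletonList(collection));

        File dir = Files.createTempDirectory("aip-sketch").toFile();
        for (File f : exportAll(community, dir)) {
            System.out.println(f.getName());
        }
    }
}
```

A recursive import could walk the same files in the other direction, reading the parent package first and then following its child references. The real ingester and disseminator interfaces carry additional DSpace-specific arguments that are not modelled here.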
Note |
---|
For a full list of code changes (including patches) see: AipCoreAPIChanges |
Install Prototype
- Download the code from the SVN Sandbox (see above).
- Build & Install the prototype. This is just a modified version of DSpace 1.6.0 – so, follow the normal DSpace 1.6.0 Installation procedure.
- If you have a DSpace 1.6.0 instance already running, you can just build the code and point it at your existing DSpace 1.6.0 database & assetstore.
...
Running the Code
Exporting AIPs
...