Old Release
This documentation relates to an old version of DSpace, version 6.x.
Support for DSpace 6 ended on July 1, 2023. See "Support for DSpace 5 and 6 is ending in 2023".
Package Importer and Exporter
This command-line tool gives you access to the Packager plugins. It can ingest a package to create a new DSpace Object (Community, Collection or Item), or disseminate a DSpace Object as a package.
To see all the options, invoke it as:
[dspace]/bin/dspace packager --help
This mode also displays a list of the names of package ingestion and dissemination plugins that are currently installed in your DSpace. Each Packager plugin may also allow for custom options, which may provide you more control over how a package is imported or exported. You can see a listing of all specific packager options by invoking --help (or -h) with the --type (or -t) option:
[dspace]/bin/dspace packager --help --type METS
The above example will display the normal help message, while also listing any additional options available to the "METS" packager plugin.
Supported Package Formats
DSpace comes with several pre-configured package ingestion and dissemination plugins, which allow you to import/export content in a variety of formats.
Pre-Configured Submission Package (SIP) Types
- AIP - Ingests content which is in the DSpace Archival Information Package (AIP) format. This is used as part of the DSpace AIP Backup and Restore process.
- DSPACE-ROLES - Ingests DSpace users/groups in the DSPACE-ROLES XML Schema. This is primarily used by the DSpace AIP Backup and Restore process to ingest/replace DSpace Users & Groups.
- METS - Ingests content which is in the DSpace METS SIP format.
- PDF - Ingests a single PDF file (where basic metadata is extracted from the file properties in the PDF Document).
Pre-Configured Dissemination Package (DIP) Types
- AIP - Exports content in the DSpace Archival Information Package (AIP) format. This is used as part of the DSpace AIP Backup and Restore process.
- DSPACE-ROLES - Exports DSpace users/groups in the DSPACE-ROLES XML Schema. This is primarily used by the DSpace AIP Backup and Restore process to export DSpace Users & Groups.
- METS - Exports content in the DSpace METS SIP format.
For a list of all package ingestion and dissemination plugins that are currently installed in your DSpace, you can execute:
[dspace]/bin/dspace packager --help
Some package ingestion and dissemination plugins also have custom options/parameters. For example, to see a listing of the custom options for the "METS" plugin, you can execute:
[dspace]/bin/dspace packager --help --type METS
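To review the custom options of several plugins in one pass, the same help command can be generated once per plugin name. The following is only a minimal sketch: the DSPACE_DIR variable (standing in for [dspace]) and the function name packager_help_commands are assumptions made for illustration, and the function prints the commands rather than running them.

```shell
# Assumption: DSPACE_DIR stands in for [dspace]; adjust to your installation.
DSPACE_DIR="${DSPACE_DIR:-/dspace}"

# Hypothetical helper: print one "--help --type" command per plugin name given.
packager_help_commands() {
  for plugin in "$@"; do
    printf '%s/bin/dspace packager --help --type %s\n' "$DSPACE_DIR" "$plugin"
  done
}

# The four pre-configured ingestion plugin names listed above.
packager_help_commands AIP DSPACE-ROLES METS PDF
```

Each printed line can be copied and run as-is once the path matches your installation.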
Ingesting
Ingestion Modes & Options
When ingesting packages, DSpace supports several different "modes". (Please note that not all packager plugins support all modes of ingestion.)
- Submit/Ingest Mode (-s option, default) - submit package to DSpace in order to create new object(s).
- Restore Mode (-r option) - restore pre-existing object(s) in DSpace based on package(s). This also attempts to restore all handles and relationships (parent/child objects). This is a specialized type of "submit", where the object is created with a known Handle and known relationships.
- Replace Mode (-r -f option) - replace existing object(s) in DSpace based on package(s). This also attempts to restore all handles and relationships (parent/child objects). This is a specialized type of "restore" where the contents of existing object(s) are replaced by the contents in the AIP(s). By default, if a normal "restore" finds the object already exists, it will back out (i.e. rollback all changes) and report which object already exists.
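The relationship between the three modes and their command-line flags can be summarized in a small lookup. This is an illustrative sketch only: the mode labels (submit, restore, replace) and the function name ingest_mode_flags are invented here and are not part of DSpace; only the flags themselves come from the list above.

```shell
# Hypothetical helper: map a mode label to the packager flags described above.
ingest_mode_flags() {
  case "$1" in
    submit)  printf '%s\n' '-s' ;;      # default mode: create new object(s)
    restore) printf '%s\n' '-r' ;;      # restore with known Handle and relationships
    replace) printf '%s\n' '-r -f' ;;   # restore and overwrite existing object(s)
    *) printf 'unknown mode: %s\n' "$1" >&2; return 1 ;;
  esac
}

# Print a short summary table of mode label and flags.
for mode in submit restore replace; do
  printf '%-8s %s\n' "$mode" "$(ingest_mode_flags "$mode")"
done
```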
Ingesting a Single Package
To ingest a single package from a file, give the command:
[dspace]/bin/dspace packager -e [user-email] -p [parent-handle] -t [packager-name] /full/path/to/package
Where [user-email] is the e-mail address of the E-Person under whose authority this runs; [parent-handle] is the Handle of the Parent Object into which the package is ingested; [packager-name] is the plugin name of the package ingester to use; and /full/path/to/package is the path to the file to ingest (or "-" to read from the standard input).
Here is an example that loads a PDF file with internal metadata as a package:
[dspace]/bin/dspace packager -e admin@myu.edu -p 4321/10 -t PDF thesis.pdf
This example takes the result of retrieving a URL and ingests it:
wget -O - http://alum.mit.edu/jarandom/my-thesis.pdf | [dspace]/bin/dspace packager -e admin@myu.edu -p 4321/10 -t PDF -
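When several PDF files need to be ingested into the same parent, the single-package command can be repeated once per file. The sketch below assumes a hypothetical ./theses directory and a DSPACE_DIR variable standing in for [dspace]; the function name is invented for illustration, and the commands are printed for review rather than executed.

```shell
# Assumption: DSPACE_DIR stands in for [dspace]; adjust to your installation.
DSPACE_DIR="${DSPACE_DIR:-/dspace}"

# Hypothetical helper: print one PDF ingest command per *.pdf file in a directory.
# $1 = e-person e-mail, $2 = parent Handle, $3 = directory containing PDF files
pdf_ingest_commands() {
  for pdf in "$3"/*.pdf; do
    [ -e "$pdf" ] || continue   # skip the unexpanded pattern when no PDFs exist
    printf "%s/bin/dspace packager -e %s -p %s -t PDF '%s'\n" \
      "$DSPACE_DIR" "$1" "$2" "$pdf"
  done
}

# ./theses is a hypothetical directory; nothing is printed if it has no PDFs.
pdf_ingest_commands admin@myu.edu 4321/10 ./theses
```

Printing the commands first makes it easy to check the parent Handle and file list before anything is added to the repository. File names containing single quotes would need extra escaping.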
Ingesting Multiple Packages at Once
Some Packager plugins support bulk ingest functionality using the --all (or -a) flag. When --all is used, the packager will attempt to ingest all child packages referenced by the initial package (and continue on recursively). Some examples follow:
- For a Site-based package - this would ingest all Communities, Collections & Items based on the located package files
- For a Community-based package - this would ingest that Community and all SubCommunities, Collections and Items based on the located package files
- For a Collection - this would ingest that Collection and all contained Items based on the located package files
- For an Item - this would just ingest the Item (including all Bitstreams & Bundles) based on the package file.
Here is a basic example of a bulk ingest 'packager' command template:
[dspace]/bin/dspace packager -s -a -t AIP -e <eperson> -p <parent-handle> <file-path>
for example:
[dspace]/bin/dspace packager -s -a -t AIP -e admin@myu.edu -p 4321/12 collection-aip.zip
The above command will ingest the package named "collection-aip.zip" as a child of the specified Parent Object (handle="4321/12"). The resulting object is assigned a new Handle (since -s is specified). In addition, any child packages directly referenced by "collection-aip.zip" are also recursively ingested (a new Handle is also assigned for each child AIP).
Not All Packagers Support Bulk Ingest
Because the packager plugin must know how to locate all child packages from an initial package file, not all plugins can support bulk ingest. Currently, in DSpace the following Packager Plugins support bulk ingest capabilities:
- METS Packager Plugin
- AIP Packager Plugin
Restoring/Replacing using Packages
Restoring is slightly different than just ingesting. When restoring, the packager makes every attempt to restore the object as it used to be (including its handle, parent object, etc.).
There are currently three restore modes:
- Default Restore Mode (-r) = Attempt to restore object (and optionally children). Rollback all changes if any object is found to already exist.
- Restore, Keep Existing Mode (-r -k) = Attempt to restore object (and optionally children). If an object is found to already exist, skip over it (and all children objects), and continue to restore all other non-existing objects.
- Force Replace Mode (-r -f) = Restore an object (and optionally children) and overwrite any existing objects in DSpace. Therefore, if an object is found to already exist in DSpace, its contents are replaced by the contents of the package. WARNING: This mode is potentially dangerous as it will permanently destroy any object contents that do not currently exist in the package. You may want to first perform a backup, unless you are sure you know what you are doing!
Default Restore Mode
By default, the restore mode (-r option) will rollback all changes if any object is found to already exist. The user will be informed which object already exists within their DSpace installation.
Use this 'packager' command template:
[dspace]/bin/dspace packager -r -t AIP -e <eperson> <file-path>
For example:
[dspace]/bin/dspace packager -r -t AIP -e admin@myu.edu aip4567.zip
Notice that unlike the -s option (for submission/ingesting), the -r option does not require the Parent Object (-p option) to be specified if it can be determined from the package itself.
In the above example, the package "aip4567.zip" is restored to the DSpace installation with the Handle provided within the package itself (and added as a child of the parent object specified within the package itself). If the object is found to already exist, all changes are rolled back (i.e. nothing is restored to DSpace).
Restore, Keep Existing Mode
When the "Keep Existing" flag (-k option) is specified, the restore will attempt to skip over any objects found to already exist. It will report to the user that the object was found to exist (and was not modified or changed). It will then continue to restore all objects which do not already exist. This flag is most useful when attempting a bulk restore (using the --all (or -a) option).
One special case to note: If a Collection or Community is found to already exist, its child objects are also skipped over. So, this mode will not auto-restore items to an existing Collection.
Here's an example of how to use this 'packager' command:
[dspace]/bin/dspace packager -r -a -k -t AIP -e <eperson> <file-path>
For example:
[dspace]/bin/dspace packager -r -a -k -t AIP -e admin@myu.edu aip4567.zip
In the above example, the package "aip4567.zip" is restored to the DSpace installation with the Handle provided within the package itself (and added as a child of the parent object specified within the package itself). In addition, any child packages referenced by "aip4567.zip" are also recursively restored (the -a option specifies to also restore all child packages). They are also restored with the Handles & Parent Objects provided with their package. If any object is found to already exist, it is skipped over (child objects are also skipped). All non-existing objects are restored.
Force Replace Mode
When the "Force Replace" flag (-f option) is specified, the restore will overwrite any objects found to already exist in DSpace. In other words, existing content is deleted and then replaced by the contents of the package(s).
Potential for Data Loss
Because this mode actually destroys existing content in DSpace, it is potentially dangerous and may result in data loss! It is recommended to always perform a full backup (assetstore files & database) before attempting to replace any existing object(s) in DSpace.
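One way to reduce the risk of an accidental replacement is to wrap the command in a guard that requires explicit confirmation. The sketch below is only an illustration of that idea: CONFIRM_REPLACE is a hypothetical environment variable (not a DSpace setting), DSPACE_DIR stands in for [dspace], and the function prints the command instead of running it.

```shell
# Assumption: DSPACE_DIR stands in for [dspace]; adjust to your installation.
DSPACE_DIR="${DSPACE_DIR:-/dspace}"

# Hypothetical guard: only emit a force-replace command after explicit confirmation.
# $1 = e-person e-mail, $2 = path to the AIP package
replace_command() {
  if [ "${CONFIRM_REPLACE:-no}" != "yes" ]; then
    echo "Refusing: set CONFIRM_REPLACE=yes after taking a full backup." >&2
    return 1
  fi
  printf '%s/bin/dspace packager -r -f -t AIP -e %s %s\n' "$DSPACE_DIR" "$1" "$2"
}

# Confirmed call, run in a subshell so the confirmation does not linger.
(CONFIRM_REPLACE=yes; replace_command admin@myu.edu aip4567.zip)
```

The guard does not verify that a backup exists; it only forces a deliberate step before the destructive command is produced.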
Here's an example of how to use this 'packager' command:
[dspace]/bin/dspace packager -r -f -t AIP -e <eperson> <file-path>
For example:
[dspace]/bin/dspace packager -r -f -t AIP -e admin@myu.edu aip4567.zip
In the above example, the package "aip4567.zip" is restored to the DSpace installation with the Handle provided within the package itself (and added as a child of the parent object specified within the package itself). If the --all (or -a) option is also given, any child packages referenced by "aip4567.zip" are recursively restored as well, with the Handles & Parent Objects provided with their package. If any object is found to already exist, its contents are replaced by the contents of the appropriate package.
If any error occurs, the script attempts to rollback the entire replacement process.
Disseminating
Disseminating a Single Object
To disseminate a single object as a package, give the command:
[dspace]/bin/dspace packager -d -e [user-email] -i [handle] -t [packager-name] [file-path]
Where [user-email] is the e-mail address of the E-Person under whose authority this runs; [handle] is the Handle of the Object to disseminate; [packager-name] is the plugin name of the package disseminator to use; and [file-path] is the path to the file to create (or "-" to write to the standard output). For example:
[dspace]/bin/dspace packager -d -e admin@myu.edu -i 4321/4567 -t METS 4567.zip
The above command will export the object of the given handle (4321/4567) into a METS file named "4567.zip".
Disseminating Multiple Objects at Once
To export an object hierarchy, use the -a (or --all) package parameter.
For example, use this 'packager' command template:
[dspace]/bin/dspace packager -d -a -e [user-email] -i [handle] -t [packager-name] [file-path]
for example:
[dspace]/bin/dspace packager -d -a -t METS -e admin@myu.edu -i 4321/4567 4567.zip
The above command will export the object of the given handle (4321/4567) into a METS file named "4567.zip". In addition, it will export all child objects to the same directory as the "4567.zip" file.
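Because child packages are written next to the initial package file, it can help to point a hierarchy export at a dedicated directory. The following is a sketch under stated assumptions: DSPACE_DIR stands in for [dspace], the function name and the output directory are hypothetical, and naming the file after the Handle suffix simply mirrors the "4567.zip" example. The commands are printed rather than executed.

```shell
# Assumption: DSPACE_DIR stands in for [dspace]; adjust to your installation.
DSPACE_DIR="${DSPACE_DIR:-/dspace}"

# Hypothetical helper: print the commands for exporting a hierarchy into its own directory.
# $1 = e-person e-mail, $2 = Handle of the top object, $3 = output directory
export_hierarchy_commands() {
  file_name="${2##*/}.zip"   # "4321/4567" becomes "4567.zip"
  printf 'mkdir -p %s\n' "$3"
  printf '%s/bin/dspace packager -d -a -t METS -e %s -i %s %s/%s\n' \
    "$DSPACE_DIR" "$1" "$2" "$3" "$file_name"
}

# ./export-4567 is a hypothetical target directory.
export_hierarchy_commands admin@myu.edu 4321/4567 ./export-4567
```

Keeping each export in its own directory makes it easier to move or archive the initial package together with all of its child packages.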
Archival Information Packages (AIPs)
As of DSpace 1.7, DSpace can back up and restore all of its contents as a set of AIP files. This includes all Communities, Collections, Items, Groups and People in the system.
This feature came out of a requirement for DSpace to better integrate with DuraCloud (http://www.duracloud.org), and other backup storage systems. One of these requirements is to be able to essentially "backup" local DSpace contents into the cloud (as a type of offsite backup), and "restore" those contents at a later time.
Essentially, this means DSpace can export the entire hierarchy (i.e. bitstreams, metadata and relationships between Communities/Collections/Items) into a relatively standard format (a METS-based, AIP format). This entire hierarchy can also be re-imported into DSpace in the same format (essentially a restore of that content in the same or different DSpace installation).
For more information, see the section on AIP backup & Restore for DSpace.
METS packages
Since the DSpace 1.4 release, the software has included a package disseminator and matching ingester for the DSpace METS SIP (Submission Information Package) format. They were created to help end users prepare sets of digital resources and metadata for submission to the archive using well-defined standards such as METS, MODS, and PREMIS. The plugin name is METS by default, and it uses MODS for descriptive metadata.
The DSpace METS SIP profile is available at: DSpaceMETSSIPProfile