This documentation relates to an old version of DSpace, version 6.x.
Support for DSpace 6 ended on July 1, 2023. See Support for DSpace 5 and 6 is ending in 2023
Registering is not Importing
The procedures below will not import the actual bitstreams into DSpace. They will merely inform DSpace of an existing location where these Bitstreams can be found. Please refer to Importing and Exporting Items via Simple Archive Format for information on importing metadata and bitstreams.
Registration is an alternate means of incorporating items, their metadata, and their bitstreams into DSpace by taking advantage of the bitstreams already being in storage accessible to DSpace. An example might be that there is a repository for existing digital assets. Rather than using the normal interactive ingest process or the batch import to furnish DSpace the metadata and to upload bitstreams, registration provides DSpace the metadata and the location of the bitstreams. DSpace uses a variation of the import tool to accomplish registration.
To register an item, its bitstreams must reside on storage accessible to DSpace and must therefore be referenced by an asset store number in dspace.cfg. The configuration file dspace.cfg establishes one or more asset stores through the use of an integer asset store number. This number relates to a directory in the DSpace host's file system or to a set of SRB account parameters. The asset store number is described in The dspace.cfg Configuration Properties File section and in the dspace.cfg file itself. The asset store number(s) used for registered items should generally not be the value of the assetstore.incoming property, since it is unlikely that you will want to mix the bitstreams of normally ingested or imported items with those of registered items.
DSpace uses the same import tool that is used for batch import except that several variations are employed to support registration. The discussion that follows assumes familiarity with the import tool.
The DSpace Simple Archive Format for registration does not include the actual content files (bitstreams) being registered. The format is however a directory full of items to be registered, with a subdirectory per item. Each item directory contains a file for the item's descriptive metadata (dublin_core.xml) and a file listing the item's content files (contents), but not the actual content files themselves.
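The layout described above can be sketched with a small script. This is only an illustration: the helper name `write_item_dir`, the item directory name, and the example contents line are hypothetical, and the `dcvalue` markup is assumed to match the regular import format.

```python
import tempfile
from pathlib import Path
from xml.sax.saxutils import escape


def write_item_dir(archive_root, item_name, title, contents_lines):
    """Create one item subdirectory holding only dublin_core.xml and contents."""
    item_dir = Path(archive_root) / item_name
    item_dir.mkdir(parents=True, exist_ok=True)

    # Minimal descriptive metadata (assumed to follow the regular import layout).
    dublin_core = (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        "<dublin_core>\n"
        f'  <dcvalue element="title" qualifier="none">{escape(title)}</dcvalue>\n'
        "</dublin_core>\n"
    )
    (item_dir / "dublin_core.xml").write_text(dublin_core, encoding="utf-8")

    # The contents file only lists the files to register; no bitstreams are copied.
    (item_dir / "contents").write_text("\n".join(contents_lines) + "\n", encoding="utf-8")
    return item_dir


archive = Path(tempfile.mkdtemp())
item = write_item_dir(archive, "item_000", "Example title",
                      ["-r -s 1 -f example/report.pdf"])
print(sorted(p.name for p in item.iterdir()))  # ['contents', 'dublin_core.xml']
```

Note that the item directory ends up with exactly two files; the content file named in `contents` stays wherever it already lives in the asset store.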
The dublin_core.xml file for item registration is exactly the same as for regular item import.
The contents file, like that for regular item import, lists the item's content files, one content file per line, but each line is composed of the following elements:
-r indicates this is a file to be registered
-s n indicates the asset store number (n)
-f filepath indicates the path and name of the content file to be registered (filepath)
\t is a tab character
bundle:bundlename is an optional bundle name
permissions: -[r|w] 'group name' is an optional read or write permission that can be attached to the bitstream
description: some text is an optional description field to add to the file
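As an illustration of how these elements combine, the sketch below assembles one contents line. It assumes the elements appear in the order listed above, with the optional parts appended after a tab character; the function name `registration_line` and the sample values are hypothetical.

```python
def registration_line(store_number, filepath, bundle=None, permission=None,
                      description=None):
    """Assemble one contents line for a file that is to be registered."""
    # Mandatory part: register flag, asset store number, file path.
    parts = [f"-r -s {store_number} -f {filepath}"]
    if bundle:
        parts.append(f"bundle:{bundle}")
    if permission:
        mode, group = permission  # mode is "r" (read) or "w" (write)
        if mode not in ("r", "w"):
            raise ValueError("permission mode must be 'r' or 'w'")
        parts.append(f"permissions: -{mode} '{group}'")
    if description:
        parts.append(f"description: {description}")
    # Optional parts are separated from each other by a tab character.
    return "\t".join(parts)


line = registration_line(1, "reports/2010/annual.pdf", bundle="ORIGINAL",
                         permission=("r", "Staff"), description="Annual report")
print(repr(line))
```

With only the mandatory arguments, `registration_line(2, "a.txt")` yields `-r -s 2 -f a.txt`.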
The command line for registration is just like the one for regular import, using either the short or the long form of the options described in Importing Items.
The --test flag will function as described in Importing Items.
The --delete flag will function as described in Importing Items, but the registered content files will not be removed from storage. See Deleting Registered Items.
The --replace flag will function as described in Importing Items, but care should be taken to consider the different cases and their implications. With old items and new items each being either registered or ingested normally, there are four combinations to consider. Foremost, an old registered item deleted from DSpace using --replace will not be removed from the storage where it resides. See Deleting Registered Items. A new item added to DSpace using --replace will be ingested normally or will be registered, depending on whether or not it is marked in the contents file with the -r flag.
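Because registration reuses the regular importer, a registration run could be scripted roughly as follows. This sketch assumes the importer's usual short options (-a to add, -e for the submitting e-person, -c for the target collection, -s for the source directory, -m for the map file); the `[dspace]` prefix and all argument values are placeholders, and the command is only assembled, not executed.

```python
def import_command(eperson, collection, source_dir, mapfile,
                   dspace_bin="[dspace]/bin/dspace"):
    """Build the argument list for an 'add' run of the importer."""
    return [
        dspace_bin, "import",
        "-a",                 # add new items
        "-e", eperson,        # e-person performing the import
        "-c", collection,     # target collection
        "-s", source_dir,     # directory holding the item subdirectories
        "-m", mapfile,        # map file written by the importer
    ]


cmd = import_command("joe@user.com", "collectionID", "items_dir", "mapfile")
print(" ".join(cmd))
```

Whether a given file is registered or ingested normally is decided by the contents file, not by the command line, so the same invocation serves both cases.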
Once an item has been registered, superficially it is indistinguishable from items ingested interactively or by batch import. But internally there are some differences:
First, the randomly generated internal ID is not used because DSpace does not control the file path and name of the bitstream. Instead, the file path and name are those specified in the contents file.
Second, the store_number column of the bitstream database row contains the asset store number specified in the contents file.
Third, the internal_id column of the bitstream database row contains a leading flag (-R) followed by the registered file path and name. For example, -Rfilepath, where filepath is the file path and name relative to the asset store corresponding to the asset store number. The asset store could be traditional storage in the DSpace server's file system or an SRB account.
Fourth, an MD5 checksum is calculated by reading the registered file if it is in local storage.
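The third and fourth points can be illustrated with a short sketch. It is not DSpace's own code: the helper names are hypothetical, and a temporary directory stands in for a local asset store.

```python
import hashlib
import tempfile
from pathlib import Path

REGISTERED_FLAG = "-R"  # leading flag stored in internal_id for registered files


def registered_path(internal_id):
    """Return the path relative to the asset store, or None if not registered."""
    if internal_id.startswith(REGISTERED_FLAG):
        return internal_id[len(REGISTERED_FLAG):]
    return None


def md5_of(asset_store_dir, relative_path, chunk_size=65536):
    """Compute the MD5 checksum by reading the registered file in chunks."""
    digest = hashlib.md5()
    with open(Path(asset_store_dir) / relative_path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


store = Path(tempfile.mkdtemp())          # stand-in for a local asset store
(store / "report.txt").write_bytes(b"hello")

relative = registered_path("-Rreport.txt")
print(relative, md5_of(store, relative))
```

An internal ID without the leading flag, such as a generated numeric ID, makes `registered_path` return `None`.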
Registered items and their bitstreams can be retrieved transparently just like normally ingested items.
Registered items may be exported as described in Exporting Items. If so, the export directory will contain actual copies of the files being exported, but the lines in the contents file will flag the files as registered. This means that if DSpace items are "round tripped" (see Transferring Items Between DSpace Instances) using the exporter and importer, the registered files in the export directory will again be registered in DSpace instead of being uploaded and ingested normally.
If a registered item is deleted from DSpace (either interactively or by using the --delete or --replace flags described in Importing and Exporting Items via Simple Archive Format), the item will disappear from DSpace but its registered content files will remain in place just as they were prior to registration. Bitstreams not registered but added by DSpace as part of registration, such as license.txt files, will be deleted.