Item Importer and Exporter
DSpace has a set of command line tools for importing and exporting items in batches, using the DSpace Simple Archive Format. The tools are not terribly robust, but are useful and are easily modified. They also give a good demonstration of how to implement your own item importer if desired.
...
The basic concept behind the DSpace Simple Archive Format is to create an archive, which is a directory full of items, with a subdirectory per item. Each item directory contains a file for the item's descriptive metadata, and the files that make up the item.
Code Block |
---|
archive_directory/
    item_000/
        dublin_core.xml         -- qualified Dublin Core metadata for metadata fields belonging to the dc schema
        metadata_[prefix].xml   -- metadata in another schema, the prefix is the name of the schema as registered with the metadata registry
        contents                -- text file containing one line per filename
        file_1.doc              -- files to be added as bitstreams to the item
        file_2.pdf
    item_001/
        dublin_core.xml
        contents
        file_1.png
        ...
|
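The layout above can be assembled with a short script. The following is a minimal Python sketch, not part of DSpace: the function name `build_item_dir` and its parameters are hypothetical, and it assumes the metadata XML has already been prepared as a string.

```python
import shutil
import tempfile
from pathlib import Path


def build_item_dir(archive_dir, item_name, dublin_core_xml, source_files):
    """Create one item subdirectory inside a Simple Archive Format archive.

    All parameter names are illustrative; only the resulting layout
    (dublin_core.xml, contents, bitstream files) follows the text above.
    """
    item_dir = Path(archive_dir) / item_name
    item_dir.mkdir(parents=True, exist_ok=True)

    # Descriptive metadata for fields in the dc schema.
    (item_dir / "dublin_core.xml").write_text(dublin_core_xml, encoding="utf-8")

    # Copy each bitstream file into the item directory.
    names = []
    for src in source_files:
        src = Path(src)
        shutil.copy(src, item_dir / src.name)
        names.append(src.name)

    # The contents file lists one bitstream file name per line.
    contents_text = "".join(name + "\n" for name in names)
    (item_dir / "contents").write_text(contents_text, encoding="utf-8")
    return item_dir


if __name__ == "__main__":
    # Demonstrate in a throwaway directory so nothing is left behind.
    with tempfile.TemporaryDirectory() as tmp:
        tmp = Path(tmp)
        sample = tmp / "file_1.png"
        sample.write_bytes(b"placeholder")
        xml = (
            "<dublin_core>\n"
            '  <dcvalue element="title" qualifier="none">A Tale of Two Cities</dcvalue>\n'
            "</dublin_core>\n"
        )
        item = build_item_dir(tmp / "archive_directory", "item_000", xml, [sample])
        print(sorted(p.name for p in item.iterdir()))
```

Calling the function once per item, with names such as `item_000` and `item_001`, yields one subdirectory per item under the archive directory.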
...
- <element> - the Dublin Core element
- <qualifier> - the element's qualifier
- <language> - (optional) ISO language code for element

Code Block |
---|
<dublin_core>
    <dcvalue element="title" qualifier="none">A Tale of Two Cities</dcvalue>
    <dcvalue element="date" qualifier="issued">1990</dcvalue>
    <dcvalue element="title" qualifier="alternate" language="fr">J'aime les Printemps</dcvalue>
</dublin_core>
|
(Note the optional language tag attribute which notifies the system that the optional title is in French.)
Every metadata field used must first be registered in the metadata registry of the DSpace instance.
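A metadata file in this format can also be generated rather than written by hand. The sketch below uses Python's standard `xml.etree.ElementTree` module. The helper name `build_dublin_core` and the tuple shape of its input are assumptions made for illustration, not part of DSpace, and the sketch does not check that the fields are registered.

```python
import xml.etree.ElementTree as ET


def build_dublin_core(values):
    """Return a <dublin_core> document as a string.

    values is a list of (element, qualifier, language, text) tuples,
    where language may be None. This input shape is hypothetical.
    """
    root = ET.Element("dublin_core")
    for element, qualifier, language, text in values:
        attrs = {"element": element, "qualifier": qualifier}
        if language:
            # The language attribute is optional, so add it only when given.
            attrs["language"] = language
        dcvalue = ET.SubElement(root, "dcvalue", attrs)
        dcvalue.text = text
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_dublin_core([
        ("title", "none", None, "A Tale of Two Cities"),
        ("date", "issued", None, "1990"),
        ("title", "alternate", "fr", "J'aime les Printemps"),
    ]))
```

Building the document through an XML library rather than string concatenation means special characters in titles are escaped automatically.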
The contents file simply enumerates, one file per line, the bitstream file names. See the following example:
Code Block |
---|
file_1.doc
file_2.pdf
license
|
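Because the contents file is a plain list of names, it is easy to check before an import that every listed file is actually present in the item directory. The following Python sketch is one way to do that. The function name `missing_bitstreams` is hypothetical, and the sketch assumes each line holds only a bare file name, as in the example above.

```python
from pathlib import Path


def missing_bitstreams(item_dir):
    """Return names listed in the contents file that are absent from item_dir.

    Assumes one bare file name per line; blank lines are ignored.
    """
    item_dir = Path(item_dir)
    lines = (item_dir / "contents").read_text(encoding="utf-8").splitlines()
    names = [line.strip() for line in lines if line.strip()]
    return [name for name in names if not (item_dir / name).is_file()]
```

An empty list means every file named in contents was found alongside it.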
...
- Create a separate file for the other schema named metadata_[prefix].xml, where the [prefix] is replaced with the schema's prefix.
- Inside the XML file use the same Dublin Core syntax, but on the <dublin_core> element include the attribute schema=[prefix].

Here is an example for ETD metadata, which would be in the file metadata_etd.xml:

Code Block |
---|
<?xml version="1.0" encoding="UTF-8"?>
<dublin_core schema="etd">
    <dcvalue element="degree" qualifier="department">Computer Science</dcvalue>
    <dcvalue element="degree" qualifier="level">Masters</dcvalue>
    <dcvalue element="degree" qualifier="grantor">Texas A &amp; M</dcvalue>
</dublin_core>
|
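The two steps for a non-dc schema can be sketched in Python as follows. The helper names `metadata_filename` and `build_schema_metadata` are hypothetical and not part of DSpace. Note that a literal `&` in a value, as in the grantor name, must appear as `&amp;` in the file for it to be well-formed XML; `xml.etree.ElementTree` handles that escaping on output.

```python
import xml.etree.ElementTree as ET


def metadata_filename(prefix):
    """File name for a non-dc schema, following the metadata_[prefix].xml pattern."""
    return f"metadata_{prefix}.xml"


def build_schema_metadata(prefix, values):
    """Return metadata XML for the schema identified by prefix.

    values is a list of (element, qualifier, text) tuples, a shape
    assumed for this sketch.
    """
    # Same syntax as dublin_core.xml, plus the schema attribute on the root.
    root = ET.Element("dublin_core", {"schema": prefix})
    for element, qualifier, text in values:
        attrs = {"element": element, "qualifier": qualifier}
        dcvalue = ET.SubElement(root, "dcvalue", attrs)
        dcvalue.text = text  # ElementTree escapes &, < and > when serializing
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    print(metadata_filename("etd"))
    print(build_schema_metadata("etd", [
        ("degree", "department", "Computer Science"),
        ("degree", "level", "Masters"),
        ("degree", "grantor", "Texas A & M"),
    ]))
```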
...
Using the -m argument will export the item/collection and also perform the migration step. It performs the same process as the next section, Transferring Items Between DSpace Instances (or Copying Content Between Repositories). We recommend reading the next section before using this flag.