Comment: changed link to repo to github, updated documentation including instructions on how to compile and run

The repository home for this project is: https://github.com/peterdietz/SAFBuilder

The input for a command-line batch ingest of materials to DSpace is well documented and is called "Simple Archive Format"; however, there needs to be a tool that easily facilitates creating a Simple Archive Format package. The use case satisfied by the Simple Archive Format Packager is that someone has a spreadsheet filled with metadata, as well as content files, that are eventually destined for repository ingest.

Thus the input to the Simple Archive Format Packager is a spreadsheet (.csv) that has the following columns:

  • filename for the bitstream/file

...

  • metadata with namespace.element.(qualifier). Examples would be: dc.description or dc.contributor.author

...
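As a rough illustration of the columns described above, the sketch below writes a small spreadsheet next to two placeholder content files. Everything in it is hypothetical: the header name `filename`, the file names, the metadata values, and the file name `metadata.csv` are assumptions made for the example, not taken from the tool's README.

```shell
# Hypothetical working directory holding the spreadsheet and the content files.
workdir="$(mktemp -d)"

# Placeholder content files that the spreadsheet rows refer to.
printf 'placeholder\n' > "$workdir/report1.pdf"
printf 'placeholder\n' > "$workdir/report2.pdf"

# One row per item: a filename column plus namespace.element(.qualifier) columns.
# The header names and values are assumptions for illustration only.
cat > "$workdir/metadata.csv" <<'EOF'
filename,dc.title,dc.contributor.author,dc.description
report1.pdf,First sample report,"Doe, Jane",A placeholder description
report2.pdf,Second sample report,"Doe, John",Another placeholder description
EOF

# Show the header row so the column names can be checked.
head -n 1 "$workdir/metadata.csv"
```

Values that contain a comma, such as an author given as "Doe, Jane", are quoted so that they stay in a single column.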

To get started with the code in the sandbox repo

  1. Check out the directory from the svn repository.
  2. In an IDE (tested in NetBeans), create a Java Application with Existing Sources.
  3. Add the source directory 'src'.
  4. Download and add the third-party libraries (.jars) as mentioned in the README.

You will then need to edit BatchProcess.java so that inputDir and metaFile match the path to the sample_data directory, or to whatever collection data you want to process.

In the source tree there is a sample_data directory to help kick start testing and development of this tool.

The expected output of this tool is going to be something that satisfies the specification laid out by "Simple Archive Format".
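For orientation, a single item in a Simple Archive Format package is a directory holding a dublin_core.xml file, a contents file that lists the bitstreams, and the bitstreams themselves. The sketch below builds one such item by hand. The directory names, file names, and metadata values are illustrative assumptions, and the names this tool actually generates may differ.

```shell
# Hypothetical output location; directory and file names are illustrative only.
saf_dir="$(mktemp -d)/SimpleArchiveFormat"
item_dir="$saf_dir/item_000"
mkdir -p "$item_dir"

# The bitstream itself (placeholder content).
printf 'placeholder\n' > "$item_dir/report1.pdf"

# contents: one bitstream filename per line.
printf 'report1.pdf\n' > "$item_dir/contents"

# dublin_core.xml: one dcvalue element per metadata value.
cat > "$item_dir/dublin_core.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<dublin_core>
  <dcvalue element="title" qualifier="none">First sample report</dcvalue>
  <dcvalue element="contributor" qualifier="author">Doe, Jane</dcvalue>
</dublin_core>
EOF

# List the files that make up the item.
ls "$item_dir"
```

A spreadsheet column such as dc.contributor.author corresponds to element "contributor" with qualifier "author"; a column without a qualifier, such as dc.title, uses qualifier "none".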

...

Java Compiling and Running Instructions

The commands below check out the code from Git, download the external Java libraries the tool depends on, compile the source code, and run it.

git clone https://github.com/peterdietz/SAFBuilder.git
cd SAFBuilder
wget http://mirrors.ibiblio.org/pub/mirrors/maven2/net/sourceforge/javacsv/javacsv/2.0/javacsv-2.0.jar
wget http://mirrors.ibiblio.org/pub/mirrors/maven2/xmlwriter/xmlwriter/2.2/xmlwriter-2.2.jar
wget http://mirrors.ibiblio.org/pub/mirrors/maven2/commons-io/commons-io/1.4/commons-io-1.4.jar
mkdir classes
javac -classpath javacsv-2.0.jar:commons-io-1.4.jar:xmlwriter-2.2.jar src/edu/osu/kb/batch/*.java -d classes
java -cp classes edu.osu.kb.batch.BatchProcess

The final command prints the arguments needed to invoke the program.

USAGE: BatchProcess /path/to/directory metadatafilename.csv
Hint -- directory: Use absolute path and no trailing slashes
Hint -- metadatafilename: needs to be in the directory, as do the content files
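The two hints can be checked before the tool is invoked. The helper below is a minimal sketch, not part of SAFBuilder: the function name `check_saf_input` is hypothetical. It strips one trailing slash, requires an absolute path, and confirms that the metadata file sits inside the directory.

```shell
# Hypothetical helper (not part of SAFBuilder): validate the two arguments.
check_saf_input() {
  saf_in_dir="${1%/}"   # drop a single trailing slash, if present
  saf_in_csv="$2"

  # The directory must be given as an absolute path.
  case "$saf_in_dir" in
    /*) ;;
    *) echo "directory must be an absolute path: $saf_in_dir" >&2; return 1 ;;
  esac

  # The metadata file must live inside that directory.
  if [ ! -f "$saf_in_dir/$saf_in_csv" ]; then
    echo "metadata file not found: $saf_in_dir/$saf_in_csv" >&2
    return 1
  fi

  # Print the cleaned-up arguments in the order BatchProcess expects them.
  printf '%s %s\n' "$saf_in_dir" "$saf_in_csv"
}
```

A call such as `check_saf_input /path/to/directory/ metadatafilename.csv` would print the directory without its trailing slash followed by the file name, or return a non-zero status with a message on standard error.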

There is sample data included with the tool to give an idea of how to use this.

To run the tool over the sample data:

java -cp classes:javacsv-2.0.jar:commons-io-1.4.jar:xmlwriter-2.2.jar edu.osu.kb.batch.BatchProcess /home/peter/NetBeansProjects/SAFBuilder/src/edu/osu/kb/sample_data AAA_batch-metadata.csv

This creates a SimpleArchiveFormat directory inside the specified directory, containing the subdirectories, content files, and metadata files, ready to import into DSpace.
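The generated directory can then be handed to DSpace's item import launcher. The sketch below only assembles and prints the command so that it can be reviewed first. Every value in it is a placeholder assumption (DSpace install path, submitter e-mail, collection handle, source directory, and mapfile location), so substitute the values for your own installation.

```shell
# All values below are placeholders, not taken from the original page.
DSPACE_HOME="/path/to/dspace"
EPERSON="admin@example.org"          # e-mail of the submitting DSpace account
COLLECTION="123456789/2"             # handle of the target collection
SOURCE_DIR="/path/to/directory/SimpleArchiveFormat"
MAPFILE="/tmp/saf-import.map"        # DSpace records the imported items here

# Assemble the item import command (add mode).
import_cmd="$DSPACE_HOME/bin/dspace import --add --eperson=$EPERSON --collection=$COLLECTION --source=$SOURCE_DIR --mapfile=$MAPFILE"

# Print the command for review instead of running it.
echo "$import_cmd"
```

Printing first is a deliberate choice: an import creates items in the repository, so it is worth confirming the collection and source directory before running the command.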

Further Work

This packager works as a stand-alone tool and requires knowledge of Java to run. It satisfies the initial need: packaging many items so that they can be batch loaded into DSpace using DSpace's launcher item-import. The remaining goal of this project is to streamline the process of batch loading materials into DSpace.

Possibilities include:

  • refactoring so that it can become a Packager Plugin. Packager plugins allow you to implement a way for DSpace to accept an input package (containing content files, manifest, and metadata) that then creates DSpace items.
  • creating a client GUI for the desktop.