Another area of interest identified in the DSpace 1.6 survey was embargo. Several efforts have already been made against recent versions of the codebase, so it is not unreasonable to expect that one of them, or a combination, could be made ready for 1.6. Please use this page to help us understand the range of desired functionality; if you have done work that has not been advertised to the community, bring it to our attention. Record your comments below or, if you are more comfortable using email, send them to Richard Rodgers (rrodgers@mit.edu), and they will be added to the page. Since we hope to release 1.6 in the not too distant future, the sooner you respond, the more likely your input will shape the released feature set.
A few examples in the requirements space:
We have an additional complication that only some Items are eligible for embargo. Our
Items have a license keyword value chosen from a small controlled vocabulary, and only
some license types can be embargoed (more about this later..). However, I'm content to just
implement that restriction in the submission UI and hope that Administrator users are careful making changes.
Other requirements:
So the UI operations we foresee needing are:
I'm in favor of implementing embargoes using the existing data model, i.e. Item metadata fields
and access control policies. Most of the requirements are easily met this way. For example:
site.embargo.until |
Using a metadata field to record the embargo has many benefits:
There are some drawbacks of this design:
The logic to set Bitstream access controls might need to be customized at each site, so it could be a good idea to make that a plugin so it's easy to change. It would have to check the ResourcePolicies on all Bitstreams and change them if necessary to synchronize with the embargo status.
The Dryad project was recently mentioned on the lists. It includes (among other things) a fine prototype implementation of a similar embargo design. There is even a detailed writeup by @mire. They also take the approach of storing the date in a metadata field, and a cron daemon does all the policy updates. It's well worth a good look.
LarryStone 02:00, 29 May 2009 (EDT)
In our implementation, an embargoed item returns empty Bundle[]s, and does not allow any bitstreams to be created on the Item (by throwing AuthorizeException in response to the bitstream creation methods).
It should be noted that in our implementation, the policy (represented by DspaceEmbargoConfig) was used at the UI layer to drive workflow. At the API layer, the developer can do whatever they want (e.g. set an embargo on an item that resides in a collection that has embargoes disabled). It is up to the developer to consult the policy and do the right thing.
We define discovery to be visibility of Item metadata in browse, search, and syndication (rss/atom and OAI-PMH). In our original design (see DspaceEmbargoConfig), we had individual discovery flags for these. The reality is that we never had a need for that level of granularity, so we added a convenience method isDiscoverable() which represents the abstract concept.
In our implementation, we record audit information in dc.provenance metadata fields. Anyone can view these audit trails.
Our implementation is covered here, including links to the source code and Maven website.
We took a lot of liberties with the DSpace codebase. The biggest change was to modify HandleManager to consult the embargo service and return instances of EmbargoedItem if the Item was embargoed. There were a lot of other bits touched in the browse, search, oai, and XMLUI BitstreamReader.
I agree with Larry and other implementations, including @mire's, that using ResourcePolicy objects to represent embargoes is a good idea.
We also created an eventing system which is used to manage metadata when an embargo is enacted or lifted. Event producers and listeners are wired up in dspace.cfg.
These are the requirements that were used by @mire to implement the embargo functionality in Dryad:
We came up with an embargo option while implementing our ETD pilot project.
(Starting October 1st 2009, ETDs will be mandatory on campus, meaning that students will need to archive their ETD in our institutional repository to get their diploma.)
Students submit their ETDs in DSpace and select the length of the embargo they want directly in the submission form. Students must have written permission from their department before selecting this submission option.
Personnel in the department can double-check whether a student selected an embargo and make sure they were granted permission to do so.
(This is standard protocol for ETD submission, since we use the 3-step validation process in DSpace and submissions are verified by the department administrative staff at the 2nd step of validation.)
We added a dc.date.available field in input-forms.xml to store the embargo data. We use a dropdown menu to present 5 options to users. The dropdown menu defaults to "Publish immediately", with the <stored-value> being "NO_RESTRICTION". Students can then choose among 4 embargo lengths: 6 months, 1 year, 2 years and 5 years. Each embargo period is encoded in the <stored-value> as "MONTHS_WITHHELD:" followed by the number of months ("MONTHS_WITHHELD:6" = 6 months and "MONTHS_WITHHELD:60" = 5 years).
This means that once an ETD item is published in DSpace, it contains 2 dc.date.available fields: 1 with the entry date in the repository and 1 with the embargo length.
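The stored values described above could be decoded along the following lines. This is only a minimal sketch, not the code used in our repository: the class EmbargoTerms and its methods are hypothetical names, and it assumes that the embargo period is counted from the date the item entered the repository.

```java
import java.time.LocalDate;
import java.util.Optional;

// Hypothetical helper, not part of DSpace: decodes the <stored-value>
// strings described above ("NO_RESTRICTION", "MONTHS_WITHHELD:6", ...).
final class EmbargoTerms {
    static final String NO_RESTRICTION = "NO_RESTRICTION";
    static final String PREFIX = "MONTHS_WITHHELD:";

    private EmbargoTerms() {}

    /**
     * Returns the date on which the embargo ends, or empty if the item
     * has no embargo. Assumption: the period starts at the deposit date.
     */
    static Optional<LocalDate> embargoEnd(String storedValue, LocalDate depositDate) {
        if (storedValue == null || storedValue.equals(NO_RESTRICTION)) {
            return Optional.empty();
        }
        if (!storedValue.startsWith(PREFIX)) {
            throw new IllegalArgumentException("Unrecognized embargo value: " + storedValue);
        }
        // parseInt rejects non-numeric input with a NumberFormatException
        int months = Integer.parseInt(storedValue.substring(PREFIX.length()));
        if (months <= 0) {
            throw new IllegalArgumentException("Embargo length must be positive: " + storedValue);
        }
        return Optional.of(depositDate.plusMonths(months));
    }

    /** True while 'today' is strictly before the computed end date. */
    static boolean isUnderEmbargo(String storedValue, LocalDate depositDate, LocalDate today) {
        return embargoEnd(storedValue, depositDate).map(today::isBefore).orElse(false);
    }
}
```

With this sketch, an item deposited on 1 October 2009 with "MONTHS_WITHHELD:6" would be treated as embargoed until 1 April 2010.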
We then modified 4 different classes: AuthorizeManager.java / Item.java / ItemTag.java / BitstreamServlet.java
AuthorizeManager.java |
checkEmbargo |
dateEmbargoEnd |
Item.java |
isThisitemInEmargo |
checkEmbargo |
AuthorizeManager.java |
ItemTag.java |
isThisitemInEmargo |
Item.java |
You can check an example right here: http://hdl.handle.net/1866/5085
BitstreamServlet.java |
AuthorizeManager.java |
I must specify that credit for this work should be granted to my 2 colleagues and that I'm only explaining this after they implemented it last fall. I wanted to share this information to help in the creation of the best embargo option for DSpace.
Before any implementation work is undertaken, it is useful to have a fairly detailed description of the functionality and design of a proposed solution. Further input is welcome, and can now also address the narrower topic of how well this design addresses the expressed needs.
The goal is not to be a union of all desired functionality; rather, it is both to address directly the core requirements common to all implementations, and to make extending the solution for custom purposes easy and conformant to the DSpace architecture and APIs.
ResourcePolicy |
Item |
The lifecycle of an embargo may be roughly divided into the following phases:
For an embargo to occur, the item must be determined to be eligible and a specific policy (terms) attached to it. This may occur in a variety of contexts, e.g. the submission UI, metadata passed through a SWORD deposit manifest, or an ItemImport batch. The terms themselves may also vary, from a selection in a controlled vocabulary ("3 months", "6 months", etc.) to a specific date. Given this multiplicity, the design places no restrictions on how embargo eligibility is determined, nor on how policy is expressed ('encoded') in a metadata field, except that a single configurable field be used for this purpose (the name of the field can be placed in 'dspace.cfg', for example).
This process occurs only when an eligible Item is installed into the repository (technically, an inArchive flag is set).
The process consists of two steps:
1. Interpretation (decoding) of policy terms. This means taking the 'encoded' representation of the terms and - together with any other applicable factors like which collection the Item is being installed into - assigning a specific date upon which the embargo will be lifted (the 'lift date').
2. Assigning a set of Resource Policies to the Item's bitstreams consistent with the policy. Typically, only administrative access is allowed, but an exception may be made for the submitter, for example.
Since the design does not dictate the term encoding rules, it must allow a user-configurable means to decode them. Thus we declare an interface, call it SetEmbargo, whose configured implementation is invoked whenever an Item is installed. SetEmbargo implementations are responsible for performing both the interpretation (decode) of terms and assignment of Bitstream resource policies. A reference implementation will be provided that accommodates the simple interpretation cases (exact date, controlled list), but implementers are free to extend, alter or rewrite the functionality entirely.
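To make the decoding step concrete, here is a rough sketch of what such an interface and a simple implementation might look like. It is an illustration only: the method name decodeTerms, the class SimpleSetEmbargo, and the use of plain strings and dates instead of DSpace Item objects are all assumptions, and the assignment of Bitstream resource policies is deliberately left out.

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical shape of the interface; only the "decode" half is sketched.
// A real implementation would also assign Bitstream resource policies.
interface SetEmbargo {
    /** Turns encoded terms into a concrete lift date. */
    LocalDate decodeTerms(String terms, LocalDate installDate);
}

// Handles the two simple cases named above: a controlled list and an exact date.
final class SimpleSetEmbargo implements SetEmbargo {
    // Controlled vocabulary mapped to a number of months (values are examples).
    private final Map<String, Integer> monthsByTerm = new LinkedHashMap<>();

    SimpleSetEmbargo() {
        monthsByTerm.put("3 months", 3);
        monthsByTerm.put("6 months", 6);
    }

    @Override
    public LocalDate decodeTerms(String terms, LocalDate installDate) {
        String trimmed = terms.trim();
        Integer months = monthsByTerm.get(trimmed);
        if (months != null) {
            // Controlled-list case: count from the install date (an assumption).
            return installDate.plusMonths(months);
        }
        try {
            // Exact-date case: assumes ISO-8601 format, e.g. "2010-01-15".
            return LocalDate.parse(trimmed);
        } catch (DateTimeParseException e) {
            throw new IllegalArgumentException("Cannot interpret embargo terms: " + terms, e);
        }
    }
}
```

A site with different encoding rules would supply its own implementation of the same interface rather than change the caller.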
After an embargo is in effect, its terms may be adjusted. This is considered to be an administrative action, and is thus available only to
actors with administrative access to the collection and Item (the DSpace administrator and the collection administrator). It is assumed that since
the terms have been 'decoded' into a standard formatted lift date, the change will be to another lift date (not encoded terms), and will obey the restriction
that the revised date be no earlier than the earliest of: (a) the time of revision, (b) the current lift date.
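The restriction on revised lift dates can be written as a small check. This is a sketch under the reading that the revised date must not precede whichever of the two reference dates comes first; the class and method names are hypothetical, and dates are compared at day granularity.

```java
import java.time.LocalDate;

// Hypothetical helper illustrating the revision rule described above.
final class LiftDateRevision {
    private LiftDateRevision() {}

    /**
     * A revised lift date is allowed if it is no earlier than the earlier of
     * (a) the day of the revision and (b) the current lift date.
     */
    static boolean isAllowed(LocalDate revised, LocalDate revisionDay, LocalDate currentLift) {
        LocalDate floor = revisionDay.isBefore(currentLift) ? revisionDay : currentLift;
        return !revised.isBefore(floor);
    }
}
```

Under this reading, an administrator may shorten an embargo to any date from today onward, but may not set a lift date in the past.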
Another interface, LiftEmbargo, will manage the lifting of the embargo. This will typically involve instating the set of ResourcePolicies that non-embargoed Items obtain when installed, but again implementers will be free to customize its logic. As with SetEmbargo, a reference implementation will be provided, together with a script suitable for cron-scheduled invocation.
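The periodic lifting pass might be organized roughly as follows. This is a sketch, not the proposed reference implementation: the lift method signature, the EmbargoSweep class, and the in-memory map of item handles to lift dates are stand-ins for whatever lookup the real script would perform, and it assumes an embargo is lifted on the lift date itself.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical shape of the interface: restores normal access for one item.
interface LiftEmbargo {
    void lift(String itemHandle);
}

// Sketch of the scheduled pass: find expired embargoes and lift each one.
final class EmbargoSweep {
    private EmbargoSweep() {}

    /**
     * Calls the lifter for every item whose lift date is today or earlier
     * and returns the handles that were processed.
     */
    static List<String> liftExpired(Map<String, LocalDate> liftDates,
                                    LocalDate today,
                                    LiftEmbargo lifter) {
        List<String> lifted = new ArrayList<>();
        for (Map.Entry<String, LocalDate> entry : liftDates.entrySet()) {
            if (!entry.getValue().isAfter(today)) {
                lifter.lift(entry.getKey());
                lifted.add(entry.getKey());
            }
        }
        return lifted;
    }
}
```

Keeping the date comparison in the sweep and the policy changes behind the interface means a site can replace the lifting logic without touching the scheduling.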
Here is the first version of a prototype implementation, as patches against the 1.5.2 source:
In addition to these files, you will need the following patches from JIRA:
These commands are intended for a Unix-like environment such as MacOS X, Linux, Solaris.
Given an existing checkout of the DSpace 1.5.2 source,
dspace-api |
Note that the files in the dash-api subdirectory are provided for reference only, as an example of a plugin implementation. They are not included in the build.
Build and deploy DSpace as usual.
Add these lines to your dspace.cfg configuration file:
    ## ------- Embargo configuration
    # DC metadata field to hold the user-supplied embargo terms
    embargo.field.terms = SCHEMA.ELEMENT.QUALIFIER
    # DC metadata field to hold computed "lift date" of embargo
    embargo.field.lift = SCHEMA.ELEMENT.QUALIFIER
    # string in terms field to indicate indefinite embargo
    embargo.terms.open = forever
    # implementation of embargo setter plugin
    plugin.single.org.dspace.embargo.EmbargoSetter = CLASSNAME
    # implementation of embargo lifter plugin
    ## plugin.single.org.dspace.embargo.EmbargoLifter = org.dspace.embargo.DefaultEm
    plugin.single.org.dspace.embargo.EmbargoLifter = CLASSNAME
In practice, you must substitute a metadata field name for each SCHEMA.ELEMENT.QUALIFIER, and a class properly implementing each plugin for CLASSNAME.
The embargo terms field may be the same as the lift date field – in such a case, when the embargo is set, the value of the terms field gets deleted and replaced by the lift date.
You must run the embargo lifter task periodically to check for Items with expired embargoes and lift them. It is a command-line application invoked with the command:
    dspace/bin/dsrun org.dspace.embargo.EmbargoManager
Run it with the --help option to get a list of options.
We recommend running this once per day.