Comment: Migrated to Confluence 5.3

...

The org.dspace.core package provides some basic classes that are used throughout the DSpace code.

The Configuration Manager

...

The configuration manager is responsible for reading the main dspace.cfg properties file, managing the 'template' configuration files for other applications such as Apache, and for obtaining the text for e-mail messages.

The system is configured by editing the relevant files in [dspace]/config, as described in the configuration section.

When editing configuration files for applications that DSpace uses, such as Apache Tomcat, remember you may want to edit the copy in [dspace-source] and then run ant update or ant overwrite_configs rather than editing the 'live' version directly! This will ensure you have a backup copy of your modified configuration files, so that they are not accidentally overwritten in the future.

The ConfigurationManager class can also be invoked as a command line tool, with two possible uses:

  • [dspace]/bin/install-configs: This processes and installs configuration files for other applications, as described in the configuration section.
  • [dspace]/bin/dspace dsprop property.name: This writes the value of property.name from dspace.cfg to the standard output, so that shell scripts can access the DSpace configuration. For an example, see [dspace]/bin/start-handle-server. If the property has no value, nothing is written.

...

The e-mail texts are stored in [dspace]/config/emails. They are processed by the standard java.text.MessageFormat. At the top of each e-mail are listed the appropriate arguments that should be filled out by the sender. Example usage is shown in the org.dspace.core.Email Javadoc API documentation.
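As a minimal sketch of how java.text.MessageFormat fills in such a template: the template text, the argument order and the EmailTemplateDemo class below are illustrative assumptions, not one of the files shipped in [dspace]/config/emails.

```java
import java.text.MessageFormat;

final class EmailTemplateDemo {
    // Hypothetical template text; real templates live in [dspace]/config/emails.
    static final String TEMPLATE =
        "Dear {0},\n\nYour submission \"{1}\" has been accepted into {2}.";

    /** Replaces {0}, {1}, ... in the template with the corresponding argument. */
    static String fill(String template, Object... args) {
        return MessageFormat.format(template, args);
    }

    public static void main(String[] args) {
        System.out.println(fill(TEMPLATE, "Jane Doe", "My Thesis", "Theses Collection"));
    }
}
```

Note that MessageFormat treats the single quote as a quoting character, so a literal apostrophe in a template has to be written as two single quotes.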

...

The level of logging can be configured on a per-package or per-class basis by editing [dspace]/config/templates/log4j.properties and then executing [dspace]/bin/install-configs. You will need to stop and restart Tomcat for the changes to take effect.

...

| Component | Example |
| Date and time, milliseconds | 2002-11-11 08:11:32,903 |
| Level (FATAL, WARN, INFO or DEBUG) | INFO |
| Java class | org.dspace.app.webui.servlet.DSpaceServlet |
| (separator) | @ |
| User email or anonymous | anonymous |
| (separator) | : |
| Extra log info from context | session_id=BD84E7C194C2CF4BD0EC3A6CAD0142BB |
| (separator) | : |
| Action | view_item |
| (separator) | : |
| Extra info | handle=1721.1/1686 |

The above format allows the logs to be easily parsed and analyzed. The [dspace]/bin/log-reporter script is a simple tool for analyzing logs. Try:

Code Block

...

[dspace]/bin/log-reporter --help

It's a good idea to 'nice' this log reporter to avoid an impact on server performance.
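To illustrate why the log format is easy to parse, here is a minimal sketch that splits one line into its seven components with a regular expression. The exact whitespace between components and the LogLineParser class are assumptions made for this example; the sample line is assembled from the example values in the table above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LogLineParser {
    // timestamp, level, class, '@', then user:context:action:extra
    private static final Pattern LINE = Pattern.compile(
        "^(\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2},\\d{3})\\s+(\\w+)\\s+(\\S+)\\s+@\\s+"
        + "([^:]*):([^:]*):([^:]*):(.*)$");

    /** Returns the seven fields in order, or null if the line does not match. */
    static String[] parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        String[] fields = new String[7];
        for (int i = 0; i < fields.length; i++) {
            fields[i] = m.group(i + 1);
        }
        return fields;
    }

    public static void main(String[] args) {
        String sample = "2002-11-11 08:11:32,903 INFO  "
            + "org.dspace.app.webui.servlet.DSpaceServlet @ "
            + "anonymous:session_id=BD84E7C194C2CF4BD0EC3A6CAD0142BB:view_item:handle=1721.1/1686";
        for (String field : parse(sample)) {
            System.out.println(field);
        }
    }
}
```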

...

  • "Constructing" an object may be misconstrued as the action of creating an object in the DSpace system, for example one might expect something like:
    Code Block
    Context context = new Context();
    Item myItem = new Item(context, id);
    to construct a brand new item in the system, rather than simply instantiating an in-memory instance of an object in the system.
  • find methods may often be called with invalid IDs, and return null in such a case. A constructor would have to throw an exception in this case. A null return value from a static method can in general be dealt with more simply in code.
  • If an instantiation representing the same underlying archival entity already exists, the find method can simply return that same instantiation to avoid multiple copies and any inconsistencies which might result.
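The three reasons above can be sketched with a small, self-contained class. CachedRecord, its in-memory STORAGE map and its static cache are hypothetical stand-ins for illustration only; they are not DSpace classes, and in DSpace itself the lookup goes through a Context.

```java
import java.util.HashMap;
import java.util.Map;

final class CachedRecord {
    // Stand-in for the underlying storage: id -> name. Purely illustrative.
    private static final Map<Integer, String> STORAGE = new HashMap<>();
    // Already-instantiated objects, so each id maps to a single instance.
    private static final Map<Integer, CachedRecord> CACHE = new HashMap<>();

    static {
        STORAGE.put(123, "Example Community");
    }

    private final int id;
    private final String name;

    // Private constructor: callers cannot mistake "new" for creating a stored object.
    private CachedRecord(int id, String name) {
        this.id = id;
        this.name = name;
    }

    /** Returns the instance for an existing id, or null for an invalid id. */
    static CachedRecord find(int id) {
        CachedRecord cached = CACHE.get(id);
        if (cached != null) {
            return cached; // same instantiation is handed out again
        }
        String name = STORAGE.get(id);
        if (name == null) {
            return null; // invalid ID: a null return instead of an exception
        }
        CachedRecord fresh = new CachedRecord(id, name);
        CACHE.put(id, fresh);
        return fresh;
    }

    String getName() {
        return name;
    }

    public static void main(String[] args) {
        System.out.println(find(123).getName());
        System.out.println(find(999) == null);
    }
}
```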

Collection, Bundle and Bitstream do not have create methods; rather, one has to create an object using the relevant method on the container. For example, to create a collection, one must invoke createCollection on the community that the collection is to appear in:

Code Block
Context context = new Context();
Community existingCommunity = Community.find(context, 123);
Collection myNewCollection = existingCommunity.createCollection();

The primary reason for this is for determining authorization. In order to know whether an e-person may create an object, the system must know which container the object is to be added to. It makes no sense to create a collection outside of a community, and the authorization system does not have a policy for that.

Items are first created in the form of an implementation of InProgressSubmission. An InProgressSubmission represents an item under construction; once it is complete, it is installed into the main archive and added to the relevant collection by the InstallItem class. The org.dspace.content package provides an implementation of InProgressSubmission called WorkspaceItem; this is a simple implementation that contains some fields used by the Web submission UI. The org.dspace.workflow package also contains an implementation called WorkflowItem which represents a submission undergoing a workflow process.

...

Primarily, one should note that no change made using a particular org.dspace.core.Context object will actually be made in the underlying storage unless complete or commit is invoked on that Context. If anything should go wrong during an operation, the context should always be aborted by invoking abort, to ensure that no inconsistent state is written to the storage.

Additionally, some changes made to objects only happen in-memory. In these cases, invoking the update method lines up the in-memory changes to occur in storage when the Context is committed or completed. In general, methods that change any metadata field only make the change in-memory; methods that involve relationships with other objects in the system line up the changes to be committed with the context. See individual methods in the API Javadoc.

Some examples to illustrate this are shown below:

...

  • Code Block
    static Object getSinglePlugin(Class intface)
         throws PluginConfigurationError;
    Returns an instance of the singleton (single) plugin implementing the given interface. There must be exactly one single plugin configured for this interface, otherwise the PluginConfigurationError is thrown. Note that this is the only "get plugin" method which throws an exception. It is typically used at initialization time to set up a permanent part of the system so any failure is fatal. See the plugin.single configuration key for configuration details.
  • Code Block
    static Object[] getPluginSequence(Class intface);
    Returns instances of all plugins that implement the interface intface, in an Array. Returns an empty array if there are no matching plugins. The order of the plugins in the array is the same as their class names in the configuration's value field. See the plugin.sequence configuration key for configuration details.
  • Code Block
    static Object getNamedPlugin(Class intface, String name);
    Returns an instance of a plugin that implements the interface intface and is bound to a name matching name. If there is no matching plugin, it returns null. The names are matched by String.equals(). See the plugin.named and plugin.selfnamed configuration keys for configuration details.
  • Code Block
    static void releasePlugin(Object plugin);
    Tells the Plugin Manager to let go of any references to a reusable plugin, to prevent it from being given out again and to allow the object to be garbage-collected. Call this when a plugin instance must be taken out of circulation.
  • Code Block
    static String[] getAllPluginNames(Class intface);
    Returns all of the names under which a named plugin implementing the interface intface can be requested (with getNamedPlugin()). The array is empty if there are no matches. Use this to populate a menu of plugins for interactive selection, or to document what the possible choices are. The names are NOT returned in any predictable order, so you may wish to sort them first. Note: Since a plugin may be bound to more than one name, the list of names this returns does not represent the list of plugins. To get the list of unique implementation classes corresponding to the names, you might have to eliminate duplicates (i.e. create a Set of classes).
  • Code Block
    static void checkConfiguration();
    Validates the keys in the DSpace ConfigurationManager pertaining to the Plugin Manager and reports any errors by logging them. This is intended to be used interactively by a DSpace administrator, to check the configuration file after modifying it. See the section about validating configuration for details.

...

  1. Interface: Classname of the Java interface which defines the plugin, including package name. e.g. org.dspace.app.mediafilter.FormatFilter
  2. Implementation Class: Classname of the implementation class, including package. e.g. org.dspace.app.mediafilter.PDFFilter
  3. Names: (Named plugins only) There are two ways to bind names to plugins: listing them in the value of a plugin.named.interface key, or configuring a class in plugin.selfnamed.interface which extends the SelfNamedPlugin class.
  4. Reusable option: (Optional) This is declared in a plugin.reusable configuration line. Plugins are reusable by default, so you only need to configure the non-reusable ones.

Configuring Singleton (Single) Plugins

This entry configures a Single Plugin for use with getSinglePlugin():

Code Block
plugin.single.interface = classname

For example, this configures the class org.dspace.checker.SimpleDispatcher as the plugin for interface org.dspace.checker.BitstreamDispatcher:

Code Block
plugin.single.org.dspace.checker.BitstreamDispatcher=org.dspace.checker.SimpleDispatcher

Configuring Sequence of Plugins

...

There are two ways of configuring named plugins:

...

  1. Plugins Named in the Configuration: A named plugin which gets its name(s) from the configuration is listed in this kind of entry:
    plugin.named.interface = classname = name [ , name.. ] [ classname = name.. ]
    The syntax of the configuration value is: classname, followed by an equal-sign and then at least one plugin name. Bind more names to the same implementation class by adding them here, separated by commas. Names may include any character other than comma (,) and equal-sign (=). For example, this entry creates one plugin with the names GIF, JPEG, and image/png, and another with the name TeX:
    Code Block
    plugin.named.org.dspace.app.mediafilter.MediaFilter = \
            org.dspace.app.mediafilter.JPEGFilter = GIF, JPEG, image/png \
            org.dspace.app.mediafilter.TeXFilter = TeX
    The next example shows a plugin name with an embedded whitespace character. Since comma (,) is the separator character between plugin names, spaces are legal between the words of a name; leading and trailing spaces are ignored. The PDFFilter plugin is bound to the names "Adobe PDF", "PDF", and "Portable Document Format".
    Code Block
    plugin.named.org.dspace.app.mediafilter.MediaFilter = \
          org.dspace.app.mediafilter.TeXFilter = TeX \
          org.dspace.app.mediafilter.PDFFilter =  Adobe PDF, PDF, Portable Document Format
    NOTE: Since there can only be one key with plugin.named. followed by the interface name in the configuration, all of the plugin implementations must be configured in that entry.
  2. Self-Named Plugins: Since a self-named plugin supplies its own names through a static method call, the configuration only has to include its interface and classname:
    plugin.selfnamed.interface = classname [ , classname.. ]
    The following example first demonstrates how the plugin class, XsltDisseminationCrosswalk, is configured to implement its own names "MODS" and "DublinCore". These come from the keys starting with crosswalk.dissemination.stylesheet. The value is a stylesheet file. The class is then configured as a self-named plugin:
    Code Block
    crosswalk.dissemination.stylesheet.DublinCore = xwalk/TESTDIM-2-DC_copy.xsl
    crosswalk.dissemination.stylesheet.MODS = xwalk/mods.xsl
    
    plugin.selfnamed.crosswalk.org.dspace.content.metadata.DisseminationCrosswalk = \
            org.dspace.content.metadata.MODSDisseminationCrosswalk, \
            org.dspace.content.metadata.XsltDisseminationCrosswalk
    
    NOTE: Since there can only be one key with plugin.selfnamed. followed by the interface name in the configuration, all of the plugin implementations must be configured in that entry. The MODSDisseminationCrosswalk class is only shown to illustrate this point.
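The plugin.named value syntax described above can be illustrated with a small parser. This is only a sketch under stated assumptions, not the Plugin Manager's own code: it assumes a class name is a run of letters, digits, underscores, dots and '$' immediately followed by an equal-sign, and that the value has already had its line continuations joined. NamedPluginValueParser is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class NamedPluginValueParser {
    // Assumption: "classname =" marks the start of each implementation's name list.
    private static final Pattern CLASSNAME_EQ = Pattern.compile("([\\w.$]+)\\s*=");

    /** Maps each implementation class name to the plugin names bound to it. */
    static Map<String, List<String>> parse(String value) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        Matcher m = CLASSNAME_EQ.matcher(value);
        String prevClass = null;
        int prevEnd = 0;
        while (m.find()) {
            if (prevClass != null) {
                // Names of the previous class run up to the start of this class name.
                result.put(prevClass, splitNames(value.substring(prevEnd, m.start())));
            }
            prevClass = m.group(1);
            prevEnd = m.end();
        }
        if (prevClass != null) {
            result.put(prevClass, splitNames(value.substring(prevEnd)));
        }
        return result;
    }

    // Names are comma-separated; leading and trailing spaces are ignored.
    private static List<String> splitNames(String text) {
        List<String> names = new ArrayList<>();
        for (String part : text.split(",")) {
            String name = part.trim();
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        String value = "org.dspace.app.mediafilter.TeXFilter = TeX "
            + "org.dspace.app.mediafilter.PDFFilter =  Adobe PDF, PDF, Portable Document Format";
        System.out.println(parse(value));
    }
}
```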

...

Plugins are assumed to be reusable by default, so you only need to configure the ones which you would prefer not to be reusable. The format is as follows:

Code Block
plugin.reusable.classname = ( true | false )

For example, this marks the PDF plugin from the example above as non-reusable:

Code Block
plugin.reusable.org.dspace.app.mediafilter.PDFFilter = false

Validating the Configuration

...

The WorkflowManager is invoked by events. While an Item is being submitted, it is held by a WorkspaceItem. Calling the start() method in the WorkflowManager converts a WorkspaceItem to a WorkflowItem, and begins processing the WorkflowItem's state. Since all three steps of the workflow are optional, if no steps are defined, then the Item is simply archived.

Workflows are set per Collection, and steps are defined by creating corresponding entries in the List named workflowGroup. If you wish the workflow to have a step 1, use the administration tools for Collections to create a workflow Group with members who you want to be able to view and approve the Item, and the workflowGroup[0] becomes set with the ID of that Group.

If a step is defined in a Collection's workflow, then the WorkflowItem's state is set to that step's pool state (STEP_x_POOL, where x is the corresponding step). In this pooled state the WorkflowItem waits for an EPerson in that group to claim the step's task. The WorkflowManager emails the members of that Group notifying them that there is a task to be performed (the text is defined in config/emails), and when an EPerson goes to their 'My DSpace' page to claim the task, the WorkflowManager is invoked with a claim event, and the WorkflowItem's state advances from STEP_x_POOL to STEP_x. The EPerson can also generate an 'unclaim' event, returning the WorkflowItem to the STEP_x_POOL state.
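The claim and unclaim transitions can be sketched as a tiny state machine. The string state names simply follow the STEP_x_POOL / STEP_x pattern used in the text; WorkflowStepSketch and its Event enum are hypothetical and do not mirror the WorkflowManager's actual constants or method signatures.

```java
final class WorkflowStepSketch {
    enum Event { CLAIM, UNCLAIM }

    private static final String POOL_SUFFIX = "_POOL";

    /** Returns the state that follows the given event, e.g. STEP_1_POOL -> STEP_1 on CLAIM. */
    static String next(String state, Event event) {
        boolean pooled = state.endsWith(POOL_SUFFIX);
        if (event == Event.CLAIM && pooled) {
            // An EPerson in the step's group claims the task.
            return state.substring(0, state.length() - POOL_SUFFIX.length());
        }
        if (event == Event.UNCLAIM && !pooled) {
            // The task goes back to the pool for the group.
            return state + POOL_SUFFIX;
        }
        throw new IllegalStateException("Event " + event + " is not valid in state " + state);
    }

    public static void main(String[] args) {
        String state = "STEP_1_POOL";
        state = next(state, Event.CLAIM);
        System.out.println(state);
        state = next(state, Event.UNCLAIM);
        System.out.println(state);
    }
}
```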

...

The CreateAdministrator class is a simple command-line tool, executed via [dspace]/bin/dspace create-administrator, that creates an administrator e-person with information entered from standard input. This is generally used only once when a DSpace system is initially installed, to create an initial administrator who can then use the Web administration UI to further set up the system. This script does not check for authorization, since it is typically run before there are any e-people to authorize! Since it must be run as a command-line tool on the server machine, generally this shouldn't cause a problem. A possibility is to have the script only operate when there are no e-people in the system already, though in general, someone with access to command-line scripts on your server is probably in a position to do what they want anyway!

...

DSpace keeps track of registered users with the org.dspace.eperson.EPerson class. The class has methods to create and manipulate an EPerson such as get and set methods for first and last names, email, and password. (Actually, there is no getPassword() method; an MD5 hash of the password is stored, and can only be verified with the checkPassword() method.) There are find methods to find an EPerson by email (which is assumed to be unique), or to find all EPeople in the system.
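The idea of storing only a hash and verifying a password attempt against it can be sketched as follows. This is an illustration, not the EPerson implementation: the hexadecimal encoding, the UTF-8 byte conversion and the PasswordHashSketch class are assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class PasswordHashSketch {
    /** Returns the MD5 digest of the password as lower-case hex (encoding is an assumption). */
    static String md5Hex(String password) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(password.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    /** Verifies an attempt by hashing it and comparing with the stored hash. */
    static boolean checkPassword(String storedHash, String attempt) {
        return storedHash.equals(md5Hex(attempt));
    }

    public static void main(String[] args) {
        String stored = md5Hex("secret");
        System.out.println(checkPassword(stored, "secret"));
        System.out.println(checkPassword(stored, "guess"));
    }
}
```

Because only the hash is kept, the original password cannot be read back, which is why a getPassword() method would have nothing to return.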

...

Another kind of Group is also implemented in DSpace: special Groups. The Context object for each session carries around a List of Group IDs that the user is also a member of; currently the MITUser Group ID is added to the list of a user's special groups if certain IP address or certificate criteria are met.

...

Currently most of the read policy checking is done with items; communities and collections are assumed to be openly readable, but items and their bitstreams are checked. Separate policy checks for items and their bitstreams enable policies that allow publicly readable items while restricting parts of their content to certain groups.

...

Where do items get their read policies? From their collection's read policy. There once was a separate item read default policy in each collection, and perhaps there will be again since it appears that administrators are notoriously bad at defining collections' read policies. There is also code in place to enable policies that are timed, that is, have a start and end date. However, the admin tools to enable these sorts of policies have not been written.

...

Note that since the Handle server runs as a separate JVM to the DSpace Web applications, it uses a separate 'Log4J' configuration, since Log4J does not support multiple JVMs using the same daily rolling logs. This alternative configuration is located at [dspace]/config/log4j-handle-plugin.properties. The [dspace]/bin/start-handle-server script passes in the appropriate command line parameters so that the Handle server uses this configuration.

...

Which fields are indexed by DSIndexer? These fields are defined in dspace.cfg in the section "Fields to index for search" as name-value pairs. The name must be unique and of the form search.index.i (where i is an arbitrary positive number). The value on the right side starts with an index name, which must also be unique and can be referenced in the search form (e.g. title, author). Then comes the metadata element which is indexed. '*' is a wildcard which includes all sub-elements. For example:

Code Block
search.index.4 = keyword:dc.subject.*

tells the indexer to create a keyword index containing all dc.subject element values. Since the wildcard ('*') character was used in place of a qualifier, all subject metadata fields will be indexed (e.g. dc.subject.other, dc.subject.lcsh, etc.).

...

The query class DSQuery contains the three flavors of doQuery() methods: one searches the DSpace site, and the other two restrict searches to Collections and Communities. The results from a query are returned as three lists of handles; each list represents a type of result. One list is a list of Items with matches, and the other two are Collections and Communities that match. This separation allows the UI to handle the types of results gracefully without resolving all of the handles first to see what kind of content the handle points to. The DSQuery class also has a main() method for debugging via command-line searches.

...

The browse API maintains indexes of dates, authors, titles and subjects, and allows callers to extract parts of these:

  • Title: Values of the Dublin Core element title (unqualified) are indexed. These are sorted in a case-insensitive fashion, with any leading article removed. For example, "The DSpace System" would appear under 'D' rather than 'T'.
  • Author: Values of the contributor (any qualifier or unqualified) element are indexed. Since contributor values typically are in the form 'last name, first name', a simple case-insensitive alphanumeric sort is used which orders authors in last name order. Note that this is an index of authors, and not items by author. If four items have the same author, that author will appear in the index only once. Hence, the index of authors may be greater or smaller than the index of titles; items often have more than one author, though the same author may have authored several items. The author indexing in the browse API does have limitations:
    • Ideally, a name that appears as an author for more than one item would appear in the author index only once. For example, 'Doe, John' may be the author of tens of items. However, in practice, authors' names often appear in slightly different forms, for example:
      Code Block
      Doe, John
      Doe, John Stewart
      Doe, John S.
      Currently, the above three names would all appear as separate entries in the author index even though they may refer to the same author. In order for an author of several papers to correctly appear once in the index, each item must specify exactly the same form of their name, which doesn't always happen in practice.
    • Another issue is that two authors may have the same name, even within a single institution. If this is the case they may appear as one author in the index. These issues are typically resolved in libraries with authority control records, which keep a 'preferred' form of the author's name, with extra information (such as date of birth/death) in order to distinguish between authors of the same name. Maintaining such records is a huge task with many issues, particularly when metadata is received from faculty directly rather than trained library catalogers.
  • Date of Issue: Items are indexed by date of issue. This may be different from the date that an item appeared in DSpace; many items may have been originally published elsewhere beforehand. The Dublin Core field used is date.issued. The ordering of this index may be reversed so 'earliest first' and 'most recent first' orderings are possible. Note that the index is of items by date, as opposed to an index of dates. If 30 items have the same issue date (say 2002), then those 30 items all appear in the index adjacent to each other, as opposed to a single 2002 entry. Since dates in DSpace Dublin Core are in ISO8601, all in the UTC time zone, a simple alphanumeric sort is sufficient to sort by date, including dealing with varying granularities of date reasonably. For example:
    Code Block
    2001-12-10
    2002
    2002-04
    2002-04-05
    2002-04-09T15:34:12Z
    2002-04-09T19:21:12Z
    2002-04-10
  • Date Accessioned: In order to determine which items most recently appeared, rather than using the date of issue, an item's accession date is used. This is the Dublin Core field date.accessioned. In other aspects this index is identical to the date of issue index.
  • Items by a Particular Author: Another operation the browse API can perform is to extract items by a particular author. That person does not have to be the primary author of an item for that item to be extracted. You can specify a scope, too; that is, you can ask for items by author X in collection Y, for example. This particular flavor of browse is slightly simpler than the others. You cannot presently specify a particular subset of results to be returned. The API call will simply return all of the items by a particular author within a certain scope. Note that the author of the item must exactly match the author passed in to the API; see the explanation about the caveats of the author index browsing to see why this is the case.
  • Subject: Values of the Dublin Core element subject (both unqualified and with any qualifier) are indexed. These are sorted in a case-insensitive fashion.
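The two sorting rules described in the list above (titles with a leading article removed, and ISO 8601 dates sorted as plain strings) can be sketched as follows. BrowseSortSketch is a hypothetical helper, and the set of articles it strips ("the", "a", "an") is an assumption; it is not the browse API's own code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

final class BrowseSortSketch {
    /** Lower-cases a title and drops one leading article (assumed list: the, an, a). */
    static String titleSortKey(String title) {
        String key = title.trim().toLowerCase(Locale.ROOT);
        for (String article : new String[] {"the ", "an ", "a "}) {
            if (key.startsWith(article)) {
                return key.substring(article.length());
            }
        }
        return key;
    }

    /** ISO 8601 dates in one time zone sort correctly as plain strings. */
    static List<String> sortDates(List<String> isoDates) {
        List<String> sorted = new ArrayList<>(isoDates);
        Collections.sort(sorted);
        return sorted;
    }

    public static void main(String[] args) {
        System.out.println(titleSortKey("The DSpace System"));
        System.out.println(sortDates(Arrays.asList(
            "2002-04-10", "2002", "2002-04-09T19:21:12Z", "2001-12-10",
            "2002-04-05", "2002-04", "2002-04-09T15:34:12Z")));
    }
}
```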

Using the API

The API is generally invoked by creating a BrowseScope object, and setting the parameters for which particular part of an index you want to extract. This is then passed to the relevant Browse method call, which returns a BrowseInfo object which contains the results of the operation. The parameters set in the BrowseScope object are:

  • How many entries from the index you want
  • Whether you only want entries from a particular community or collection, or from the whole of DSpace
  • Which part of the index to start from (called the focus of the browse). If you don't specify this, the start of the index is used
  • How many entries to include before the focus entry

To illustrate, here is an example:

  • We want 7 entries in total
  • We want entries from collection x
  • We want the focus to be 'Really'
  • We want 2 entries included before the focus.

The results of invoking Browse.getItemsByTitle with the above parameters might look like this:

Code Block
        Rabble-Rousing Rabbis From Sardinia
        Reality TV: Love It or Hate It?
FOCUS>  The Really Exciting Research Video
        Recreational Housework Addicts: Please Visit My House
        Regional Television Variation Studies
        Revenue Streams
        Ridiculous Example Titles:  I'm Out of Ideas

Note that in the case of title and date browses, Item objects are returned as opposed to actual titles. In these cases, you can specify the 'focus' to be a specific item, or a partial or full literal value. In the case of a literal value, if no entry in the index matches exactly, the closest match is used as the focus. It's quite reasonable to specify a focus of a single letter, for example.

Being able to specify a specific item to start at is particularly important with dates, since many items may have the same issue date. Say 30 items in a collection have the issue date 2002. To be able to page through the index 20 items at a time, you need to be able to specify exactly which item's 2002 is the focus of the browse, otherwise each time you invoked the browse code, the results would start at the first item with the issue date 2002.
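How the four scope parameters select a slice of an index can be sketched on a plain sorted list. This is not the Browse implementation: BrowseWindowSketch is hypothetical, the index is assumed to hold lower-case sort keys already in order, and "closest match" is assumed to mean the first entry that does not sort before the focus value.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

final class BrowseWindowSketch {
    /**
     * Returns up to 'total' entries of a sorted index, positioned so that
     * 'before' entries precede the focus entry where possible.
     */
    static List<String> window(List<String> index, String focus, int total, int before) {
        String key = focus.toLowerCase(Locale.ROOT);
        int focusPos = 0;
        // Find the first entry that does not sort before the focus value.
        while (focusPos < index.size() && index.get(focusPos).compareTo(key) < 0) {
            focusPos++;
        }
        if (focusPos == index.size() && focusPos > 0) {
            focusPos--; // focus sorts after everything: use the last entry
        }
        int start = Math.max(0, focusPos - before);
        int end = Math.min(index.size(), start + total);
        return index.subList(start, end);
    }

    public static void main(String[] args) {
        List<String> index = Arrays.asList(
            "quantum widgets",
            "rabble-rousing rabbis from sardinia",
            "reality tv: love it or hate it?",
            "really exciting research video",
            "recreational housework addicts: please visit my house",
            "regional television variation studies",
            "revenue streams",
            "ridiculous example titles:  i'm out of ideas",
            "sample title");
        // 7 entries in total, focus 'Really', 2 entries before the focus.
        for (String entry : window(index, "Really", 7, 2)) {
            System.out.println(entry);
        }
    }
}
```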

...

If the browse index becomes inconsistent for some reason, the InitializeBrowse class is a command line tool (generally invoked using the [dspace]/bin/dspace index-init command) that causes the indexes to be regenerated from scratch.

...

The checksum checker is used to verify every item within DSpace. Since DSpace calculates and records the checksum of every file submitted to it, the checker can determine whether a file has been changed. The idea is that the earlier you can identify that a file has changed, the more likely you are to be able to record it (assuming it was not a wanted change).

The org.dspace.checker.CheckerCommand class is the class for the checksum checker tool, which calculates a checksum for each bitstream whose ID is in the most_recent_checksum table, and compares it against the last calculated checksum for that bitstream.

...

OpenSearch is a small set of conventions and documents for describing and using 'search engines', meaning any service that returns a set of results for a query. It is nearly ubiquitous, but also nearly invisible, in modern web sites with search capability. If you look at the page source of Wikipedia, Facebook, CNN, etc., you will find buried a link element declaring OpenSearch support. It is very much a lowest-common-denominator abstraction (think Google box), but does provide a means to extend its expressive power. This first implementation for DSpace supports none of these extensions (many of which are of potential value), so it should be regarded as a foundation, not a finished solution. So the short answer is that DSpace appears as a 'search-engine' to OpenSearch-aware software.

...

  • Browser Integration: Many recent browsers (IE7+, FF2+) can detect, or 'autodiscover', links to the document describing the search engine. Thus you can easily add your own or other DSpace instances to the drop-down list of search engines in your browser. This list typically appears in the upper right corner of the browser, with a search box. In Firefox, for example, when you visit a site supporting OpenSearch, the drop-down list widget changes color, and if you open it to show the list of search engines, you are offered an opportunity to add the site to the list. IE works nearly the same way but instead labels the web sites 'search providers'. When you select a DSpace instance as the search engine and enter a search, you are simply sent to the regular search results page of the instance.
  • Flexible, interesting RSS Feeds: Because one of the formats that OpenSearch specifies for its results is RSS (or Atom), you can turn any search query into an RSS feed. So if there are keywords highly discriminative of content in a collection or repository, these can be turned into a URL that a feed reader can subscribe to. Taken to the extreme, one could take any search a user makes, and dynamically compose an RSS feed URL for it in the page of returned results. To see an example, if you have a DSpace with OpenSearch enabled, try:
    Code Block
    http://dspace.mysite.edu/open-search/?query=<your query>
    The default format returned is Atom 1.0, so you should see an Atom document containing your search results.
  • You can extend the syntax with a few other parameters, as follows:

    | Parameter | Values |
    | format | atom, rss, html |
    | scope | <handle>: the search is restricted to the collection or community with the indicated handle |
    | rpp | number indicating the number of results per page (i.e. per request) |
    | start | number of page to start with (if paginating results) |
    | sort_by | number indicating sorting criteria (same as DSpace advanced search values) |

    Multiple parameters may be specified on the query string, using the "&" character as the delimiter, e.g.:
    Code Block
    http://dspace.mysite.edu/open-search/?query=<your query>&format=rss&scope=123456789/1
  • Cheap metasearch: Search aggregators like A9 (Amazon) recognize OpenSearch-compliant providers, which can therefore be added to metasearch sets using their UIs. Then your site can be used to aggregate search results with others.
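As a sketch of composing such a query URL programmatically, the helper below appends URL-encoded parameters to the example base URL used above. OpenSearchUrlSketch is a hypothetical class, the parameter values are illustrative, and the Charset overload of URLEncoder.encode requires Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

final class OpenSearchUrlSketch {
    /** Builds baseUrl?query=...&name=value..., URL-encoding every name and value. */
    static String buildUrl(String baseUrl, String query, Map<String, String> extraParams) {
        StringBuilder url = new StringBuilder(baseUrl)
            .append("?query=")
            .append(URLEncoder.encode(query, StandardCharsets.UTF_8));
        for (Map.Entry<String, String> p : extraParams.entrySet()) {
            url.append('&')
               .append(URLEncoder.encode(p.getKey(), StandardCharsets.UTF_8))
               .append('=')
               .append(URLEncoder.encode(p.getValue(), StandardCharsets.UTF_8));
        }
        return url.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("format", "rss");
        params.put("scope", "123456789/1"); // example handle from the text above
        System.out.println(buildUrl("http://dspace.mysite.edu/open-search/", "open access", params));
    }
}
```

URLEncoder writes a space as "+" and the "/" in a handle as "%2F".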

Configuration is ... through the ... dspace.cfg file.

...

See OpenSearch Support for more details.

Embargo Support

What is an Embargo?

...

Functionally, the embargo system allows you to attach 'terms' to an item before it is placed into the repository, which express how the embargo should be applied. What do we mean by 'terms' here? They are really any expression that the system is capable of turning into (1) the time the embargo expires, and (2) a concrete set of access restrictions. Some examples:

  • "2020-09-12": an absolute date (i.e. the date embargo will be lifted)
  • "6 months": a time relative to when the item is accessioned
  • "forever": an indefinite, or open-ended embargo
  • "local only until 2015": both a time and an exception (public has no access until 2015, local users OK immediately)
  • "Nature Publishing Group standard": look-up to a policy somewhere (typically 6 months)

These terms are 'interpreted' by the embargo system to yield a specific date on which the embargo can be removed or 'lifted', and a specific set of access policies. Obviously, some terms are easier to interpret than others (the absolute date really requires none at all), and the 'default' embargo logic understands only the most basic terms (the first and third examples above). But as we will see below, the embargo system provides you with the ability to add in your own 'interpreters' to cope with any terms expressions you wish to have. This date that is the result of the interpretation is stored with the item and the embargo system detects when that date has passed, and removes the embargo ("lifts it"), so the item bitstreams become available. Here is a more detailed life-cycle for an embargoed item:

  1. Terms Assignment. The first step in placing an embargo on an item is to attach (assign) 'terms' to it. If these terms are missing, no embargo will be imposed. As we will see below, terms are carried in a configurable DSpace metadata field, so assigning terms just means assigning a value to a metadata field. This can be done in a web submission user interface form, in a SWORD deposit package, a batch import, etc. - anywhere metadata is passed to DSpace. The terms are not immediately acted upon, and may be revised, corrected, removed, etc, up until the next stage of the life-cycle. Thus a submitter could enter one value, and a collection editor replace it, and only the last value will be used. Since metadata fields are multivalued, theoretically there can be multiple terms values, but in the default implementation only one is recognized.
  2. Terms interpretation/imposition. In DSpace terminology, when an item has exited the last of any workflow steps (or if none have been defined for it), it is said to be 'installed' into the repository. At this precise time, the 'interpretation' of the terms occurs, and a computed 'lift date' is assigned, which like the terms is recorded in a configurable metadata field. It is important to understand that this interpretation happens only once (just like the installation) and cannot be revisited later. Thus, although an administrator can assign a new value to the metadata field holding the terms after the item has been installed, this will have no effect on the embargo, whose 'force' now resides entirely in the 'lift date' value. For this reason, you cannot embargo content already in your repository (at least using standard tools). The other action taken at installation time is the actual imposition of the embargo. The default behavior here is simply to remove the read policies on all the bundles and bitstreams except for the "LICENSE" or "METADATA" bundles. See the section on Extending Embargo Functionality for how to alter this behavior. Also note that since these policy changes occur before installation, there is no time during which embargoed content is 'exposed' (accessible by non-administrators). The terms interpretation and imposition together are called 'setting' the embargo, and the component that performs them both is called the embargo 'setter'.
  3. Embargo Period. After an embargoed item has been installed, the policy restrictions remain in effect until removed. This is not an automatic process, however: a 'lifter' must be run periodically to look for items whose 'lift date' is past. Note that this means the effective removal date of an embargo is not the lift date, but the earliest date after the lift date on which the lifter is run. Typically, a nightly cron-scheduled invocation of the lifter is more than adequate, given the granularity of embargo terms. Also note that during the embargo period, all metadata of the item remains visible. This default behavior can be changed. One final point to note is that the 'lift date', although it was computed and assigned during the previous stage, is in the end a regular metadata field. That means that if extraordinary circumstances require an administrator (or a collection editor, or anyone with edit permissions on metadata) to change the lift date, they can do so. Thus, they can 'revise' the lift date without reference to the original terms. This date will be checked the next time the 'lifter' is run. One could immediately lift the embargo by setting the lift date to the current day, or change it to 'forever' to indefinitely postpone lifting.
  4. Embargo Lift. When the lifter discovers an item whose lift date is in the past, it removes (lifts) the embargo. The default behavior of the lifter is to add the resource policies that would have been added had the embargo not been imposed. That is, it replicates the standard DSpace behavior, in which an item inherits its policies from its owning collection. As with all other parts of the embargo system, you may replace or extend the default behavior of the lifter (see section V. below). You may wish, for example, to send an email to an administrator or other interested parties when an embargoed item becomes available.
  5. Post Embargo. After the embargo has been lifted, the item ceases to respond to any of the embargo life-cycle events. The values of the embargo metadata fields are now essentially historical or provenance values. Apart from these additional metadata fields, such items are indistinguishable from items that were never subject to embargo.
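
The interpretation and lifting stages of the life-cycle above can be illustrated with a minimal, self-contained sketch. It is not the DSpace implementation: the class EmbargoSketch, its methods and the FOREVER sentinel are hypothetical names invented for illustration. It assumes that the default logic understands only an absolute ISO date and "forever" (the first and third example terms), and that a lift date equal to the current day counts as liftable, following the remark that setting the lift date to the current day lifts the embargo at the next lifter run.

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;

// Illustrative only: hypothetical names, not part of the DSpace API.
public class EmbargoSketch {

    // Hypothetical sentinel standing in for an open-ended ("forever") embargo.
    static final LocalDate FOREVER = LocalDate.MAX;

    /**
     * Interprets terms into a lift date. Only the two basic forms are
     * understood: an absolute date (yyyy-MM-dd) and "forever".
     * Returns null when there are no terms, meaning no embargo is imposed.
     */
    static LocalDate interpretTerms(String terms) {
        if (terms == null || terms.trim().isEmpty()) {
            return null;
        }
        String t = terms.trim();
        if (t.equalsIgnoreCase("forever")) {
            return FOREVER;
        }
        try {
            return LocalDate.parse(t);
        } catch (DateTimeParseException e) {
            // Terms such as "6 months" would need a custom interpreter.
            throw new IllegalArgumentException("Terms not understood by this sketch: " + t, e);
        }
    }

    /**
     * The check a lifter run would make for one item: the embargo is lifted
     * once the lift date is no longer in the future (assumption: a lift date
     * of today is treated as liftable).
     */
    static boolean shouldLift(LocalDate liftDate, LocalDate today) {
        return liftDate != null && !liftDate.isAfter(today);
    }

    public static void main(String[] args) {
        LocalDate liftDate = interpretTerms("2020-09-12");
        // The day before the lift date the embargo stays in place.
        System.out.println(shouldLift(liftDate, LocalDate.of(2020, 9, 11)));
        // A lifter run some days later is the one that actually lifts it.
        System.out.println(shouldLift(liftDate, LocalDate.of(2020, 9, 15)));
        // An open-ended embargo is never lifted by the date check.
        System.out.println(shouldLift(interpretTerms("forever"), LocalDate.of(2100, 1, 1)));
    }
}
```

Because the lifter only acts when it is run, the second call in main stands for the common case where the embargo is effectively removed a few days after the recorded lift date.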