The history of an item should be available, and it should be possible to cite a particular version of the item.


  1.  Versioning is at the level of data files. 
  2. Data packages and data files have a relationship between their identifiers. Each version of a file/package is represented by a separate DOI. The "base" DOI points to the most recent version of the file/package, but a version number can be added to the base DOI to access previous versions.

  3. When a new version of a data file is deposited, a new identifier will be created.
  4. Adding or changing README files or metadata will not result in creation of a new identifier.
  5. All metadata changes are logged by writing a metadata snapshot to the filesystem. This allows us to retain a record of the changes, even though they are not available as explicit versions (and not viewable in the UI). If all we are doing is adding a new bitstream without changing the existing bitstreams, there is no need to force a version number change.
  6. When a new identifier is created for a data file, a new identifier is automatically created for the corresponding data package.
  7. Each version of the item includes metadata about who created the version and the date/time. (It is essentially a full copy of the item, with modifications.)
  8. Only the most recent version of an item is available via the search interface.
  9. On the item page, there is a link to view previous/subsequent versions.
  10. By examining the metadata, it is possible to determine whether an item is the most recent version, the original version, or an intermediate version.
  11. Previous versions of bitstreams are retained. If something was once retrievable, it is always retrievable.
  12. Creation of a new version is initiated by the author. On the "submissions" page, users should see all of their archived submissions. Each archived submission should have a button to submit a new version of a data file; this button doesn't appear anywhere else.
  13. Expose versioning details in the DSpace API and services (e.g., OAI, BagIt, etc.)
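
Requirements 3, 4, and 6 above can be read as a small decision rule. The following is a minimal sketch under that reading only; the `ChangeKind` and `Target` enums and the `VersioningRules` class are hypothetical names introduced for illustration and are not part of DSpace or Dryad.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical kinds of change a submitter can make (illustrative only). */
enum ChangeKind { DATA_FILE_CONTENT, README, METADATA }

/** Hypothetical objects that may receive a new identifier. */
enum Target { DATA_FILE, DATA_PACKAGE }

final class VersioningRules {
    private VersioningRules() { }

    /**
     * Returns the objects that need a new identifier for a given change.
     * Requirement 3: a new data file version gets a new identifier.
     * Requirement 6: the corresponding data package then gets one as well.
     * Requirement 4: README or metadata changes create no new identifier.
     */
    static Set<Target> newIdentifiersFor(ChangeKind change) {
        if (change == ChangeKind.DATA_FILE_CONTENT) {
            return EnumSet.of(Target.DATA_FILE, Target.DATA_PACKAGE);
        }
        return EnumSet.noneOf(Target.class);
    }
}
```

For example, `VersioningRules.newIdentifiersFor(ChangeKind.README)` yields an empty set, so only the metadata snapshot described in requirement 5 would record such a change.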

Technical Details

Work Areas for Implementation of Version History

  • Database Modifications
  • Data Access Objects and Domain Model definition
  • Enhancing DSpace XMLUI Item Adapter to expose Version Details on Items
  • Item User Interface Changes to support Versioning
    • Details that should be presented in User Interface
    • Actions that should be possible on existing versions (compare, revert, delete, withdraw)
    • Actions that should be possible to generate a new version
    • Actions that should be possible to users (submitter, author, curator)

User Scenarios

General user actions that would generate a new Version of a Dryad Item, and their overall impact on the creation of a new DSpace Item.



New Data Package Version

New Data File Version

Add Data File to Existing Data Package

Add a Data File Item to Dryad and add to existing Data Package



Delete Data File from Package

Remove a Data File from an existing Data Package



Replace Data File in existing Package

Remove an existing Data File (2) and add a new one (1)



Edit Metadata in Data File

Metadata edits do not produce new Items



Edit Metadata in Data Package

Metadata Edits do not produce new Items



Metadata Mapping

General Versioning of DSpace Items

Generic Versioning of DSpace Items will involve the alteration of existing handles for those versions in DSpace.


=Version 1

=Version 2

=Version 3












While past recommendations have been that identifiers assigned to items be opaque, there are two possible benefits to selecting a versioning scheme for the identifiers:

1. The original Item and the version currently being referred to are captured in the identifier.
2. Resolution of the most current version can be attained programmatically without having to navigate the version hierarchy.

The goal behind this versioning approach is to capture the version of the Item while retaining its original version stream id. In the above case the DSpace Item Handle will be added to the Handle Manager and will be resolvable via CNRI.

  • Reference to the most current version of the Item:
  • Reference to a specific version in the version history:
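
To illustrate the two benefits listed above, here is a minimal sketch that splits a versioned identifier into its version stream and its version number. It borrows the DOI form used in the examples further down this page (`doi:10.651/dryad.154` and `doi:10.651/dryad.154.2`); the `VersionedDoi` class and the assumed pattern "base, optionally followed by a dot and a version number" are illustrative assumptions, not an existing DSpace API.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical parser for identifiers shaped like the DOI examples on this page. */
final class VersionedDoi {
    // Assumed shape: "doi:<prefix>/dryad.<number>" optionally followed by ".<version>".
    private static final Pattern SHAPE =
            Pattern.compile("^(doi:[^/]+/dryad\\.\\d+)(?:\\.(\\d+))?$");

    /** Identifier of the version stream; by convention it refers to the latest version. */
    final String base;
    /** Explicit version number, or 0 when the identifier carries none. */
    final int version;

    private VersionedDoi(String base, int version) {
        this.base = base;
        this.version = version;
    }

    static VersionedDoi parse(String doi) {
        Matcher m = SHAPE.matcher(doi);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognised identifier: " + doi);
        }
        int version = (m.group(2) == null) ? 0 : Integer.parseInt(m.group(2));
        return new VersionedDoi(m.group(1), version);
    }

    /** Reference to the most current version: the base without a version suffix. */
    String latest() {
        return base;
    }

    /** Reference to a specific version in the version history. */
    String withVersion(int n) {
        return base + "." + n;
    }
}
```

With this shape, the reference to the most current version is obtained by dropping the suffix, without walking the version history.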

Versioning of the Dryad Data Package and its Identifiers

Applying the general solution to composite Dryad Data Packages with Data Files will employ the same approach, but with DOI identifiers.

Versioning of the Data Package will utilize versioning capabilities already inside the DOI service used to mint DOIs (written by Kevin Clark).


Package Version 1

Package Version 2

Package Version 3












The solution will also apply to individual Data Files when it is designated that the Data File is being replaced in the Data Package rather than being simply removed.


File Version 1

File Version 2

File Version 3












DSpace Data Model and Versioning

Past Architectural Review Group work on the DSpace Data Model and versioning focused on replicating the item contents when a new DSpace Item was created in DSpace. This means that all item metadata and content would be replicated. This was done to reduce the complexity of managing references across individual Items to the content and metadata that would be considered the same across those versions.

In the GSoC project to address versioning, we made an effort to optimize on the above situation and employed the following strategy for the production of a new version of a DSpace Item:

DSpace Item Objects:

DSpace Bundle Objects:

For every new version of an Item, a Bundle will be created, and that Bundle will link to all the preexisting Bitstreams of the original Item. This means that a Bitstream may be associated with more than one Item through its Bundle links. This may give rise to unexpected behavior in some of the DSpaceObject code that retrieves the parent Item from a Bitstream.
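
The sharing described above, and the resulting ambiguity of "the parent Item" of a Bitstream, can be sketched as follows. The `Sketch*` classes are simplified stand-ins written for this page, not the real DSpace `Item`, `Bundle`, and `Bitstream` classes, and `ParentLookup` is a hypothetical helper.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-in for a Bitstream that remembers every Bundle linking to it. */
final class SketchBitstream {
    final String name;
    final List<SketchBundle> bundles = new ArrayList<>();

    SketchBitstream(String name) { this.name = name; }
}

/** Stand-in for a Bundle owned by exactly one Item. */
final class SketchBundle {
    final SketchItem owner;
    final List<SketchBitstream> bitstreams = new ArrayList<>();

    SketchBundle(SketchItem owner) { this.owner = owner; }

    /** Links an existing bitstream into this bundle without copying it. */
    void link(SketchBitstream b) {
        bitstreams.add(b);
        b.bundles.add(this);
    }
}

/** Stand-in for an Item. */
final class SketchItem {
    final String id;
    final List<SketchBundle> bundles = new ArrayList<>();

    SketchItem(String id) { this.id = id; }

    /** New version: one fresh bundle per original bundle, linking the same bitstreams. */
    SketchItem newVersion(String newId) {
        SketchItem next = new SketchItem(newId);
        for (SketchBundle old : bundles) {
            SketchBundle copy = new SketchBundle(next);
            for (SketchBitstream b : old.bitstreams) {
                copy.link(b);
            }
            next.bundles.add(copy);
        }
        return next;
    }
}

final class ParentLookup {
    private ParentLookup() { }

    /** Returns the ids of all Items a bitstream belongs to; more than one after versioning. */
    static List<String> parentItemIds(SketchBitstream b) {
        List<String> ids = new ArrayList<>();
        for (SketchBundle bundle : b.bundles) {
            ids.add(bundle.owner.id);
        }
        return ids;
    }
}
```

Code that assumes a single parent Item per Bitstream would, in this model, have to choose among several owners, which is the kind of unexpected behavior the paragraph above warns about.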

DSpace Bitstream Objects:

Business Model

Versioning Service

Identifier Service

Storage Considerations

  • Should Versions and VersionHistories maybe be stored in separate tables from Items considered part of repository?
  • Version and VersionHistory may not be stored in Metadata, but instead calculated and added as additional metadata in the item?
  • Will previous versions of items be represented separately from the latest version in the Item table? This will be required for methods based on finding all items to be indexed, batch exported, updated by MediaFilter, ...
    • Using a separate column: latestVersion?
    • Using the in_archive column?

Administrative Interface possibilities

Versioning edits to an Item

With a full-featured versioning capability on Items, we may be able to support restoration of Item Versions through an interface such as this WIKI page example interface.

Versioning an individual Bitstream on an Item

Versioning individual Bitstreams can be comparable to versioning attachments to a wiki.

Further Scenarios

  • Deleting an Item from the Repository
  • Moving an Item Between Collections
  • Mapping an Item to another collection

Previous WIKI Page needing integration into above...

Versioning Services will layer on top of planned Resolver and Identifier Minting services to provide a layering of functionality where organizations can alter the versioning behavior and introduce their own enhancements:

Versioning Interaction with existing DSpace Systems

Start a new version

  • Versioning a new Item will be an option on the "Context Menu".
  • Action will Create a New Item and place it into the Submission Workflow.
  • For Item Versioning we will introduce the following new context menu entry:
    context.addItem().addXref(contextPath+"/admin/newversion?itemID="+item.getID(), T_context_version_item);

The call to create a new Item will be issued to the VersionService as follows:

ItemVersioningService vs = new DSpace().getServiceManager().getServiceByName(null, ItemVersioningService.class);

Item newVersion = vs.createNewVersion(item);

The call will be initiated from the JavaScript Administrative Controller. (We may need to come up with a strategy for implementing calls into the ServiceManager from the Controller.)

// Start versioning a new item.
function startVersionItem()
{
    var itemID = cocoon.request.get("itemID");
    // Verify we can create a new version.

    // Create a new version in the submission workflow.
    var newItemID = new DSpace().getServiceManager()....doVersionItem(itemID);

    // Restart editing the new item as if it were part of the submission workflow.
    var newItem = Item.find(getDSContext(), newItemID);

    item = null;
}

Code approach in GSoC Versioning project

from ArchiveManager (Possible methods for VersioningService)

    /**
     * Gets an Item by its OriginalItemID and Revision numbers
     */
    public static DSpaceObject getVersionedItem(Context context, int originalItemID, int revision)
    {
        return ItemDAOFactory.getInstance(context).getByOriginalItemIDAndRevision(originalItemID, revision);
    }

    /**
     * Gets the HEAD of an OriginalItemID
     */
    public static DSpaceObject getHeadRevision(Context context, int originalItemID)
    {
        return ItemDAOFactory.getInstance(context).getHeadRevision(originalItemID);
    }

    /**
     * Creates an Item in the database that maintains all the same
     * attributes and metadata as the Item it supplants, with a new
     * revision number and a link to the given Item as the previousRevision.
     * A new bitstream is not created.
     * This Item is ready to be put into the Workspace or a Workflow.
     *
     * @param originalItem The Item to create a new version of
     */
    public static Item newVersionOfItem(Context context, Item originalItem)
    {
        try
        {
            ArchiveManager am = new ArchiveManager();
            ItemDAO itemDAO = ItemDAOFactory.getInstance(context);
            WorkspaceItemDAO wsiDAO = WorkspaceItemDAOFactory.getInstance(context);
            Item item = itemDAO.create();
            Item head = itemDAO.getHeadRevision(originalItem.getOriginalItemID());

            // Done by ItemDAO.update ... item.setLastModified();

            //System.out.println("Head: " + head.toString());

            // Add uri as identifier.uri DC value
            item.clearMetadata("dc", "identifier", "uri", null);

            // Give the new Item its own Bundles, which link to the existing Bitstreams.
            for (Bundle bundle : originalItem.getBundles())
            {
                item.addBundle(am.dupeBundle(context, bundle));
            }

            return item;
        }
        catch (Exception e)
        {
            throw new RuntimeException(e);
        }
    }

    /**
     * Takes in a bundle and makes a deep copy of it,
     * without duping bitstreams.
     *
     * @param bundle
     */
    private Bundle dupeBundle(Context context, Bundle bundle)
        throws AuthorizeException
    {
        BundleDAO bdao = BundleDAOFactory.getInstance(context);
        Bundle dupe = bdao.create();
        Bitstream[] bitstreams = null;
        int primary = bundle.getPrimaryBitstreamID();

        bitstreams = bundle.getBitstreams();
        for (Bitstream b : bitstreams)
        {
            if (primary == b.getID())
            {
                // ...
            }
        }

        return dupe;
    }

Will Create a New Item and Place it into the Submission Workflow

  • Separate New Versions of an Item May be started
  • Can only one new version be started, until it has been finalized?
  • Should the new version of the data package, data files, and bitstreams be processed in the submission and/or reviewing workflow?
  • Should information about the revision be hidden until approved?
  • Should the handle of a replaced item automatically point to the latest version?

Versioning of item metadata

The metadata for either a data package or a data file can be altered.

  • Should all items receive a new version number at once?

Versioning of files

  • If no files are altered, it is preferable to reference the same bitstream without duplication
  • If a new file is being uploaded, or a file is replaced by a URL, the new data file will no longer reference the old file
  • Should URLs for files remain the same if the file didn't change?

Two User Stories

Simple Item Versioning Case

The first example is the most basic and involves providing a means to request a new version of an Item through the API:

public interface VersioningService
{
    // Named to match the createNewVersion(item) call used by the application below.
    public <T> Version<T> createNewVersion(T original);
}

Will be used by the application in the following manner:

Item item = ....

String dsoId = "dso:item/" + item.getId();

Version<Item> itemVersion = new DSpace().getServiceManager().getService(VersioningService.class).createNewVersion(item);

In the simplest use case, a new Version of an Item will be created with the following characteristics:

New version


Value (Relationship)






<previous item> (identifier of previous version)

Previous Version








<new item> (Identifier of new Item)

We need to consider that the criteria for what constitutes a version of an Item will evolve with the feature and its usage. But we have a basic agreement that usage of fields such as the following will be critical for versioning.

Here isReplacedBy and replaces link individual nodes in the version history, while isVersionOf should be sufficient to identify an "original version history thread" for an Item.
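
As a rough illustration of how the replaces links could be followed to trace a version history, the sketch below walks from any version back to the original. The `VersionChain` class and its in-memory map are assumptions made for this example; in DSpace the links would be read from item metadata rather than from a map.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical in-memory view of the "replaces" links between versions. */
final class VersionChain {
    // identifier -> identifier it replaces; the original version has no entry.
    private final Map<String, String> replaces = new HashMap<>();

    /** Records that {@code id} replaces {@code replacedId} (null for the original version). */
    void addVersion(String id, String replacedId) {
        if (replacedId != null) {
            replaces.put(id, replacedId);
        }
    }

    /** Returns the history from the given version back to the original, newest first. */
    List<String> historyFrom(String id) {
        Set<String> seen = new LinkedHashSet<>();
        String current = id;
        // Stop at the original (no "replaces" link) or if a malformed cycle is detected.
        while (current != null && seen.add(current)) {
            current = replaces.get(current);
        }
        return new ArrayList<>(seen);
    }
}
```

A forward walk using isReplacedBy would look the same with the map reversed; the isVersionOf value would be shared by every entry in the returned list.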

The Versioning System should repurpose the identifiers generated as values by the MintingService and ResolverService. This will allow the underlying identifier types to be changed out as desired while encapsulating the versioning logic cleanly within the VersioningService.

Complex Item Case (Dryad Composite DataPackage/DataFile versioning)

TODO (feel free to expand on)


To clarify a little on the last point: we will probably adjust the HandleManager to ensure that we can have a handle specific to the most current item, assigned separately from the current version's handle.

The current logic in reassigning a handle is the following:

1.) The item associated with the handle in the HandleManager is changed.
2.) The item metadata in dc.identifier.uri is updated

If we generate a separate version handle that always points at the most current, it would reside in a separate metadata field (dc.relation.isVersionOf) and would not be altered across versions.

We may consider using dc.relation.replaces / isReplacedBy for pointing backwards/forwards in the version history. If we do not use dc.relation.isReplacedBy and just use dc.relation.replaces, we can avoid altering the original metadata record. But there is still some question in my mind as to the importance of flagging that the current item has been "replaced".

as we discussed last Friday

Both the DOI and Handle use cases require that a persistent id be created that represents the latest version of the Item. Ideally, this would be both calculated and serialized in the metadata in a manner that avoids having to update previous items when new items are added. This means that a VersionHistory identifier may not actually be "unique" in our metadata, but that calculating which version to return would always return the most recent.

For Item doi:10.651/dryad.154

For the next version (doi:10.651/dryad.154.2)
dc.identifier: doi:10.651/dryad.154.2
dc.relation.isVersionOf: doi:10.651/dryad.154 <--- should be present in all Items in VersionHistory, will be used to look up the entire version history
dc.relation.replaces: doi:10.651/dryad.154.1 <--- Will be used to trace the Revision Tree.

Where the previous version would have (doi:10.651/dryad.154.1)
dc.identifier: doi:10.651/dryad.154.1
dc.relation.isVersionOf: doi:10.651/dryad.154 <--- should be present in all Items in VersionHistory, will be used to look up the entire version history
dc.relation.isReplacedBy: doi:10.651/dryad.154.2
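
The DOI field values listed above follow a regular pattern, which can be sketched as a small helper that derives them from the base DOI and a version number. `DoiVersionMetadata` and its `isLatest` flag are hypothetical and exist only to make the pattern explicit; they are not part of the DOI service or of DSpace.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that derives the versioning fields shown in the DOI example. */
final class DoiVersionMetadata {
    private DoiVersionMetadata() { }

    /**
     * @param baseDoi  identifier of the version history, e.g. "doi:10.651/dryad.154"
     * @param version  1-based version number
     * @param isLatest false when a later version already exists
     */
    static Map<String, String> forVersion(String baseDoi, int version, boolean isLatest) {
        if (version < 1) {
            throw new IllegalArgumentException("version must be >= 1");
        }
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("dc.identifier", baseDoi + "." + version);
        // Present on every Item in the VersionHistory; used to look up the whole history.
        fields.put("dc.relation.isVersionOf", baseDoi);
        if (version > 1) {
            // Points backwards; used to trace the revision tree.
            fields.put("dc.relation.replaces", baseDoi + "." + (version - 1));
        }
        if (!isLatest) {
            // Points forwards; requires updating the older record when a new version appears.
            fields.put("dc.relation.isReplacedBy", baseDoi + "." + (version + 1));
        }
        return fields;
    }
}
```

Leaving out the isReplacedBy entry would mean older records never need to be touched when a new version is added, at the cost of losing the forward pointer.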

In the Handle approach, this will look like:

For the next version (hdl:1234.5/3)
dc.identifier: hdl:1234.5/3
dc.relation.isVersionOf: hdl:1234.5/1 <--- should be present in all Items in VersionHistory, will be used to look up the entire version history
dc.relation.hasVersion: hdl:1234.5/2
dc.relation.replaces: hdl:1234.5/2 <--- Will be used to trace the Revision Tree.

Where the previous version (hdl:1234.5/2) would have
dc.identifier: hdl:1234.5/2
dc.relation.isVersionOf: hdl:1234.5/1 <--- should be present in all Items in VersionHistory, will be used to look up the entire version history
dc.relation.isReplacedBy: hdl:1234.5/3

At this time, the HandleManager simply assigns a handle to an Item in DSpace. The adjustment that will need to be made is that, whenever a new version is generated, the handle representing the version stream (hdl:1234.5/1) will need to be moved to the new Item. However, because it is by definition the identifier for the VersionHistory and will always resolve to the most current version, no item metadata will need to be updated to reflect that change. This handle should be used when citing the most current version of the item.
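
The adjustment described above can be sketched with a toy handle table. `SketchHandleManager` is a hypothetical stand-in and does not reflect the actual DSpace HandleManager API; item ids are plain integers chosen for the example.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a handle table mapping handles to item ids. */
final class SketchHandleManager {
    private final Map<String, Integer> handleToItem = new HashMap<>();

    /** Assigns (or reassigns) a handle to an item. */
    void assign(String handle, int itemId) {
        handleToItem.put(handle, itemId);
    }

    /** Returns the item id a handle resolves to, or null if the handle is unknown. */
    Integer resolve(String handle) {
        return handleToItem.get(handle);
    }

    /**
     * Registers a new version: the version gets its own handle, and the
     * version-stream handle is moved so that it resolves to the new item.
     * No item metadata is touched by the move.
     */
    void registerNewVersion(String streamHandle, String versionHandle, int newItemId) {
        assign(versionHandle, newItemId);
        assign(streamHandle, newItemId);
    }
}
```

After registering hdl:1234.5/2 and then hdl:1234.5/3 against the stream handle hdl:1234.5/1, the stream handle resolves to the newest item while each version handle keeps resolving to its own item.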

If we were to use dc.relation.replaces and dc.relation.isVersionOf to identify both the VersionHistory and and not dc.relation.isReplacedBy to resolve
