Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.
Comment: Migrated to Confluence 5.3

...

The current DSpace Reviewer Workflow System hardcodes a fixed set of 3 steps, of which the order is determined beforehand, and only the authenticated users (epersons) assigned to each of these workflow steps may participate in the workflow tasks that get assigned. In various use cases, the 3 simple steps for approving, rejecting, and editing the metadata of an submission, executed by the first person who claims the task from a predefined group (pool) of users, have proved to be quite insufficient when institutions have significantly more complex needs for mediating the review of DSpace submissions.
To resolve several of the shortcomings, @mire has implemented a "Configurable ”Configurable DSpace Workflow System"System”, planned for contribution to DSpace 1.8. This system is a generalization of a solution implemented and deployed in the production environment for certain clients.

...

The workflow actions configuration is located in the {dspace.dir}/config/spring/api directory and is named "workflow-actions.xml". This configuration file describes the different Action Java classes that are used by the workflow framework. Because the workflow framework uses Spring framework for loading these action classes, this configuration file contains Spring configuration.

...

Code Block
<?xml version="1.0" encoding="UTF-8"?>
<beans
    xmlns="http://www.springframework.org/schema/beans"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:util="http://www.springframework.org/schema/util"
    xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-2.0.xsd
                           http://www.springframework.org/schema/util http://www.springframework.org/schema/util/spring-util-2.0.xsd">

    <!-- At the top are our bean class identifiers --->
    <bean id="{action.api.id}" class="{class.path}" scope="prototype"/>
    <bean id="{action.api.id.2}" class="{class.path}" scope="prototype"/>

 <!-- Below the class identifiers come the declarations for out actions/userSelectionMethods -->

 <!-- Use class workflowActionConfig for an action -->
 <bean id="{action.id}" class="orgoorg.dspace.workflowxmlworkflow.state.actions.WorkflowActionConfig" scope="prototype">
     <constructor-arg type="java.lang.String" value="{action.id}"/>

     <property name="processingAction" ref="{action.api.id}"/>
     <property name="requiresUI" value="{true/false}"/>
 </bean>

 <!-- Use class UserSelectionActionConfig for a user selection method -->
 <!--User selection actions-->
 <bean id="{action.api.id.2}" class="org.dspace.workflowxmlworkflow.state.actions.UserSelectionActionConfig" scope="prototype">
     <constructor-arg type="java.lang.String" value="{action.api.id.2}"/>

     <property name="processingAction" ref="{user.selection.bean.id}"/>
     <property name="requiresUI" value="{true/false}"/>
 </bean>
</beans>

...

Code Block
<bean id="{action.api.id.2}" class="org.dspace.workflowxmlworkflow.state.actions.UserSelectionActionConfig" scope="prototype">
     <constructor-arg type="java.lang.String" value="{action.api.id.2}"/>

     <property name="processingAction" ref="{user.selection.bean.id}"/>
     <property name="requiresUI" value="{true/false}"/>
 </bean>

...

The configuration file for the workflow user interface actions is located in the {dspace.dir}/config/xmlui and is named "workflow-actions-xmlui.xml". BEach bean defined here has an id which is the action identifier and the class is a classpath which links to the xmlui class responsible for generating the User Interface side of the workflow action. Each of the class defined here must extend the //org.dspace.app.xmlui.aspect.submission.workflow.AbstractXMLUIAction// class, this class contains some basic settings for an action and has a method called //addWorkflowItemInformation()// which will render the given item with a show full link so you don't have to write the same code in each of your actions if you want to display the item. The id attribute used for the beans in the configuration must correspond to the id used in the workflow configuration. In case an action requires a User Interface class, the workflow framework will look for a UI class in this configuration file.

...

Currently, the authorizations are always granted and revoked based on the tasks that are available for certain users and groups. The types of authorization policies that is granted for each of these is always the same:

  • READ
  • WRITE
  • ADD
  • DELETE

Database

...

Database changes - Original text

...

Database changes

The following changes tables have been made to existing added to the DSpace database. All tables are prefixed with 'xmlwf_' to avoid any confusion with the existing workflow related database tables:

tasklistitem

Entries in this table represent tasks in a task pool. The changes to this existing database table are the following:

  • Column action_id (ADD, TEXT): This column stores the action ID for which the pool tasks exist
  • Column step_id (ADD, TEXT): This column stores the step ID of the step that contains the action
  • Column workflow_id (ALTER to TEXT): This column stores the workflow process ID for the workflow that contains the step
  • Column workflow_item_id (ADD, INTEGER, REFERENCES workflowitem(workflow_id)): Contains the ID of the workflow item. This column replaces the previous workflow_id column.
  • Column group_id (ADD, INTEGER, REFERENCES epersongroup(eperson_group_id)): Because tasks in a tasks pool can be created for both groups and epersons, an additional group_id column has been added. The eperson_id column existed.

In order to make these changes backwards compatible and in order to maintain support for the old workflow in the JSPUI, the change from the workflow_id column to workflow_item_id must be addressed.

Required to support the old workflow

A number of columns in the existing workflow tables are no longer used by the new workflow framework but should probably not be removed in case the JSPUI still uses the old workflow. These columns are the following:

workflowitem

  • state
  • owner
  • multiple_titles (required for back to submission?)
  • multiple_files (required for back to submission?)
  • published_before (required for back to submission?)

Currently, these changes can be found in the {dspace.src.dir}/etc directory in the file //collection-workflow-changes.sql//. They will have to be moved to the database upgrade and creation scripts for both Oracle and PostgreSQL.

Metadata Schema

The new reviewer workflow framework uses a number of internal metadata fields to store temporary information. This information is used to keep track of the number of people who are performing their task on a workflow item and the number of people who have finished their task. This allows the workflow framework to offer the ability to require multiple people to perform the same action on one workflow item at the same time. The different metadata fields are configured in the //{dspace.dir}/registries/workflow-types.xml// file. Currently, the following metadata fields are used:

  • workflow.step.inProgressUsers: Contains the userIds of the users who are performing a step
  • workflow.step.finishedUsers: Contains the userIds of the users who are performed a step

Other issues

Backwards compatibility

For the DSpace 1.6 clients that use the new workflow framework, it is recommended to send all current items in the workflow back to the workspace since not doing this and changing the workflow might result in the loss of some workflow items. This can be done by using the following command: {dspace.dir}/bin/dsrun org.dspace.workflow.util.WorkflowItemConverter (no arguments are required).

The workflow that will be contributed in DSpace 1.8 will require the database upgrade script to convert the existing workflow items and configuration to the same state in the new workflow framework. In order to do this, the following issues need to be addressed:

...

xmlwf_workflowitem

The xmlwf_workflowitem table contains the different workflowitems in the workflow. This table has the following columns:

  • workflowitem_id: The identifier of the workflowitem and primary key of this table
  • item_id: The identifier of the DSpace item to which this workflowitem refers.
  • collection_id: The collection to which this workflowitem is submitted.
  • multiple_titles: Specifies whether the submission has multiple titles (important for submission steps)
  • published_before: Specifies whether the submission has been published before (important for submission steps)
  • multiple_files: Specifies whether the submission has multiple files attached (important for submission steps)

xmlwf_collectionrole

The xmlwf_collectionrole table represents a workflow role for one collection. This type of role is the same as the roles that existed in the original workflow meaning that for each collection a separate group is defined to described the role. The xmlwf_collectionrole table has the following columns:

  • collectionrol_id: The identifier of the collectionrole and the primaty key of this table
  • role_id: The identifier/name used by the workflow configuration to refer to the collectionrole
  • collection_id: The collection identifier for which this collectionrole has been defined
  • group_id: The group identifier of the group that defines the collection role

xmlwf_workflowitemrole

The xmlwf_workflowitemrole table represents roles that are defined at the level of an item. These roles are temporary roles and only exist during the execution of the workflow for that specific item. Once the item is archived, the workflowitemrole is deleted. Multiple rows can exist for one workflowitem with e.g. one row containing a group and a few containing epersons. All these rows together make up the workflowitemrole The xmlwf_workflowitemrole table has the following columns:

  • workflowitemrole_id: The identifier of the workflowitemrole and the primaty key of this table
  • role_id: The identifier/name used by the workflow configuration to refer to the workflowitemrole
  • workflowitem_id: The xmlwf_workflowitem identifier for which this workflowitemrole has been defined
  • group_id: The group identifier of the group that defines the workflowitemrole role
  • eperson_id: The eperson identifier of the eperson that defines the workflowitemrole role

xmlwf_pooltask

The xmlwf_pooltask table represents the different task pools that exist for a workflowitem. These task pools can be available at the beginning of a step and contain all the users that are allowed to claim a task in this step. Multiple rows can exist for one task pool containing multiple groups and epersons. The xmlwf_pooltask table has the following columns:

  • pooltask_id: The identifier of the pooltask and the primaty key of this table
  • workflowitem_id: The identifier of the workflowitem for which this task pool exists
  • workflow_id: The identifier of the workflow configuration used for this workflowitem
  • step_id: The identifier of the step for which this task pool was created
  • action_id: The identifier of the action that needs to be displayed/executed when the user selects the task from the task pool
  • eperson_id: The identifier of an eperson that is part of the task pool
  • group_id: The identifier of a group that is part of the task pool

xmlwf_claimtask

The xmlwf_claimtask table represents a task that has been claimed by a user. Claimed tasks can be assigned to users or can be the result of a claim from the task pool. Because a step can contain multiple actions, the claimed task defines the action at which the user has arrived in a particular step. This makes it possible to stop working halfway the step and continue later. The xmlwf_claimtask table contains the following columns:

  • claimtask_id: The identifier of the claimtask and the primary key of this table
  • workflowitem_id: The identifier of the workflowitem for which this task exists
  • workflow_id: The id of the workflow configuration that was used for this workflowitem
  • step_id: The step that is currenlty processing the workflowitem
  • action_id: The action that should be executed by the owner of this claimtask
  • owner_id: References the eperson that is responsible for the execution of this task

xmlwf_in_progress_user

The xmlwf_in_progess_user table keeps track of the different users that are performing a certain step. This table is used because some steps might require multiple users to perform the step before the workflowitem can proceed. The xmlwf_in_progress_user table contains the following columns:

  • in_progress_user_id: The identifier of the in progress user and the primary key of this table
  • workflowitem_id: The identifier of the workflowitem for which the user is performing or has performed the step.
  • user_id: The identifier of the eperson that is performing or has performe the task
  • finished: Keeps track of the fact that the user has finished the step or is still in progress of the execution

Backwards compatibility

Aspects and configuration files

In order to be backwards compatible with previous versions of DSpace, a number of changes have been made to the existing submission/workflow aspect. The submission aspect has been split up into muliple aspects: one submission aspect for the submission process, one workflow aspect containing the code for the original workflow and one xmlworkflow aspect containing the code for the new XML configurable workflow framework. In order to enable one of the two aspects, either the workflow or xmlworkflow aspect should be enabled in the dspace-install-dir/config/xmlui.xconf configuration file. Besides that, a workflow configuration file has been created that specifies the workflow that will be used in the back-end of the DSpace code. It is important that the option selected in this configuration file matches the aspect that was enabled. The workflow configuration file is available in dspace-install-dir/config/modules/workflow.cfg. This configuration file has been added because it is important that a CLI import process uses the correct workflow and this should not depend on the UI configuration. The workflow.cfg configration file contains the following propertu:

  • workflow.framework: possible values are 'originalworkflow' for the original workflow or xmlworkflow for the new XML configurable workflow framework.

Workflowitem conversion/migration scripts

Depending on the workflow that is used by a DSpace installation, different scripts can be used when migration to the new workflow.

SQL based migration

SQL based migration can be used when the out of the box original workflowframework is used by your DSpace installation. This means that your DSpace installation uses the workflow steps and roles that are available out of the box. The migration script will migrate the policies, roles, tasks and workflowitems from the original workflow to the new workflowframework. The following SQL scripts are available depending on the database that is used by the DSpace installation:

Java based migration

In case your DSpace installation uses a customized version of the workflow, the migration script might not work properly and a different approach is recommended. Therefore, an additional Java based script has been created that restarts the workflow for all the workflowitems that exist in the original workflow framework. The script will take all the existing workflowitems and place them in the first step of the XML configurable workflow framework thereby taking into account the XML configuration that exists at that time for the collection to which the item has been submitted. This script can also be used to restart the workflow for workflowitems in the original workflow but not to restart the workflow for items in the XML configurable workflow.

To execute the script, run the following CLI command: dspace(.bat) dsrun org.dspace.xmlworkflow.migration.RestartWorkflow -e admin@myrespository.org

The following arguments can be specified when running the script:

  • -e: specifies the username of an adminstrator user
  • -n: if sending submissions through the workflow, send notification emails
  • -p: the provenance description to be added to the item
  • -h: help

Additional workflow steps/actions and features

Work in progress...

Workflow overview features

A new features has been added to the XML based workflow that resembles the features available in the JSPUI of DSpace that allows administrators to abort workflowitems. The feature added to the XMLUI allows administrators to look at the status of the different workflowitems and look for workflowitems based on the collection to which they have been submitted. Besides that, the administrator has the ability to permanently delete the workflowitem or send the item back to the submitter.

Issues

...

Curation System

The DSpace 1.7 version of the curation system integration into the original DSpace workflow only exists in the WorkflowManager.advance() method. Before advancing to the next workflow step or archiving the Item, a check is performed to see whether any curation tasks need to be executed/scheduled. The problem is that this check is based on the hardcoded workflow steps that exist in the original workflow. These hardcoded checks are done in the CurationManager and will need to be changed.

...