...

  1. "Select Collection" step: If not already selected, the user must select a collection to deposit the Item into.
  2. "Describe" step:  This is where the user may enter descriptive metadata about the Item. This step may consist of one or more pages of metadata entry. By default, there are two pages of metadata-entry. For information on modifying the metadata entry pages, please see Custom Metadata-entry Pages for Submission section below.
  3. "Upload" step: This is where the user may upload one or more files to associate with the Item.  For more information on file upload, also see Configuring the File Upload step below.
  4. "Review" step: This is where the user may review all previous information entered, and correct anything as needed.
  5. "License" step: This is where the user must agree to the repository distribution license in order to complete the deposit.  This repository distribution license is defined in the [dspace]/config/default.license file. It can also be customized per-collection from the Collection Admin UI.
  6. "Complete" step: The deposit is now completed. The Item will either become immediately available or undergo a workflow approval process (depending on the Collection policies).  For more information on the workflow approval process see: Configurable Workflow.

 

To modify or reorganize these submission steps, just modify the [dspace]/config/item-submission.xml file. Please see the section below on Reordering/Removing/Adding Submission Steps.

You can also choose to have different submission processes for different DSpace Collections. For more details, please see the section below on Assigning a custom Submission Process to a Collection.
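For illustration, per-collection processes are assigned in the submission-map section of item-submission.xml by mapping a Collection's handle to a named submission process. The sketch below is a minimal example: the handle and the "custom" process name are placeholders, while "traditional" is the default process shipped with DSpace.

Code Block
languagexml
titleitem-submission.xml sketch (handle and custom process name are placeholders)
<submission-map>
  <!-- collections without an explicit mapping use the default process -->
  <name-map collection-handle="default" submission-name="traditional"/>
  <!-- hypothetical collection that should use a customized process -->
  <name-map collection-handle="123456789/42" submission-name="custom"/>
</submission-map>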

Note
titleDSpace 4.0 has removed the "Initial Questions" step by default

Prior to DSpace 4.0, the "Initial Questions" step preceded all "Describe" steps. However, it was removed by default in DSpace 4.0. 

You may still choose to re-enable the "Initial Questions" step, as needed. However, please note the warning below about the auto-assigning of Dates in the "Initial Questions" step.

...

  • "Access" step: This step allows users the user to (optionally) modify access rights or set an embargo during the deposit of an Item. For more information on this step, and Embargo options in general, please see the Embargo documentation.
  • "CC License" step: This step allows users the user to (optionally) assign a Creative Commons license to a particular Item. Please see the Configuring Creative Commons License section of the Configuration documentation for more details.
  • "Start Submission Lookup" step: This step allows the user to search or load metadata from an external service (arXiv online, bibtex file, etc.) and prefill the submission form. For more information on enabling and using it, please see the section on Configuring StartSubmissionLookupStep below.
  • "Initial Questions" step: This step asks users a simple set of "initial questions" which help to determine which metadata fields are displayed in the "Describe" step (see above).  These initial questions include:
    • Multiple Titles: The item has more than one title, e.g. a translated title  (If selected, then users will be asked for an alternative title in the Describe step)
    • Published Before: The item has been published or publicly distributed before (If selected, then users will be asked for a publication date and publisher in the Describe step).

      Warning
      titleInitial Questions will auto-assign a publication date when "Published Before" is unselected

      Please note, if you enable Initial Questions and your users do NOT select the "Published Before" option, then DSpace will auto-assign a publication date (dc.date.issued) to that particular Item.

      It may be entirely accurate for some types of content (e.g. for gray literature or even theses/dissertations) to auto-assign this publication date.  As such, you may wish to still enable "Initial Questions" if your repository is mainly for previously unpublished content. You may also choose to only enable it for specific Collections – see Assigning a custom Submission Process to a Collection section below.

      However, if the Item actually was published in some other location, this will result in an incorrect publication date being reported by DSpace. This tendency for an incorrect publication date has been reported by Google Scholar to DSpace developers (see: DS-1481), which is why the "Initial Questions" are now disabled by default (see DS-1655).

To enable any of these optional submission steps, just uncomment the step definition within the [dspace]/config/item-submission.xml file. Please see the section below on Reordering/Removing/Adding Submission Steps.

You can also choose to enable certain steps only for specific DSpace Collections. For more details, please see the section below on Assigning a custom Submission Process to a Collection.
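For instance, a custom submission process that enables one of the optional steps can be defined by copying the default process and adding the step definition that ships commented out. The sketch below is abbreviated and illustrative: the process name is a placeholder, and the heading/class values should be copied from the commented-out block in your own item-submission.xml (the UI binding elements are omitted here).

Code Block
languagexml
titleitem-submission.xml sketch (abbreviated; process name is a placeholder)
<submission-process name="custom">
  <!-- ...the default steps... -->
  <step>
    <heading>submit.progressbar.CClicense</heading>
    <processing-class>org.dspace.submit.step.CCLicenseStep</processing-class>
    <workflow-editable>false</workflow-editable>
  </step>
</submission-process>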

Understanding the Submission Configuration File

...

  • value-pairs-name – Name by which an input-type refers to this list.
  • dc-term – Qualified Dublin Core field for which this choice list is selecting a value.

Each value-pairs element contains a sequence of pair sub-elements, each of which in turn contains two elements:

  • displayed-value – Name shown (on the web page) for the menu entry.
  • stored-value – Value stored in the DC element when this entry is chosen. Unlike the HTML select tag, there is no way to indicate one of the entries should be the default, so the first entry is always the default choice.
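As an illustrative sketch (the list name and values are examples only), a value-pairs list with its pair sub-elements looks like this:

Code Block
languagexml
titlevalue-pairs sketch (list name and values are examples)
<value-pairs value-pairs-name="common_types" dc-term="type">
  <pair>
    <displayed-value>Article</displayed-value>
    <stored-value>Article</stored-value>
  </pair>
  <pair>
    <displayed-value>Technical Report</displayed-value>
    <stored-value>Technical Report</stored-value>
  </pair>
</value-pairs>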

...

Code Block
languagexml
titleitem-submission.xml excerpt
     <step id="collection">
       <heading></heading> <!--can specify heading, if you want it to appear in Progress Bar-->
       <processing-class>org.dspace.submit.step.StartSubmissionLookupStep</processing-class>
       <jspui-binding>org.dspace.app.webui.submit.step.JSPStartSubmissionLookupStep</jspui-binding>
       <xmlui-binding>org.dspace.app.xmlui.aspect.submission.submit.SelectCollectionStep</xmlui-binding>
       <workflow-editable>false</workflow-editable>
     </step>
Info
titleUI compatibility

The new step is available only for the JSP UI. Nonetheless, if you run both UIs and want the JSP UI benefit of the new step, you can configure it as the processing class for the XML UI as well, since it degrades gracefully to the standard SelectCollectionStep logic.

...

The BTE is a Java framework developed by the Hellenic National Documentation Centre (EKT) and consists of programmatic APIs for filtering and modifying records that are retrieved from various types of data sources (e.g. databases, files, legacy data sources) as well as for outputting them in appropriate standard formats (e.g. database files, txt, xml, Excel). The framework includes independent abstract modules that are executed separately, offering in many cases alternative choices to the user depending on the input data set, the transformation workflow that needs to be executed and the output format that needs to be generated.

The basic idea behind the BTE is a standard workflow that consists of three steps: a data loading step, a processing step (record filtering and modification) and an output generation step. A data loader provides the system with a set of Records, the processing step is responsible for filtering or modifying these records, and the output generator outputs them in the appropriate format.

The standard BTE version offers several predefined Data Loaders as well as Output Generators for basic bibliographic formats. However, Spring Dependency Injection can be utilized to load custom data loaders, filters, modifiers and output generators. 

...

There are four accordion tabs (the default configuration hides the third tab):

1) Search for identifier: In this tab, the user can search for an identifier in the supported online services (currently, arXiv, PubMed, CrossRef and CiNii are supported). The publication results are presented in the tab "Results" in which the user can select the publication to proceed with. This means that a new submission form will be initiated with the form fields prefilled with metadata from the selected publication.

Currently, there are four identifiers that are supported (DOI, PubMed ID, arXiv ID and NAID (CiNii ID)). But these can be extended - refer to the following paragraph regarding the SubmissionLookup service configuration file.

The user can fill in any of the four identifiers. DOI is preferable. Keep in mind that the service can integrate results for the same publication from the different providers, so filling in any of the four identifiers will pretty much do the work. If identifiers for different publications are provided, the service will return a list of publications which will be shown to the user to select from. The selected publication will make it to the submission form, in which some fields will be pre-filled with the publication metadata. The mapping from the input metadata (from arXiv, PubMed, CrossRef or CiNii) to the DSpace metadata schema (and thus, the submission form) is configured in the Spring XML file that is discussed later on - you can see a table at the very end of this chapter.

Through the same file, a user can also extend the providers that the SubmissionLookup service can search publications from.

2) Upload a file: In this tab, the user can upload a file, select the type (bibtex, csv, etc.), see the publications in the "Results" tab and then either select one to proceed with the submission or make all of them "Workspace Items" that can be found in the "Unfinished Submissions" section of the "My DSpace" page.

...

"OFF": All the publications of the uploaded file will be imported in the user's MyDSpace page as "Unfinished Submissions" while the first one will go thought the submission process.

(Regarding the pubmed, crossref and arxiv file upload, you can find the attached file named "sample-files.zip" that contains samples of these three file types)

3) Free search: In this tab, the user can freely search for Title, Author and Year in the four supported providers (PubMed, CrossRef, arXiv and CiNii). By default, the four providers are configured to be disabled for free search, but you can enable it via the configuration file. Thus, initially this accordion tab is not shown to the user unless a data loader is declared as a "search provider" - refer to the following paragraphs.

...

This is the top level bean that describes the SubmissionLookup service. It accepts three properties:

a)  phase1TransformationEngine: the phase 1 BTE transformation engine.

b)  phase2TransformationEngine: the phase 2 BTE transformation engine

 


c)  detailFields: A list of the keys that the user wants to display in the detailed form of a publication. That is, when the results are shown, the user can see the details of each one; the fields that appear in this detailed form are configured by this property. Refer to the table at the very end of this chapter to see the available values. This property is disabled by default, while the list that is shown commented out in the configuration file is the default list for the detailed form.

 

Code Block
languagehtml/xml
<bean id="phase1TransformationEngine" />

The transformation engine for the first phase of the service (from external service to intermediate format)

It accepts three properties:

...

This bean declares the data loaders to be used to load publications from. It has one property, "dataloadersMap", a map that declares key-value pairs, that is, a unique key and the corresponding data loader to be used. This is the point where a new data loader can be added, in case the ones that are already supported do not meet your needs.
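A minimal sketch of the dataloadersMap property, reusing data loader bean names that appear later in this section (the "myformat"/"myCustomDataLoader" entry is hypothetical, and the existing keys shown are illustrative):

Code Block
languagehtml/xml
<property name="dataloadersMap">
    <map>
        <entry key="bibtex" value-ref="bibTeXDataLoader"/>
        <entry key="pubmed" value-ref="pubmedOnlineDataLoader"/>
        <!-- hypothetical custom data loader registered under its own key -->
        <entry key="myformat" value-ref="myCustomDataLoader"/>
    </map>
</property>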

...

Code Block
languagehtml/xml
<bean id="bibTeXDataLoader" />
<bean id="csvDataLoader" />
<bean id="tsvDataLoader" />
<bean id="risDataLoader" />
<bean id="endnoteDataLoader" />
<bean id="pubmedFileDataLoader" />
<bean id="arXivFileDataLoader" />
<bean id="crossRefFileDataLoader" />
<bean id="ciniiFileDataLoader" />
<bean id="pubmedOnlineDataLoader" />
<bean id="arXivOnlineDataLoader" />
<bean id="crossRefOnlineDataLoader" />
<bean id="ciniiOnlineDataLoader" />

These beans are the actual data loaders that are used by the service. They are either "FileDataLoaders" or "SubmissionLookupDataLoaders" as mentioned previously.

...

a)  fieldMap: it is a map that specifies the mapping between the keys that hold the metadata in the input format and the ones that we want to use internally in the BTE. At the end of this article there is a table that summarises the fields that are used from the online services (PubMed, arXiv, CrossRef and CiNii) - which are the ones that the submission lookup step is capable of reading from the online services - and the keys used internally in the BTE.
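As a sketch (the entries are illustrative; the real keys are listed in the table at the end of this chapter, and the exact direction of the mapping should be checked against the shipped configuration), a fieldMap property looks like this:

Code Block
languagehtml/xml
<property name="fieldMap">
    <map>
        <!-- pairs an input-format key with the corresponding internal BTE key (illustrative) -->
        <entry key="articleTitle" value="title"/>
        <entry key="pubDate" value="issued"/>
    </map>
</property>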

Some loaders have more properties:

...

pubmedOnlineDataLoader, crossRefOnlineDataLoader, arXivOnlineDataLoader and ciniiOnlineDataLoader also support another property:

a)  searchProvider: if it is set to true, the data loader supports free search by title, author or year. If at least one of these data loaders is declared as a search provider, the accordion tab "Free search" appears. Otherwise, it stays hidden.
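A sketch of enabling free search on one of these loaders (the property name is taken from the description above; the rest of the bean definition is omitted):

Code Block
languagehtml/xml
<bean id="pubmedOnlineDataLoader">
    <!-- other properties as in the shipped configuration -->
    <!-- declare this loader as a provider for the "Free search" tab -->
    <property name="searchProvider" value="true"/>
</bean>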


...

crossRefOnlineDataLoader and ciniiOnlineDataLoader also have two more properties:

a) apiKey/appId respectively: for both these services you need to acquire (for free) an API key in order to access their online services. For CrossRef, visit: http://www.crossref.org/requestaccount/ and for CiNii visit: https://portaltools.nii.ac.jp/developer/en/

b) maxResults: the maximum number of results that these services will return for your search. By default, this property is commented out, and the default value is 10 for both services.
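A sketch of these two properties on the CrossRef loader (the key value is a placeholder; the rest of the bean definition is omitted):

Code Block
languagehtml/xml
<bean id="crossRefOnlineDataLoader">
    <!-- replace with the API key obtained from CrossRef -->
    <property name="apiKey" value="YOUR-CROSSREF-API-KEY"/>
    <!-- optional; 10 is the default when this property is left commented out -->
    <property name="maxResults" value="10"/>
</bean>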

 

(Regarding the file dataloaders, you can find the attached file named "sample-files.zip" that contains samples of all the file types that the corresponding data loaders can handle)

 

Code Block
languagehtml/xml
<bean id="phase1LinearWorkflow" />

This bean specifies the processing steps to be applied to the records' metadata before they proceed to the output generator of the transformation engine. Currently, three steps are supported, but you can add yours as well.

 

Code Block
languagehtml/xml
<bean id="mapConverter_arxivSubject" />
<bean id="mapConverter_pubstatusPubmed" />
<bean id="removeLastDot" />

These beans are the processing steps that are supported by the 1st phase of the transformation engine. The first two map an incoming value to another one specified in a properties file. The last one removes the trailing dot from the incoming value.

All of them have the property "fieldKeys", which is a list of keys to which the step will be applied.
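For example, a step can be limited to particular keys like this (the keys shown are illustrative):

Code Block
languagehtml/xml
<bean id="removeLastDot">
    <property name="fieldKeys">
        <list>
            <!-- apply this step only to these internal BTE keys (illustrative) -->
            <value>title</value>
            <value>authors</value>
        </list>
    </property>
</bean>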

In case you need to create your own filters and modifiers, follow the instructions below:
  
To create a new filter, you need to extend the following BTE abstract class:

Code Block
gr.ekt.bte.core.AbstractFilter

You will need to implement the following method:

Code Block
languagejava
public abstract boolean isIncluded ( Record  record )

Return false if the specified record needs to be filtered out, otherwise return true.

To create a new modifier, you need to extend the following BTE abstract class:

Code Block
gr.ekt.bte.core.AbstractModifier

You will need to implement the following method:

Code Block
languagejava
public abstract Record modify ( Record record )

within which you can make any changes you like to the record. You can use the Record methods to get the values for a specific key and to load new ones (for the latter, you need to make the Record mutable).

After you create your own filters or modifiers you need to add them in the Spring XML configuration file as in the following example:

Code Block
languagehtml/xml
<bean id="customfilter"   class="org.mypackage.MyFilter" />

<bean id="phase1LinearWorkflow" class="gr.ekt.bte.core.LinearWorkflow">
    <property name="process">
    <list>
		 ... <old filters and modifiers>...
         <ref bean="customfilter" />
    </list>
    </property>
</bean>
Code Block
languagehtml/xml
<bean id="phase2TransformationEngine" />

The transformation engine for the second phase of the service (from the intermediate format to DSpace metadata schema)

Normally, you do not need to touch any of these three properties. You can edit the reference beans instead.

 

Code Block
languagehtml/xml
<bean id="phase2linearWorkflow" />

This bean specifies the processing steps to be applied to the records' metadata before they proceed to the output generator of the transformation engine. Currently, two steps are supported, but you can add yours as well.

 

Code Block
languagehtml/xml
<bean id="fieldMergeModifier" />
<bean id="valueConcatenationModifier" />
<bean id="languageCodeModifier" />

These beans are the processing steps that are supported by the 2nd phase of the transformation engine. The first merges the values of multiple keys into a new key. The second one concatenates the values of a specific key into a unique value. The third one translates the three-letter language code to the two-letter one (i.e. eng to en).
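As an illustrative sketch only (the property names below are assumptions, not the shipped configuration), the fieldMergeModifier could be wired to build a combined key such as "allkeywords", which the table below shows being mapped to dc.subject:

Code Block
languagehtml/xml
<bean id="fieldMergeModifier">
    <!-- property names are hypothetical placeholders; check the shipped Spring file for the real ones -->
    <property name="mergedFieldName" value="allkeywords"/>
    <property name="fieldsToMerge">
        <list>
            <value>keywords</value>
            <value>mesh</value>
        </list>
    </property>
</bean>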

 

Code Block
languagehtml/xml
<bean id="org.dspace.submit.lookup.DSpaceWorkspaceItemOutputGenerator" />

This bean declares the output generator to be used which is, in this case, a DSpaceWorkspaceItem generator. It accepts two properties:

a) outputMap: A map from the intermediate keys to the DSpace metadata schema fields. The table below displays the default output mapping. As you can see, some fields, while they are read from the input source, are not output in DSpace since there are no default metadata schema fields to host them. However, if you create the corresponding metadata field registry entry, you can come back to this configuration to add a mapping between the input field key and the DSpace metadata field.

b) extraMetadataToKeep: A list of DSpace metadata schema fields to keep in the output
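A sketch of these two properties (the entries are illustrative, and the direction of the outputMap entries should be checked against the shipped configuration; the full default mapping is in the table below):

Code Block
languagehtml/xml
<bean id="org.dspace.submit.lookup.DSpaceWorkspaceItemOutputGenerator">
    <property name="outputMap">
        <map>
            <!-- intermediate BTE key paired with a DSpace metadata field (illustrative) -->
            <entry key="title" value="dc.title"/>
            <entry key="issued" value="dc.date.issued"/>
        </map>
    </property>
    <property name="extraMetadataToKeep">
        <list>
            <!-- hypothetical example of a metadata field to keep in the output -->
            <value>dc.description.provenance</value>
        </list>
    </property>
</bean>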


The following table presents the available keys from the online services, the keys that BTE uses in phase1 and the final output map to DSpace metadata fields.

Arxiv | PubMed | CrossRef | CiNii | BTE Key (phase 1) | Extra Keys created by BTE (phase 2) | DSpace Metadata Field | Appears in Detail Form
title | articleTitle | articleTitle | title | title |  | dc.title | yes
published | pubDate | year | issued | issued |  | dc.date.issued | yes
id |  |  |  | url |  |  | 
summary | abstractText |  | description | abstract |  | dc.description.abstract | yes
comment |  |  |  | note |  |  | 
pdfUrl |  |  |  | fulltextUrl |  |  | 
doi | doi | doi |  | doi |  | dc.identifier | yes
journalRef | journalTitle | journalTitle | journal | journal |  | dc.source | yes
author | author | authors | authors | authors |  | dc.contributor.author | yes
authorWithAffiliation |  |  |  | authorsWithAffiliation |  |  | 
primaryCategory |  |  |  | arxivCategory |  | dc.subject | yes
category |  |  |  | arxivCategory |  | dc.subject | 
 | pubmedID |  |  | pubmedID |  |  | 
 | publicationStatus |  |  | publicationStatus |  |  | 
 | pubModel |  |  |  |  |  | 
 | printISSN | printISSN | issn | jissn |  | dc.identifier.issn | yes
 | electronicISSN | electronicISSN |  | jeissn |  |  | 
 | journalVolume | volume | volume | volume |  |  | 
 | journalIssue | issue | issue | issue |  |  | 
 | language |  | language | language |  | dc.language.iso | yes
 | publicationType | doiType |  | subtype |  | dc.type | yes
 | primaryKeyword |  | subjects | keywords | allkeywords | dc.subject | yes
 | secondaryKeyword |  |  | keywords | allkeywords | dc.subject | yes
 | primaryMeshHeading |  |  | mesh | allkeywords | dc.subject | yes
 | secondaryMeshHeading |  |  | mesh | allkeywords | dc.subject | yes
 | startPage | firstPage | spage | firstpage |  |  | 
 | endPage | lastPage | epage | lastpage |  |  | 
 |  | printISBN |  | pisbn |  | dc.identifier.isbn | yes
 |  | electronicISBN |  | eisbn |  |  | 
 |  | editionNumber |  | editionnumber |  |  | 
 |  | seriesTitle |  | seriestitle |  |  | 
 |  | volumeTitle |  | volumetitle |  |  | 
 |  | publicationType |  |  |  |  | 
 |  | editors |  | editors |  | dc.contributor.editor | yes
 |  | translators |  | translators |  | dc.contributor.other | yes
 |  | chairs |  | chairs |  | dc.contributor.other | yes
 |  |  | naid | naid |  |  | 
 |  |  | ncid | ncid |  |  | 
 |  |  | publisher | publisher |  | dc.publisher | yes
...

 

 

Info
titleI can see more beans in the configuration file that are not explained above. Why is this?

The configuration file hosts options for two services: the BatchImport service and the SubmissionLookup service. Thus, some beans that are not used by the SubmissionLookup service are not mentioned in this documentation. However, since both services are based on the BTE, some beans are used by both services.

...