...

  1. "Select Collection" step: If not already selected, the user must select a collection to deposit the Item into.
  2. "Describe" step:  This is where the user may enter descriptive metadata about the Item. This step may consist of one or more pages of metadata entry. By default, there are two pages of metadata-entry. For information on modifying the metadata entry pages, please see Custom Metadata-entry Pages for Submission section below.
  3. "Upload" step: This is where the user may upload one or more files to associate with the Item.  For more information on file upload, also see Configuring the File Upload step below.
  4. "Review" step: This is where the user may review all previous information entered, and correct anything as needed.
  5. "License" step: This is where the user must agree to the repository distribution license in order to complete the deposit.  This repository distribution license is defined in the [dspace]/config/default.license file. It can also be customized per-collection from the Collection Admin UI.
  6. "Complete" step: The deposit is now completed. The Item will either become immediately available or undergo a workflow approval process (depending on the Collection policies).  For more information on the workflow approval process see: Configurable Workflow.

 

To modify or reorganize these submission steps, just modify the [dspace]/config/item-submission.xml file. Please see the section below on Reordering/Removing/Adding Submission Steps.

You can also choose to have different submission processes for different DSpace Collections. For more details, please see the section below on Assigning a custom Submission Process to a Collection.
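For illustration, per-collection processes are assigned in the submission-map section of item-submission.xml by mapping a Collection's handle to a named submission process. The sketch below is a minimal example: the handle and the "custom" process name are placeholders, while "traditional" is the default process shipped with DSpace.

Code Block
languagexml
titleitem-submission.xml sketch (handle and custom process name are placeholders)
<submission-map>
  <!-- collections without an explicit mapping use the default process -->
  <name-map collection-handle="default" submission-name="traditional"/>
  <!-- hypothetical collection that should use a customized process -->
  <name-map collection-handle="123456789/42" submission-name="custom"/>
</submission-map>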

Note
titleDSpace 4.0 has removed the "Initial Questions" step by default

Prior to DSpace 4.0, the "Initial Questions" step preceded all "Describe" steps. However, it was removed by default in DSpace 4.0. 

You may still choose to re-enable the "Initial Questions" step, as needed. However, please note the warning below about the auto-assigning of Dates in the "Initial Questions" step.

...

  • "Access" step: This step allows users the user to (optionally) modify access rights or set an embargo during the deposit of an Item. For more information on this step, and Embargo options in general, please see the Embargo documentation.
  • "CC License" step: This step allows users the user to (optionally) assign a Creative Commons license to a particular Item. Please see the Configuring Creative Commons License section of the Configuration documentation for more details.
  • "Start Submission Lookup" step: This step allows the user to search or load metadata from an external service (arXiv online, bibtex file, etc.) and prefill the submission form. For more information on enabling and using it, please see the section on Configuring StartSubmissionLookupStep below.
  • "Initial Questions" step: This step asks users a simple set of "initial questions" which help to determine which metadata fields are displayed in the "Describe" step (see above).  These initial questions include:
    • Multiple Titles: The item has more than one title, e.g. a translated title  (If selected, then users will be asked for an alternative title in the Describe step)
    • Published Before: The item has been published or publicly distributed before (If selected, then users will be asked for a publication date and publisher in the Describe step).

      Warning
      titleInitial Questions will auto-assign a publication date when "Published Before" is unselected

      Please note, if you enable Initial Questions and your users do NOT select the "Published Before" option, then DSpace will auto-assign a publication date (dc.date.issued) to that particular Item.

      It may be entirely accurate for some types of content (e.g. for gray literature or even theses/dissertations) to auto-assign this publication date.  As such, you may wish to still enable "Initial Questions" if your repository is mainly for previously unpublished content. You may also choose to only enable it for specific Collections – see Assigning a custom Submission Process to a Collection section below.

      However, if the Item actually was published in some other location, this will result in an incorrect publication date being reported by DSpace. This tendency for an incorrect publication date has been reported by Google Scholar to DSpace developers (see: DS-1481), which is why the "Initial Questions" are now disabled by default (see DS-1655).

To enable any of these optional submission steps, just uncomment the step definition within the [dspace]/config/item-submission.xml file. Please see the section below on Reordering/Removing/Adding Submission Steps.

You can also choose to enable certain steps only for specific DSpace Collections. For more details, please see the section below on Assigning a custom Submission Process to a Collection.
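For instance, a custom submission process that enables one of the optional steps can be defined by copying the default process and adding the step definition that ships commented out. The sketch below is abbreviated and illustrative: the process name is a placeholder, and the heading/class values should be copied from the commented-out block in your own item-submission.xml (the UI binding elements are omitted here).

Code Block
languagexml
titleitem-submission.xml sketch (abbreviated; process name is a placeholder)
<submission-process name="custom">
  <!-- ...the default steps... -->
  <step>
    <heading>submit.progressbar.CClicense</heading>
    <processing-class>org.dspace.submit.step.CCLicenseStep</processing-class>
    <workflow-editable>false</workflow-editable>
  </step>
</submission-process>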

Understanding the Submission Configuration File

...

  • value-pairs-name – Name by which an input-type refers to this list.
  • dc-term – Qualified Dublin Core field for which this choice list is selecting a value.

Each value-pairs element contains a sequence of pair sub-elements, each of which in turn contains two elements:

  • displayed-value – Name shown (on the web page) for the menu entry.
  • stored-value – Value stored in the DC element when this entry is chosen. Unlike the HTML select tag, there is no way to indicate one of the entries should be the default, so the first entry is always the default choice.
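As an illustrative sketch (the list name and values are examples only), a value-pairs list with its pair sub-elements looks like this:

Code Block
languagexml
titlevalue-pairs sketch (list name and values are examples)
<value-pairs value-pairs-name="common_types" dc-term="type">
  <pair>
    <displayed-value>Article</displayed-value>
    <stored-value>Article</stored-value>
  </pair>
  <pair>
    <displayed-value>Technical Report</displayed-value>
    <stored-value>Technical Report</stored-value>
  </pair>
</value-pairs>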

...

Code Block
languagexml
titleitem-submission.xml excerpt
     <step id="collection">
       <heading></heading> <!--can specify heading, if you want it to appear in Progress Bar-->
       <processing-class>org.dspace.submit.step.StartSubmissionLookupStep</processing-class>
       <jspui-binding>org.dspace.app.webui.submit.step.JSPStartSubmissionLookupStep</jspui-binding>
       <xmlui-binding>org.dspace.app.xmlui.aspect.submission.submit.SelectCollectionStep</xmlui-binding>
       <workflow-editable>false</workflow-editable>
     </step>
Info
titleUI compatibility

The new step is available only for the JSP UI. Nonetheless, if you run both UIs and want the JSP UI benefit of the new step, you can configure it as the processing class for the XML UI as well, since it degrades gracefully to the standard SelectCollectionStep logic.

...

The BTE is a Java framework developed by the Hellenic National Documentation Centre (EKT) and consists of programmatic APIs for filtering and modifying records that are retrieved from various types of data sources (e.g. databases, files, legacy data sources) as well as for outputting them in appropriate standard formats (e.g. database files, txt, xml, Excel). The framework includes independent abstract modules that are executed separately, offering in many cases alternative choices to the user depending on the input data set, the transformation workflow that needs to be executed and the output format that needs to be generated.

The basic idea behind the BTE is a standard workflow that consists of three steps: a data loading step, a processing step (record filtering and modification) and an output generation step. A data loader provides the system with a set of Records, the processing step is responsible for filtering or modifying these records, and the output generator outputs them in the appropriate format.

The standard BTE version offers several predefined Data Loaders as well as Output Generators for basic bibliographic formats. However, Spring Dependency Injection can be utilized to load custom data loaders, filters, modifiers and output generators. 

...

There are four accordion tabs (the default configuration hides the third tab):

1) Search for identifier: In this tab, the user can search for an identifier in the supported online services (currently, arXiv, PubMed, CrossRef and CiNii are supported). The publication results are presented in the tab "Results" in which the user can select the publication to proceed with. This means that a new submission form will be initiated with the form fields prefilled with metadata from the selected publication.

Currently, there are four identifiers that are supported (DOI, PubMed ID, arXiv ID and NAID (CiNii ID)). But these can be extended - refer to the following paragraph regarding the SubmissionLookup service configuration file.

The user can fill in any of the four identifiers. DOI is preferable. Keep in mind that the service can integrate results for the same publication from the different providers, so filling in any of the four identifiers will pretty much do the work. If identifiers for different publications are provided, the service will return a list of publications which will be shown to the user to select from. The selected publication will make it to the submission form, in which some fields will be pre-filled with the publication metadata. The mapping from the input metadata (from arXiv, PubMed, CrossRef or CiNii) to the DSpace metadata schema (and thus, the submission form) is configured in the Spring XML file that is discussed later on - you can see a table at the very end of this chapter.

Through the same file, a user can also extend the providers that the SubmissionLookup service can search publications from.

2) Upload a file: In this tab, the user can upload a file, select the type (bibtex, csv, etc.), see the publications in the "Results" tab and then either select one to proceed with the submission or make all of them "Workspace Items" that can be found in the "Unfinished Submissions" section of the "My DSpace" page.

...

"OFF": All the publications of the uploaded file will be imported in the user's MyDSpace page as "Unfinished Submissions" while the first one will go thought the submission process.

(Regarding the pubmed, crossref and arxiv file upload, you can find the attached file named "sample-files.zip" that contains samples of these three file types)

3) Free search: In this tab, the user can freely search for Title, Author and Year in the four supported providers (PubMed, CrossRef, arXiv and CiNii). By default, the four providers are configured to be disabled for free search, but you can enable it via the configuration file. Thus, initially this accordion tab is not shown to the user unless a data loader is declared as a "search provider" - refer to the following paragraphs.

...

This is the top level bean that describes the SubmissionLookup service. It accepts three properties:

a)  phase1TransformationEngine: the phase 1 BTE transformation engine.

b)  phase2TransformationEngine: the phase 2 BTE transformation engine

 


c)  detailFields: A list of the keys that the user wants to display in the detailed form of a publication. That is, when the results are shown, the user can see the details of each one; the fields that appear in this detailed form are configured by this property. Refer to the table at the very end of this chapter to see the available values. This property is disabled by default, while the list that is shown commented out in the configuration file is the default list for the detailed form.

 

Code Block
languagehtml/xml
<bean id="phase1TransformationEngine" />

The transformation engine for the first phase of the service (from external service to intermediate format)

It accepts three properties:

...

This bean declares the data loaders to be used to load publications from. It has one property, "dataloadersMap", a map that declares key-value pairs, that is, a unique key and the corresponding data loader to be used. This is the point where a new data loader can be added, in case the ones that are already supported do not meet your needs.
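A minimal sketch of the dataloadersMap property, reusing data loader bean names that appear later in this section (the "myformat"/"myCustomDataLoader" entry is hypothetical, and the existing keys shown are illustrative):

Code Block
languagehtml/xml
<property name="dataloadersMap">
    <map>
        <entry key="bibtex" value-ref="bibTeXDataLoader"/>
        <entry key="pubmed" value-ref="pubmedOnlineDataLoader"/>
        <!-- hypothetical custom data loader registered under its own key -->
        <entry key="myformat" value-ref="myCustomDataLoader"/>
    </map>
</property>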

...

Code Block
languagehtml/xml
<bean id="bibTeXDataLoader" />
<bean id="csvDataLoader" />
<bean id="tsvDataLoader" />
<bean id="risDataLoader" />
<bean id="endnoteDataLoader" />
<bean id="pubmedFileDataLoader" />
<bean id="arXivFileDataLoader" />
<bean id="crossRefFileDataLoader" />
<bean id="ciniiFileDataLoader" />
<bean id="pubmedOnlineDataLoader" />
<bean id="arXivOnlineDataLoader" />
<bean id="crossRefOnlineDataLoader" />
<bean id="ciniiOnlineDataLoader" />

These beans are the actual data loaders that are used by the service. They are either "FileDataLoaders" or "SubmissionLookupDataLoaders" as mentioned previously.

...

a)  fieldMap: it is a map that specifies the mapping between the keys that hold the metadata in the input format and the ones that we want to use internally in the BTE. At the end of this article there is a table that summarises the fields that are used from the online services (PubMed, arXiv, CrossRef and CiNii) - which are the ones that the submission lookup step is capable of reading from the online services - and the keys used internally in the BTE.
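As a sketch (the entries are illustrative; the real keys are listed in the table at the end of this chapter, and the exact direction of the mapping should be checked against the shipped configuration), a fieldMap property looks like this:

Code Block
languagehtml/xml
<property name="fieldMap">
    <map>
        <!-- pairs an input-format key with the corresponding internal BTE key (illustrative) -->
        <entry key="articleTitle" value="title"/>
        <entry key="pubDate" value="issued"/>
    </map>
</property>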

Some loaders have more properties:

...

pubmedOnlineDataLoader, crossRefOnlineDataLoader, arXivOnlineDataLoader and ciniiOnlineDataLoader also support another property:

a)  searchProvider: if it is set to true, the data loader supports free search by title, author or year. If at least one of these data loaders is declared as a search provider, the accordion tab "Free search" appears. Otherwise, it stays hidden.
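A sketch of enabling free search on one of these loaders (the property name is taken from the description above; the rest of the bean definition is omitted):

Code Block
languagehtml/xml
<bean id="pubmedOnlineDataLoader">
    <!-- other properties as in the shipped configuration -->
    <!-- declare this loader as a provider for the "Free search" tab -->
    <property name="searchProvider" value="true"/>
</bean>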


...

crossRefOnlineDataLoader and ciniiOnlineDataLoader also have two more properties:

a) apiKey/appId respectively: for both these services you need to acquire (for free) an API key in order to access their online services. For CrossRef, visit: http://www.crossref.org/requestaccount/ and for CiNii visit: https://portaltools.nii.ac.jp/developer/en/

b) maxResults: the maximum number of results that these services will return for your search. By default, this property is commented out, and the default value is 10 for both services.
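A sketch of these two properties on the CrossRef loader (the key value is a placeholder; the rest of the bean definition is omitted):

Code Block
languagehtml/xml
<bean id="crossRefOnlineDataLoader">
    <!-- replace with the API key obtained from CrossRef -->
    <property name="apiKey" value="YOUR-CROSSREF-API-KEY"/>
    <!-- optional; 10 is the default when this property is left commented out -->
    <property name="maxResults" value="10"/>
</bean>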

 

(Regarding the file dataloaders, you can find the attached file named "sample-files.zip" that contains samples of all the file types that the corresponding data loaders can handle)

 

Code Block
languagehtml/xml
<bean id="phase1LinearWorkflow" />

This bean specifies the processing steps to be applied to the records' metadata before they proceed to the output generator of the transformation engine. Currently, three steps are supported, but you can add yours as well.

 

Code Block
languagehtml/xml
<bean id="mapConverter_arxivSubject" />
<bean id="mapConverter_pubstatusPubmed" />
<bean id="removeLastDot" />

These beans are the processing steps that are supported by the 1st phase of the transformation engine. The first two map an incoming value to another one specified in a properties file. The last one removes the trailing dot from the incoming value.

All of them have the property "fieldKeys", which is a list of keys to which the step will be applied.
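For example, a step can be limited to particular keys like this (the keys shown are illustrative):

Code Block
languagehtml/xml
<bean id="removeLastDot">
    <property name="fieldKeys">
        <list>
            <!-- apply this step only to these internal BTE keys (illustrative) -->
            <value>title</value>
            <value>authors</value>
        </list>
    </property>
</bean>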

In case you need to create your own filters and modifiers, follow the instructions below:
  
To create a new filter, you need to extend the following BTE abstract class:

Code Block
gr.ekt.bte.core.AbstractFilter

You will need to implement the following method:

Code Block
languagejava
public abstract boolean isIncluded ( Record  record )

Return false if the specified record needs to be filtered out, otherwise return true.

To create a new modifier, you need to extend the following BTE abstract class:

Code Block
gr.ekt.bte.core.AbstractModifier

You will need to implement the following method:

Code Block
languagejava
public abstract Record modify ( Record record )

within which you can make any changes you like to the record. You can use the Record methods to get the values for a specific key and to load new ones (for the latter, you need to make the Record mutable).

After you create your own filters or modifiers you need to add them in the Spring XML configuration file as in the following example:

Code Block
languagehtml/xml
<bean id="customfilter"   class="org.mypackage.MyFilter" />

<bean id="phase1LinearWorkflow" class="gr.ekt.bte.core.LinearWorkflow">
    <property name="process">
    <list>
		 ... <old filters and modifiers>...
         <ref bean="customfilter" />
    </list>
    </property>
</bean>
Code Block
languagehtml/xml
<bean id="phase2TransformationEngine" />

The transformation engine for the second phase of the service (from the intermediate format to DSpace metadata schema)

Normally, you do not need to touch any of these three properties. You can edit the reference beans instead.

 

Code Block
languagehtml/xml
<bean id="phase2linearWorkflow" />

This bean specifies the processing steps to be applied to the records' metadata before they proceed to the output generator of the transformation engine. Currently, two steps are supported, but you can add yours as well.

 

Code Block
languagehtml/xml
<bean id="fieldMergeModifier" />
<bean id="valueConcatenationModifier" />
<bean id="languageCodeModifier" />

These beans are the processing steps that are supported by the 2nd phase of the transformation engine. The first merges the values of multiple keys into a new key. The second one concatenates the values of a specific key into a unique value. The third one translates the three-letter language code to the two-letter one (i.e. eng to en).
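As an illustrative sketch only (the property names below are assumptions, not the shipped configuration), the fieldMergeModifier could be wired to build a combined key such as "allkeywords", which the table below shows being mapped to dc.subject:

Code Block
languagehtml/xml
<bean id="fieldMergeModifier">
    <!-- property names are hypothetical placeholders; check the shipped Spring file for the real ones -->
    <property name="mergedFieldName" value="allkeywords"/>
    <property name="fieldsToMerge">
        <list>
            <value>keywords</value>
            <value>mesh</value>
        </list>
    </property>
</bean>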

 

Code Block
languagehtml/xml
<bean id="org.dspace.submit.lookup.DSpaceWorkspaceItemOutputGenerator" />

This bean declares the output generator to be used which is, in this case, a DSpaceWorkspaceItem generator. It accepts two properties:

a) outputMap: A map from the intermediate keys to the DSpace metadata schema fields. The table below displays the default output mapping. As you can see, some fields, while they are read from the input source, are not output in DSpace since there are no default metadata schema fields to host them. However, if you create the corresponding metadata field registry entry, you can come back to this configuration to add a mapping between the input field key and the DSpace metadata field.

b) extraMetadataToKeep: A list of DSpace metadata schema fields to keep in the output
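A sketch of these two properties (the entries are illustrative, and the direction of the outputMap entries should be checked against the shipped configuration; the full default mapping is in the table below):

Code Block
languagehtml/xml
<bean id="org.dspace.submit.lookup.DSpaceWorkspaceItemOutputGenerator">
    <property name="outputMap">
        <map>
            <!-- intermediate BTE key paired with a DSpace metadata field (illustrative) -->
            <entry key="title" value="dc.title"/>
            <entry key="issued" value="dc.date.issued"/>
        </map>
    </property>
    <property name="extraMetadataToKeep">
        <list>
            <!-- hypothetical example of a metadata field to keep in the output -->
            <value>dc.description.provenance</value>
        </list>
    </property>
</bean>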


The following table presents the available keys from the online services, the keys that BTE uses in phase1 and the final output map to DSpace metadata fields.

Arxiv | PubMed | CrossRef | CiNii | BTE Key (phase 1) | Extra Keys created by BTE (phase 2) | DSpace Metadata Field | Appears in Detail Form
title | articleTitle | articleTitle | title | title |  | dc.title | yes
published | pubDate | year | issued | issued |  | dc.date.issued | yes
id |  |  |  | url |  |  | 
summary | abstractText |  | description | abstract |  | dc.description.abstract | yes
comment |  |  |  | note |  |  | 
pdfUrl |  |  |  | fulltextUrl |  |  | 
doi | doi | doi |  | doi |  | dc.identifier | yes
journalRef | journalTitle | journalTitle | journal | journal |  | dc.source | yes
author | author | authors | authors | authors |  | dc.contributor.author | yes
authorWithAffiliation |  |  |  | authorsWithAffiliation |  |  | 
primaryCategory |  |  |  | arxivCategory |  | dc.subject | yes
category |  |  |  | arxivCategory |  | dc.subject | 
 | pubmedID |  |  | pubmedID |  |  | 
 | publicationStatus |  |  | publicationStatus |  |  | 
 | pubModel |  |  |  |  |  | 
 | printISSN | printISSN | issn | jissn |  | dc.identifier.issn | yes
 | electronicISSN | electronicISSN |  | jeissn |  |  | 
 | journalVolume | volume | volume | volume |  |  | 
 | journalIssue | issue | issue | issue |  |  | 
 | language |  | language | language |  | dc.language.iso | yes
 | publicationType | doiType |  | subtype |  | dc.type | yes
 | primaryKeyword |  | subjects | keywords | allkeywords | dc.subject | yes
 | secondaryKeyword |  |  | keywords | allkeywords | dc.subject | yes
 | primaryMeshHeading |  |  | mesh | allkeywords | dc.subject | yes
 | secondaryMeshHeading |  |  | mesh | allkeywords | dc.subject | yes
 | startPage | firstPage | spage | firstpage |  |  | 
 | endPage | lastPage | epage | lastpage |  |  | 
 |  | printISBN |  | pisbn |  | dc.identifier.isbn | yes
 |  | electronicISBN |  | eisbn |  |  | 
 |  | editionNumber |  | editionnumber |  |  | 
 |  | seriesTitle |  | seriestitle |  |  | 
 |  | volumeTitle |  | volumetitle |  |  | 
 |  | publicationType |  |  |  |  | 
 |  | editors |  | editors |  | dc.contributor.editor | yes
 |  | translators |  | translators |  | dc.contributor.other | yes
 |  | chairs |  | chairs |  | dc.contributor.other | yes
 |  |  | naid | naid |  |  | 
 |  |  | ncid | ncid |  |  | 
 |  |  | publisher | publisher |  | dc.publisher | yes
...

 

 

Info
titleI can see more beans in the configuration file that are not explained above. Why is this?

The configuration file hosts options for two services: the BatchImport service and the SubmissionLookup service. Thus, some beans that are not used by the SubmissionLookup service are not mentioned in this documentation. However, since both services are based on the BTE, some beans are used by both services.

...