Users who have permission to ADD content to at least one collection will have an easy-to-use and intuitive user interface (this is related to the use case: End User - Easy and Intuitive Deposit Interface)
three options to start a submission
(priority 1) Drag & drop area to supply a file with bibliographic references or the fulltext. Depending on the recognized file format a pluggable process will be triggered on the server to initialize one or more items
(priority 1) support for RIS, EndNote, BibTeX
(priority 1) support for CSV, TSV
(priority 1) support for PDF file with metadata extraction using fulltext parsing (Grobid library) with fallback to the pdf metadata
(priority 2) the system will automatically assign the submission to one of the collections available to the user, using pluggable logic based on the extracted metadata (out of the box, a mapping between dc.type and collections will be supported)
(priority 3) ... extend to other (?) common file formats
(priority 1) Create a new item specifying an identifier. Depending on the recognised identifier type one or more pluggable providers will be queried to retrieve the actual metadata and maybe the fulltext. The submitter can force the use of a specific collection or rely on the automatic guess described above
(priority 1) retrieve the metadata from Crossref, PubMed, PubMed Europe, arXiv, Cinii, Scopus and Web of Science
(priority 2) retrieve the fulltext from PubMed Europe, arXiv
(priority 3) ... extend to other (?) providers (ADS, DBLP, ...)
(priority 1) Create a new item from scratch. If the submitter can deposit to more than one collection, a dropdown to select the target collection will be shown; otherwise it is hidden. A default could be set to streamline the process.
(priority 1) The submitter or the person responsible for the workflow should be able to change the collection at any time - this is related to the use case: Admin UI - Reviewers can move a submission to a different collection
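The "recognised identifier type" step in the second option above could be sketched roughly as follows. This is only an illustration: the regular expressions, the function names and the mapping from identifier type to providers are assumptions made for the example, not part of the proposal or of any existing DSpace API.

```python
import re

# Hypothetical patterns for a few identifier types; real identifiers
# come in more variants than these expressions accept.
IDENTIFIER_PATTERNS = [
    ("doi", re.compile(r"^(?:https?://(?:dx\.)?doi\.org/|doi:)?(10\.\d{4,9}/\S+)$", re.IGNORECASE)),
    ("arxiv", re.compile(r"^(?:arxiv:)?(\d{4}\.\d{4,5}(?:v\d+)?)$", re.IGNORECASE)),
    ("pmid", re.compile(r"^(?:pmid:)?(\d{1,8})$", re.IGNORECASE)),
]

# Assumed mapping from identifier type to the providers named in the list;
# which provider accepts which identifier is a guess made for this sketch.
PROVIDERS_BY_TYPE = {
    "doi": ["Crossref", "Scopus", "Web of Science"],
    "pmid": ["PubMed", "PubMed Europe"],
    "arxiv": ["arXiv"],
}


def detect_identifier(raw):
    """Return (type, normalized value) for a recognized identifier, else None."""
    value = raw.strip()
    for id_type, pattern in IDENTIFIER_PATTERNS:
        match = pattern.match(value)
        if match:
            return id_type, match.group(1)
    return None


def providers_for(raw):
    """Return the names of the providers that would be queried for this identifier."""
    detected = detect_identifier(raw)
    if detected is None:
        return []
    return PROVIDERS_BY_TYPE[detected[0]]
```

Keeping the patterns and the provider mapping in plain data structures is one way to keep the lookup pluggable: supporting a new provider would mean adding an entry rather than changing the detection code.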
once the submission is started a single form page is presented
(priority 1) options to enrich the record using identifiers or by supplying a file are still available
(priority 1) all the steps are presented as "accordion" panels on the same page, giving immediate access to all the functionalities without the need to follow a single predefined linear process
(priority 1) the requested metadata are configured in an extended input-form configuration file with the following features
(priority 1) all the current options available in the input-form
(priority 1) additional input types: calendar, number
(priority 1) validation support: regex and ranges - this is related to the use case: Structure - Format checking of data entry in input forms
(priority 1) size of the input box
(priority 1) place multiple metadata fields on the same row to save space
(priority 1) optionally require a language to be specified for a metadata field, with support for making one or more values mandatory in specific languages and/or preventing multiple values for the same language
(priority 2) group some metadata together to be able to collect and keep in sync linked information such as the author name and the affiliation, the journal title and the ISSN, etc.
(priority 1) upload files
(priority 1) additional technical metadata will be automatically recorded with a pluggable framework (out-of-box JHove will be integrated) - this is related to the use case: Admin UI - Advanced Preservation - Format characterization
(priority 2) Sherpa/Romeo integration
(priority 3) ability to configure descriptive metadata requested for each file - this is related to the use case: Structure - Describe Individual Bitstream within an Item
(priority 1) deposit license. A license that must be accepted is presented before the submission is sent to the workflow
(priority 1) creative commons license integration
(priority 1) access conditions for the item metadata and for each individual file can be easily set:
inheriting collection policies
setting or adding custom policies for specific groups with/without a date validity
the system should detect near duplicates, allowing the submitter to suspend or proceed with the submission. This is related to the use case: Detect potential duplicates. See also the existing feature in DSpace-CRIS
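The "validation support: regex and ranges" item in the input-form feature list above could look roughly like the sketch below. The FieldRule structure and the validate function are hypothetical names introduced for illustration; they do not reflect the actual input-form configuration schema.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FieldRule:
    """Hypothetical per-field validation settings (not the real input-form schema)."""
    pattern: Optional[str] = None    # regex the whole value must match
    minimum: Optional[float] = None  # inclusive lower bound for numeric fields
    maximum: Optional[float] = None  # inclusive upper bound for numeric fields


def validate(value: str, rule: FieldRule) -> List[str]:
    """Return a list of error messages; an empty list means the value is valid."""
    errors = []
    if rule.pattern is not None and re.fullmatch(rule.pattern, value) is None:
        errors.append(f"value does not match pattern {rule.pattern}")
    if rule.minimum is not None or rule.maximum is not None:
        try:
            number = float(value)
        except ValueError:
            errors.append("value is not a number")
        else:
            if rule.minimum is not None and number < rule.minimum:
                errors.append(f"value is below the minimum {rule.minimum}")
            if rule.maximum is not None and number > rule.maximum:
                errors.append(f"value is above the maximum {rule.maximum}")
    return errors
```

For example, an ISSN-like field could use FieldRule(pattern=r"\d{4}-\d{3}[\dX]") and a publication year field FieldRule(minimum=1900, maximum=2100). Returning all messages instead of stopping at the first failure lets a single-page form show every problem for a field at once.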
Macro changes required to the DSpace architecture
(priority 1) the transition from the current linear, step-based submission wizard to the "single page" model should be clarified:
when are temporary data saved?
how are item-aware services, such as the Sherpa/Romeo integration or the duplicate detection, invoked? Do they require a stored, incomplete workspaceitem, or should they be based on an item mock?
(priority 1) the interaction with the BTE framework needs to be revised to allow storing the original PDF file or linked files in the generated item
(priority 1) support the definition of custom accordion panels, like the current visible custom steps
(priority 2) the inheritance of collection policies should be optional in the item creation
(priority 3) a validation framework to allow pluggable complex rules to be checked against the whole item before the submission should be introduced
(priority 4) it could be convenient to split the inputform in multiple files, one for each template definition
Related tasks
(priority 1) a REST endpoint needs to be created to support item(s) creation from a file containing bibliographic references (RIS, EndNote, BibTeX, etc.) or a PDF with metadata extraction
(priority 1) a REST endpoint needs to be created to list the queued processes for a user
(priority 1) define REST endpoints for the duplicate detection feature and the Sherpa/Romeo integration
(priority 1) define a REST endpoint for the Creative Commons integration
MOCKUPs
Edit form optional accordion (editable mockup)
Files and access condition (editable mockups)
Edit form CC License (editable mockups)
Edit form Distribution License (editable mockups)
11 Comments
Tim Donohue
Questions/Responses heard in today's meeting (2017-08-17 DSpace 7 Working Group Meeting notes):
Art Lowel (Atmire)
I'm not entirely convinced by the explanation: these extra features won't affect the dspace 7 timeline because we were going to make them anyway for a client project.
Every feature you add needs to be taken into account later on. If you put in feature A today that wasn't on the roadmap it may make it harder to implement a feature B next month that was. B might be harder to do because of A, or B might break A, so we'll have to spend time fixing A, etc. All features we add will also need to be maintained later on.
Andrea Bollini (4Science)
I agree with the general consideration from Art. Having a feature developed for a specific project doesn't necessarily mean that it is good for the community as a whole, or that the new feature can be included in the platform without introducing major issues, delays or complexity.
A new feature needs to be evaluated to check whether it meets a general requirement or is instead something specific to the project/customer. In this case I believe that the deduplication alert and merge tool is something generally useful. When a new feature that we recognize as useful for the community is introduced, we can always flag it as beta and disable it by default to reduce maintenance risk.
DSpace is an open source project, so the roadmap needs to be dynamically adapted to the resources offered by the community. If anyone proposes a new functionality, it is our duty to evaluate it before discarding it. The extra complexity added needs to be evaluated and a strategy to manage it put in place. In the case of deduplication we, as contributors of the functionality, are also willing to handle any fix that becomes necessary before the final cut of the 7 release, no matter whether the issue comes from the later introduction of an already planned feature or from the development of a new functionality. As for the extra complexity that the feature could add to the development of other features, we are open to carefully reviewing any feedback to reduce such eventualities and to collaborating on fixes when issues come up. BTW, deduplication is something very isolated, so we don't envision any specific risk here. After the release of DSpace 7 all of us become responsible for the accepted features; they add complexity to the platform, but of course also potential, making it more appealing, which in turn will mean more institutions involved, more resources, etc.
The previous considerations also apply to the "grouping metadata" feature. In that case we suggest including the capability to manage such groups as beta, only in the submission, without necessarily implementing any specific/additional support in other parts of the platform (such as the item detail page, etc.). Institutions that want to use the feature will have to enable it in the inputform configuration. I'm going to provide more details about the "grouping metadata" feature in reply to Tim's comment above.
Andrea Bollini (4Science)
I have added more information on the grouping metadata feature here: Nested metadata in DSpace items. Please note that this is something we have already added in DSpace-CRIS, and it doesn't require any change to the current DSpace data model / database.
It is not only related to author / affiliations, etc.; it has many concrete use cases (project information, journal, conference, etc.).
In past years I have heard of several institutions working this way manually (taking care to add as many "metadata2" as "metadata1"), but this is an error-prone process without support from the UI (when you go back to re-edit the information, reorder the metadata, etc.). Here we propose to address these UI issues without changing the underlying data structure.
Tim Donohue
One feature I think may be missing or unclear in this UI mockup is the ability to easily see what is required before you can successfully deposit. For example, some sites choose to require a file be uploaded, and additional metadata may be required beyond just the "title". Currently, in the mockup the file upload is hidden by default in the accordion, so it may be unclear to a user whether that is required for deposit or not.
One piece of software that displays this sort of feedback well is the Hyrax project from Samvera (aka Hydra).
Andrea Bollini (4Science)
Thanks for the very constructive comments and the pointers to the Hydra mockups (we were not aware of them!).
The use of a side widget to show requirements, state information and maybe permissions is very nice. We will try to integrate this concept into our proposal (right now I don't have any option other than the approach followed by Hydra, but I will explore that with my team and UHasselt). I'm not sure it is possible/easy to include the acceptance of the deposit license here, as it is mostly an optional behavior that in the current DSpace is implemented as a sort of plugin (it is a separate step). Our plan is to maintain the current configurability and extensibility of the submission, allowing institutions to plug in their own behavior/logic. Each "custom step" is expected to be translated into a separate accordion.
The file upload is just an example of a "step" (submission plugin); if it is flagged as required in the configuration, we expect the corresponding accordion to be always present (maybe showing that no files have been uploaded yet).
Art Lowel (Atmire)
I see it is possible to change collections during the submission. What happens with the metadata you've already entered if you change to a collection that uses a different set of metadata fields?
How does the fetch metadata link work? What happens if you've entered different information in certain fields already?
I like not having a predefined linear process, but I don't think that necessitates an accordion. Accordions tend to be disorienting for users, as several things move at once.
In a JavaScript UI, going to a different page is just as fast as opening an accordion, and can be animated as well if you'd like.
We discussed this in the meeting yesterday. I'd say we save whenever any field is changed, but with a max frequency of, for example, once every 5 seconds. In a JavaScript UI, there's no real reason to tie saving to going to a different submission step as it is now.
Andrea Bollini (4Science)
Thanks for the questions; they help us clarify and better document what we are going to implement!
Jonathan Green
This is looking very promising. Really liking what is covered in the descriptions/mockups. I like Tim Donohue's suggestion about having a clear indication of pending/completed submission requirements. It's a good idea if there is room (I get that there might not be though). A related thought on that is that it might be a place to indicate a key bit of information related to the deposit to the user. Specifically I'm thinking if say a deposit has been assigned a (reserved) DOI, it would be great if this could be indicated to the user in a really obvious way. I note though that such a thing may be best (or perhaps has to be for functional reasons) communicated nearer the end of the deposit process.
Tim Donohue
Feedback per the meeting on 2017-09-21:
Tim Donohue
Feedback per Outreach meeting on Oct 11: