Introduction
DSpace users have expressed the need for DSpace to be able to provide more support for different types of digital objects related to open access publications, such as authors/author profiles, data sets etc. Configurable Entities are designed to meet that need.
In DSpace, an Entity is a special type of Item which often has Relationships to other Entities. In more detail:
- Entity: Every Entity is an Item.
- This means they must belong to a Collection, just like a normal Item. (Community & Collection objects are unchanged and unaffected by Entities.)
- Normal Items are still the "default" Item, and they are unchanged. So, not every Item is an Entity.
- Because Entities are all Items, they are immediately usable in submission/workflow process, batch import/export, OAI-PMH, etc.
- Entity (or Item) Type: Entities all have a "dspace.entity.type" metadata field which defines their Entity/Item "type". For example, this type may be "Person", "Project", "Publication", "Journal", etc. It's highly visible within the User Interface as a label.
- Relationships: Based on that "type", an Entity may be related to other Entities via a Relationship. One Entity type may support several relationship types at once. Examples of relationship types include "isPersonOfProject" or "isPublicationOfAuthor". These relationship types are named based on the Entity "type" (as you can likely tell). Relationships also appear on Entities as metadata using the "relation" schema.
- Virtual Metadata: Entities of different types may also have customized visualizations in the User Interface. These visualizations may also dynamically pull in metadata from related Entities. For example, a Publication entity may be displayed in the User Interface with an author name dynamically pulled in from a related Person entity. The metadata "appears" as though it is part of the Entity you are viewing, but it is dynamically pulled via the Relationship.
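The concepts above can be pictured with a small Python sketch. It is only an illustration, not DSpace code: the `Item` and `Relationship` classes, the `related` helper and the sample names are hypothetical, and only the "dspace.entity.type" field and the relationship names come from the text.

```python
from dataclasses import dataclass, field

ENTITY_TYPE_FIELD = "dspace.entity.type"  # field name taken from the text above


@dataclass
class Item:
    """Hypothetical stand-in for a DSpace Item; not the real DSpace class."""
    name: str
    metadata: dict = field(default_factory=dict)

    @property
    def entity_type(self):
        # A normal Item carries no entity type; an Entity does.
        return self.metadata.get(ENTITY_TYPE_FIELD)

    @property
    def is_entity(self):
        return self.entity_type is not None


@dataclass
class Relationship:
    """One stored link that is read under a different name from each side."""
    left: Item
    right: Item
    name_on_left: str   # name under which the link appears on the left item
    name_on_right: str  # name under which the link appears on the right item


def related(item, relationships):
    """Group the items linked to `item` by the name used on its side."""
    found = {}
    for rel in relationships:
        if rel.left is item:
            found.setdefault(rel.name_on_left, []).append(rel.right)
        elif rel.right is item:
            found.setdefault(rel.name_on_right, []).append(rel.left)
    return found


# Illustrative data only.
paper = Item("Sample article", {ENTITY_TYPE_FIELD: "Publication"})
author = Item("Sample person", {ENTITY_TYPE_FIELD: "Person"})
plain = Item("Plain item")  # a normal Item, not an Entity
links = [Relationship(paper, author, "isAuthorOfPublication", "isPublicationOfAuthor")]
```

Here `related(paper, links)` lists the person under "isAuthorOfPublication", while `related(author, links)` lists the paper under "isPublicationOfAuthor", even though only one link is stored.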
Entities and their Relationships are also completely configurable. DSpace provides some sample models out of the box, which you can use directly or adapt as needed.
The Entity model also has similarities with the Portland Common Data Model (PCDM), with an Entity roughly mapping to a "pcdm:Object" and existing Communities and Collections roughly mapping to a "pcdm:Collection". However, at this time DSpace Entities concentrate more on building a graph structure of relationships, instead of a tree structure.
Default Entity Models
DSpace currently comes with the following Entity models, both of which are defined in [dspace]/config/entities/relationship-types.xml.
These Entity models are not used by default, but may be enabled as described below.
Research Entities
Research Entities include Person, OrgUnit, Project and Publication. They allow you to create author profiles (Person) in DSpace, and relate those people to their department(s) (OrgUnit), grant project(s) (Project) and works (Publication).
- Each publication can link to projects, people and org units
- Each person can link to projects, publications and org units
- Each project can link to publications, people and org units
- Each org unit can link to projects, people and publications
Journals
Journal Entities include Journal, Journal Volume, Journal Issue and Publication (article). They allow you to represent a Journal hierarchy more easily within DSpace, starting at the overall Journal, consisting of multiple Volumes, and each Volume containing multiple Issues. Issues then link to all articles (Publication) which were a part of that journal issue.
NOTE: This model includes the same "Publication" entity as the Research Entities model described above. This Entity overlap allows you to link an article (Publication) both to its author (Person) and to the Journal Issue it appeared in.
Enabling Entities
By default, Entities are not used. However, as described above, several models are available out of the box and may optionally be enabled.
- (Optionally) Configure your entity model: DSpace provides two default entity models defined in [dspace]/config/entities/relationship-types.xml. These models may be used as-is, or modified. You can also design your own model from scratch (see "Designing your own model" below). So, feel free to start by modifying relationship-types.xml, or creating your own model based on the relationship-types.dtd.
- Import that model into the DSpace database: To enable that defined model, you must import it into DSpace. This is achieved by using the "initialize-entities" script:
# The -f option requires a full path to an Entities model configuration file.
[dspace]/bin/dspace initialize-entities -f [dspace]/config/entities/relationship-types.xml
- If an Entity (of same type name) already exists, it will be updated with any new relationships defined in relationship-types.xml
- If an Entity (of same type name) doesn't exist, the new Entity type will be created along with its relationships defined in relationship-types.xml
- Configuration of submission form for Entity
- Configuration of workflow for Entity
- Configuration of community/collection list.
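The create-or-update behavior of the import step described above can be pictured with a short Python sketch. It is not the real import code: the `import_model` function and the dictionary layout are hypothetical. The text does not say what happens to existing relationships that are absent from the file; this sketch keeps them, which is an assumption.

```python
def import_model(existing, model):
    """Merge `model` into `existing`.

    Both arguments map an entity type name to a set of relationship names.
    """
    for type_name, relationships in model.items():
        # setdefault creates the type when it is missing;
        # update then adds only the relationships not yet present.
        existing.setdefault(type_name, set()).update(relationships)
    return existing


# Illustrative data only.
database = {"Person": {"isPublicationOfAuthor"}}
model_file = {
    "Person": {"isPublicationOfAuthor", "isProjectOfPerson"},
    "Project": {"isPersonOfProject"},
}
import_model(database, model_file)
```

After the call, "Person" has gained "isProjectOfPerson" and the previously unknown "Project" type has been created with its relationship.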
Configuration of Entity types and their relations
There are 2 approaches possible here:
- Re-using one of the models previously created in the community
- Designing your own model
Both approaches are explained in detail below.
Re-using one of the models previously created in the community
The models currently available in the community are:
- Research Entities
- Journals
- OpenAire (in progress)
To start using such a model, follow the steps below.
Configuration of the submission forms per item type
A collection should contain only objects of one item type. E.g. there would be one (or even multiple) collections for Person objects. The submission forms of the collection(s) for Person objects would contain metadata only applicable for the Person item type. Since the item type is stored in the metadata, the collection template can be used to store the item type metadata value.
There’s no need to remain stuck in the model where collections represent departments, and both Publication and Person objects should be stored under the department’s collection.
- In the edit collection, an item template should be created containing dspace.entity.type with value Publication, Person or whatever entity/item type should be submitted to that collection.
- In the edit collection, the submit group should be defined to identify who can submit to this collection.
In item-submission.xml, the collection handle should be linked to the applicable submission-definitions in the file, e.g.
<name-map collection-handle="123456789/6" submission-name="People" />
<name-map collection-handle="123456789/279" submission-name="People" />
<name-map collection-handle="123456789/7" submission-name="Project" />
<name-map collection-handle="123456789/280" submission-name="Project" />
<name-map collection-handle="123456789/8" submission-name="OrgUnit" />
<name-map collection-handle="123456789/28" submission-name="Journals" />
<name-map collection-handle="123456789/29" submission-name="JournalVolumes" />
<name-map collection-handle="123456789/30" submission-name="JournalIssues" />
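To make the effect of these name-map entries concrete, the following Python sketch looks up the submission definition for a collection handle. It is an illustration rather than DSpace's own lookup code: the `load_name_map` and `submission_for` helpers are hypothetical, and the wrapping `submission-map` element and the "default" entry are assumptions about the surrounding file.

```python
import xml.etree.ElementTree as ET

# Fragment in the shape of the name-map entries above. The wrapping
# <submission-map> element and the "default" entry are assumptions.
NAME_MAP_XML = """
<submission-map>
    <name-map collection-handle="default" submission-name="traditional" />
    <name-map collection-handle="123456789/6" submission-name="People" />
    <name-map collection-handle="123456789/7" submission-name="Project" />
    <name-map collection-handle="123456789/8" submission-name="OrgUnit" />
</submission-map>
"""


def load_name_map(xml_text):
    """Return {collection handle: submission name} from name-map elements."""
    root = ET.fromstring(xml_text)
    return {
        entry.get("collection-handle"): entry.get("submission-name")
        for entry in root.iter("name-map")
    }


def submission_for(handle, name_map):
    """Pick the submission definition for a collection, else the default entry."""
    return name_map.get(handle, name_map.get("default"))


name_map = load_name_map(NAME_MAP_XML)
```

With this data, `submission_for("123456789/6", name_map)` selects the "People" definition, and a handle that is not listed falls back to the assumed default entry.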
Configuration of workflow for item types
The DSpace workflow can be used for reviewing all objects in the Object Model since these objects are all items, and separate collections can be used. The workflow used for e.g. a Person Object can be configured to be identical to a publication, different from a publication, or use no workflow at all.
The users who are granted permissions to create a Person Object are determined by the submitters group of that collection.
Configuration of community/collection list for item types
The community list is very flexible, and institutions can use it for various purposes. Because each item type uses its own collections, additional collections can be added to the community/collection list. An existing community/collection structure can be extended with collections per item type in two ways:
- Use a single collection for each of the other item types. This is typically useful if the existing structure does not apply to those item types, or if they should not be displayed. In this case, it is also possible to withhold Anonymous READ rights on the People and Projects collections to ensure they are not visible on the community/collection list page.
- Use multiple collections for each of the other item types. This is typically applicable when they should be clearly displayed in the structure.
Both options are illustrated below:
Structure based on the departments:
- Department of Architecture
- Building Technology Program
- Theses - Department of Architecture
- Department of Biology
- Theses - Biology
- People
- Projects
OR
- Department of Architecture
- Building Technology Program
- Theses - Department of Architecture
- People in Department of Architecture
- Projects in Department of Architecture
- Department of Biology
- Theses - Biology
- People in Department of Biology
- Projects in Department of Biology
Structure based on the publication type:
- Books
- Book Chapter
- Edited Volume
- Monograph
- Theses
- Bachelor Thesis
- Doctoral Thesis
- Habilitation Thesis
- Master Thesis
- People
- Projects
OR
- Books
- Book Chapter
- Edited Volume
- Monograph
- Theses
- Bachelor Thesis
- Doctoral Thesis
- Habilitation Thesis
- Master Thesis
- People
- Projects
Designing your own model
When using a different entities model, the new model has to be configured and loaded into your repository.
Thinking about the object model
First step: identify the entity types
- Which types of objects would you want to create items for: e.g. Person, Publication, JournalVolume
- Be careful not to confuse a type with a relationship. A Person is a type; an author is a relationship between the publication and the person
Second step: identify the relationship types
- Which relationship types would you want to create between the entity items from the previous step: e.g. isAuthorOfPublication, isEditorOfPublication, isProjectOfPublication, isOrgUnitOfPerson, isJournalIssueOfPublication
- Multiple relationships between the same 2 types can be created: isAuthorOfPublication, isEditorOfPublication
- Relationships are automatically bidirectional, so no need to worry about whether you want to display the authors in a publication or the publications of an author
Third step: visualize your model
- By creating a drawing of your model, you’ll be able to quickly verify whether anything is missing
Configuring the object model
Configure the model in relationship-types.xml
- Similar to the default relationship-types.xml, configure a relationship type per connection between 2 entity types
- Include the 2 entity type names which are being connected
- Determine a clear and unambiguous name for the relation in both directions
- Optionally: determine the cardinality (min/max occurrences) for the relationships
- Optionally: determine default behavior for copying metadata if the relationship is deleted
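The checklist above can be turned into a quick sanity check before importing a model. The Python sketch below parses a relationship type definition and reports missing names or inconsistent cardinalities. The element names in the sample are assumed from the default relationship-types.xml and should be verified against relationship-types.dtd; the `check_types` helper is hypothetical and is not part of DSpace.

```python
import xml.etree.ElementTree as ET

# Element names are assumed from the default relationship-types.xml;
# verify them against relationship-types.dtd before relying on them.
SAMPLE = """
<relationships>
    <type>
        <leftType>Publication</leftType>
        <rightType>Person</rightType>
        <leftwardType>isAuthorOfPublication</leftwardType>
        <rightwardType>isPublicationOfAuthor</rightwardType>
        <leftCardinality><min>0</min></leftCardinality>
        <rightCardinality><min>0</min></rightCardinality>
    </type>
</relationships>
"""

# Two entity type names plus one relation name per direction.
REQUIRED = ("leftType", "rightType", "leftwardType", "rightwardType")


def check_types(xml_text):
    """Return a list of problems found in the relationship type definitions."""
    problems = []
    for index, rel in enumerate(ET.fromstring(xml_text).iter("type"), start=1):
        for tag in REQUIRED:
            if not (rel.findtext(tag) or "").strip():
                problems.append(f"type {index}: missing <{tag}>")
        for side in ("leftCardinality", "rightCardinality"):
            low = rel.findtext(f"{side}/min")
            high = rel.findtext(f"{side}/max")
            # Cardinality is optional; compare only when both bounds are given.
            if low and high and int(low) > int(high):
                problems.append(f"type {index}: {side} min exceeds max")
    return problems
```

Calling `check_types(SAMPLE)` returns an empty list; removing one of the four required elements, or giving a `min` larger than its `max`, produces a message naming the affected type.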
Configuring the metadata fields
Determining the metadata fields to use
- Dublin Core works for publications, but not for a Person, JournalVolume, …
- There are many standards which can be easily configured: schema.org, eurocris, datacite, …
- Pick a schema which suits your needs
Configure the submission forms
- Add a form in submission-forms.xml for each entity type, containing the relevant metadata fields
- Configure which relationships to create
Configuring the item display pages
- The metadata configuration is not specific to configurable entities.
- Similar to other customizations to the item display pages, configure in Angular which metadata fields to display and their labels. A template per entity type can be created
- The relationship display works the same way as the metadata configuration: configure in Angular which relationships to display and their labels
Configuring virtual metadata
- The isAuthorOfPublication relationship can be displayed for the Publication item as dc.contributor.author
- The isOrgUnitOfPerson relationship can be displayed for the Person item as organization.legalName
- This can be configured in virtual-metadata.xml
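The two mappings above can be sketched in Python to show what "displaying a relationship as a metadata field" amounts to. This is a conceptual illustration only: it does not reflect the format of virtual-metadata.xml, the `VIRTUAL_FIELDS` table and `virtual_metadata` function are hypothetical, and building the author name from person.familyName and person.givenName is an assumption.

```python
# Hypothetical mapping:
# (entity type, relationship name) -> (virtual field, source fields on the related item)
VIRTUAL_FIELDS = {
    ("Publication", "isAuthorOfPublication"):
        ("dc.contributor.author", ["person.familyName", "person.givenName"]),
    ("Person", "isOrgUnitOfPerson"):
        ("organization.legalName", ["organization.legalName"]),
}


def virtual_metadata(entity_type, related):
    """Build the virtual values for one entity.

    `related` maps a relationship name to the metadata dicts of the related items.
    """
    values = {}
    for (etype, relation), (target, sources) in VIRTUAL_FIELDS.items():
        if etype != entity_type:
            continue
        for other in related.get(relation, []):
            # Join whichever source fields the related item actually has.
            parts = [other[name] for name in sources if name in other]
            if parts:
                values.setdefault(target, []).append(", ".join(parts))
    return values


# Illustrative data only.
authors = {"isAuthorOfPublication": [
    {"person.familyName": "Doe", "person.givenName": "Jane"},
]}
publication_view = virtual_metadata("Publication", authors)
```

The resulting values are computed from the related Person each time, so they are not stored on the Publication itself.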
Configuring discovery
- Configure the discovery facets, filters, sort options, …
- The facets for a Person can be job title, organization, project, …
- The filters for a Person can be person.familyName, person.givenName, …