DSpace users have expressed the need for DSpace to be able to provide more support for different types of digital objects related to open access publications, such as authors/author profiles, data sets etc. Configurable Entities are designed to meet that need.
In DSpace, an Entity is a special type of Item which often has Relationships to other Entities. Breaking it down with more details...
Entities and their Relationships are also completely configurable. DSpace provides some sample models out of the box, which you can use directly or adapt as needed.
The Entity model also has similarities with the Portland Common Data Model (PCDM), with an Entity roughly mapping to a "pcdm:Object" and existing Communities and Collections roughly mapping to a "pcdm:Collection". However, at this time DSpace Entities concentrate more on building a graph structure of relationships, instead of a tree structure.
DSpace currently comes with the following Entity models, both of which are defined in [dspace]/config/entities/relationship-types.xml.
These Entity models are not used by default, but may be enabled as described below.
Research Entities include Person, OrgUnit, Project and Publication. They allow you to create author profiles (Person) in DSpace, and relate those people to their department(s) (OrgUnit), grant project(s) (Project) and works (Publication).
Journal Entities include Journal, Journal Volume, Journal Issue and Publication (article). They allow you to represent a Journal hierarchy more easily within DSpace, starting at the overall Journal, consisting of multiple Volumes, and each Volume containing multiple Issues. Issues then link to all articles (Publication) which were a part of that journal issue.
NOTE: this model includes the same "Publication" entity as the Research Entities model described above. This Entity overlap allows you to link an article (Publication) both to its author (Person) and to the Journal Issue it appeared in.
By default, Entities are not used in DSpace. However, as described above, several models are available out-of-the-box and may optionally be enabled.
Keep in mind, there are a few DSpace import/export features that do not yet support Entities in DSpace 7.0. These will be coming in future 7.x releases. See DSpace Release 7.0 Status for prioritization information, etc.
As described above, DSpace provides two default entity models defined in [dspace]/config/entities/relationship-types.xml.
These models may be used as-is, or modified.
You can also design your own model from scratch (see "Designing your own model" section below). So, feel free to start by modifying relationship-types.xml, or creating your own model based on the relationship-types.dtd.
In order to enable a defined entity model, it MUST be imported into the DSpace database. This is achieved by using the "initialize-entities" script. The example below will import the "out-of-the-box" entity models into your DSpace installation:
# The -f option requires a full path to an Entities model configuration file.
[dspace]/bin/dspace initialize-entities -f [dspace]/config/entities/relationship-types.xml
If an Entity (of same type name) already exists, it will be updated with any new relationships defined in relationship-types.xml
If an Entity (of same type name) doesn't exist, the new Entity type will be created along with its relationships defined in relationship-types.xml
Once imported into the Database, the overall structure is as follows:
Keep in mind, your currently enabled Entity model is defined in your database, and NOT in the "relationship-types.xml". Anytime you want to update your data model, you'd update/create a configuration (like relationship-types.xml) and re-run the "initialize-entities" command.
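Before running "initialize-entities", it can help to check which Entity types and relationships a model file declares. The following is a minimal Python sketch and not part of DSpace. It assumes that each relationship is a <type> element with <leftType>, <rightType>, <leftwardType> and <rightwardType> children; that layout is an assumption you should verify against your own relationship-types.xml and relationship-types.dtd. The inline sample model and the function name summarize_model are illustrative only.

```python
import xml.etree.ElementTree as ET

# Illustrative sample. The element names are an assumption about the layout
# of relationship-types.xml, so check them against your own file and DTD.
SAMPLE_MODEL = """
<relationships>
  <type>
    <leftType>Publication</leftType>
    <rightType>Person</rightType>
    <leftwardType>isAuthorOfPublication</leftwardType>
    <rightwardType>isPublicationOfAuthor</rightwardType>
  </type>
  <type>
    <leftType>Publication</leftType>
    <rightType>Project</rightType>
    <leftwardType>isProjectOfPublication</leftwardType>
    <rightwardType>isPublicationOfProject</rightwardType>
  </type>
</relationships>
"""


def summarize_model(xml_text):
    """Return (sorted entity type names, list of relationship tuples)."""
    root = ET.fromstring(xml_text)
    entity_types = set()
    relationships = []
    for rel in root.iter("type"):
        left = rel.findtext("leftType", default="").strip()
        right = rel.findtext("rightType", default="").strip()
        leftward = rel.findtext("leftwardType", default="").strip()
        rightward = rel.findtext("rightwardType", default="").strip()
        entity_types.update([left, right])
        relationships.append((left, leftward, rightward, right))
    return sorted(entity_types), relationships


entity_types, relationships = summarize_model(SAMPLE_MODEL)
print(entity_types)
for left, leftward, rightward, right in relationships:
    print(f"{left} --{leftward}/{rightward}--> {right}")
```

To inspect a real file, read its contents and pass the text to summarize_model in place of SAMPLE_MODEL.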
Because all Entities are Items, they MUST belong to a Collection. Therefore, the recommended way to create different submission forms per Entity type (e.g. Person, Project, Journal, Publication, etc) is to ensure you create a Collection for each Entity Type (as each Collection can have a custom Submission Form).
How you organize your Entity Types into Collections is up to you. You can create a single Collection for all Entities of that type (e.g. an "Author Profiles" collection could be where all "Person" Entities are submitted/stored). Or, you could create many Collections for each Entity Type (e.g. each Department in your University may have its own Community, and underneath have a "Staff Profiles" Collection where all "Person" Entities for that department are submitted/stored). A few example structures are shown below.
Example Structure based on the departments:
OR
Example Structure based on the publication type:
You should have already created Entity-specific Collections in the previous step. Now, we just need to map those Collections to Submission processes specific to each Entity.
On the backend, you will now need to modify the [dspace]/config/item-submission.xml to "map" this Collection (or Collections) to the submission process for this Entity type.
DSpace provides a sample submission process and form for each out-of-the-box Entity type:
- The sample submission processes are defined in item-submission.xml and named based on the Entity type (e.g. Publication, Person, Project, etc).
- The sample forms are defined in submission-forms.xml, and named in the format "[entityType]Step" (where the entity type is camelcased). For example: "publicationStep", "personStep", "projectStep".
- To customize either, edit the item-submission.xml or submission-forms.xml files.
As of 7.6, you can simply map each Entity Type to a specific submission form as follows in your item-submission.xml (this section already exists, just uncomment it):
<name-map collection-entity-type="Publication" submission-name="Publication"/>
<name-map collection-entity-type="Person" submission-name="Person"/>
<name-map collection-entity-type="Project" submission-name="Project"/>
<name-map collection-entity-type="OrgUnit" submission-name="OrgUnit"/>
<name-map collection-entity-type="Journal" submission-name="Journal"/>
<name-map collection-entity-type="JournalVolume" submission-name="JournalVolume"/>
<name-map collection-entity-type="JournalIssue" submission-name="JournalIssue"/>
In 7.5 and earlier, you needed to map each Collection's handle one by one to a Submission form in item-submission.xml. Map your Collection's handle (findable on the Collection homepage) to the submission form you want it to use. In the below example, we've mapped a single Collection to each of the out-of-the-box Entity types.
<name-map collection-handle="123456789/5" submission-name="Publication"/>
<name-map collection-handle="123456789/6" submission-name="Person"/>
<name-map collection-handle="123456789/7" submission-name="Project"/>
<name-map collection-handle="123456789/8" submission-name="OrgUnit"/>
<name-map collection-handle="123456789/28" submission-name="Journal"/>
<name-map collection-handle="123456789/29" submission-name="JournalVolume"/>
<name-map collection-handle="123456789/30" submission-name="JournalIssue"/>
Once your modifications to the submission process are complete, you will need to restart Tomcat (or your servlet container) to reload the current settings.
As an alternative to a Collection's handle, the Entity Type can be used as the mapping attribute. Instead of specifying the collection handle, use the collection-entity-type attribute with the Entity Type to map (e.g. Person, Project). Note that Collections with that Entity Type must already exist.
<name-map collection-entity-type="Publication" submission-name="Publication"/>
<name-map collection-entity-type="Person" submission-name="Person"/>
<name-map collection-entity-type="Project" submission-name="Project"/>
<name-map collection-entity-type="OrgUnit" submission-name="OrgUnit"/>
<name-map collection-entity-type="Journal" submission-name="Journal"/>
<name-map collection-entity-type="JournalVolume" submission-name="JournalVolume"/>
<name-map collection-entity-type="JournalIssue" submission-name="JournalIssue"/>
Once your modifications to the submission process are complete, you will need to restart Tomcat (or your servlet container) to reload the current settings.
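To make the two mapping styles concrete, here is a minimal Python sketch of how name-map entries could be read and looked up. It is an illustration, not DSpace's SubmissionConfigReader. The function names, the wrapper element used to keep the sample well-formed, and the lookup order (explicit handle first, then Entity Type, then a caller-supplied fallback) are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Sample entries mixing both styles; the wrapper element only keeps the
# fragment well-formed for parsing.
NAME_MAPS = """
<submission-map>
  <name-map collection-handle="123456789/5" submission-name="Publication"/>
  <name-map collection-entity-type="Person" submission-name="Person"/>
  <name-map collection-entity-type="Project" submission-name="Project"/>
</submission-map>
"""


def load_name_maps(xml_text):
    """Split name-map entries into handle-based and entity-type-based lookups."""
    by_handle, by_entity_type = {}, {}
    for entry in ET.fromstring(xml_text).iter("name-map"):
        submission = entry.get("submission-name")
        if entry.get("collection-handle") is not None:
            by_handle[entry.get("collection-handle")] = submission
        elif entry.get("collection-entity-type") is not None:
            by_entity_type[entry.get("collection-entity-type")] = submission
    return by_handle, by_entity_type


def resolve_submission(handle, entity_type, by_handle, by_entity_type, fallback=None):
    """Assumed precedence: explicit handle, then entity type, then fallback."""
    if handle in by_handle:
        return by_handle[handle]
    return by_entity_type.get(entity_type, fallback)


by_handle, by_entity_type = load_name_maps(NAME_MAPS)
print(resolve_submission("123456789/5", "Person", by_handle, by_entity_type))
print(resolve_submission("123456789/99", "Person", by_handle, by_entity_type))
```

With entity-type mapping, a newly created "Person" Collection needs no new name-map line, which is the practical advantage over listing handles one by one.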
DSpace 7.6 requires a Tomcat restart for every new collection
Because SubmissionConfigReader is loaded into memory during initialization, there is currently no way to reload submission forms. So, every time you assign an entity type to a collection, or create a new collection with an associated entity type, you will need to restart Tomcat for that collection to be available in the item submission configuration. A fix for this is in progress.
DSpace 7.6.1 introduced a fix, so a Tomcat restart is no longer needed
DSpace 7.6.1 adds a way to reload Submission Configs, so you no longer need to restart Tomcat after creating a new collection with an entity type, or assigning an entity type to an existing one.
The DSpace workflow can be used for reviewing all objects in the Object Model since these objects are all Items, and separate collections can be used. The workflow used for e.g. a Person Object can be configured to be identical to a publication, different from a publication, or use no workflow at all.
See Configurable Workflow for more information on configuring workflows per Collection.
"Virtual Metadata" is metadata that is dynamically determined (at the time of access) based on an Entity's relationship to other Entities. A basic example is displaying a Person Entity's name in the "dc.contributor.author" field of a related Publication Entity. That "dc.contributor.author" field doesn't actually exist on the Publication, but is dynamically added as "virtual metadata" simply because the Publication is linked to the Person (via a relationship).
Virtual Metadata is configurable for all Entities and all relationships. DSpace comes with default settings for its default Entity model, and those can be found in [dspace]/config/spring/api/virtual-metadata.xml. In that Spring Bean configuration file, you'll find a map of each relationship type to a metadata field & its value. Here's a summary of how it works:
The "org.dspace.content.virtual.VirtualMetadataPopulator" bean maps every Relationship type (from relationship-types.xml) to a <util:map> definition (of a given ID) also in the virtual-metadata.xml
<!-- For example, the isAuthorOfPublication relationship is linked to a map of ID "isAuthorOfPublicationMap" -->
<entry key="isAuthorOfPublication" value-ref="isAuthorOfPublicationMap"/>
That <util:map> definition defines which DSpace metadata field will store the virtual metadata. It also links to the bean which will dynamically define the value of this metadata field.
<!-- In this example, isAuthorOfPublication will be displayed in the "dc.contributor.author" field -->
<!-- The *value* of that field will be defined by the "publicationAuthor_author" bean -->
<util:map id="isAuthorOfPublicationMap">
    <entry key="dc.contributor.author" value-ref="publicationAuthor_author"/>
</util:map>
A bean of that ID then defines the value of the field, based on the related Entity. In this example, these fields are pulled from the related Person entity and concatenated. If the Person has "person.familyName=Jones" and "person.givenName=Jane", then the value of "dc.contributor.author" on the related Publication will be dynamically set to "Jones, Jane".
<bean class="org.dspace.content.virtual.Concatenate" id="publicationAuthor_author">
    <property name="fields">
        <util:list>
            <value>person.familyName</value>
            <value>person.givenName</value>
            <value>organization.legalName</value>
        </util:list>
    </property>
    <property name="separator">
        <value>, </value>
    </property>
    <property name="useForPlace" value="true"/>
    <property name="populateWithNameVariant" value="true"/>
</bean>
If the default Virtual Metadata looks good to you, no changes are needed. If you make any changes, be sure to restart Tomcat to update the bean definitions.
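The concatenation described above can be sketched in a few lines of Python. This is only a model of the behaviour described in the text, not the org.dspace.content.virtual.Concatenate implementation. The function name and the dictionary layout are hypothetical, and the sketch assumes that fields without a value are simply skipped. It does not model the useForPlace or populateWithNameVariant properties.

```python
def concatenate(metadata, fields, separator=", "):
    """Join the values of `fields` found on the related item, in order.

    Fields with no value are skipped (an assumption of this sketch), so a
    Person without organization.legalName yields "familyName, givenName".
    """
    values = []
    for field in fields:
        values.extend(v for v in metadata.get(field, []) if v)
    return separator.join(values)


# Metadata of the related Person entity from the example in the text.
person = {
    "person.familyName": ["Jones"],
    "person.givenName": ["Jane"],
}
fields = ["person.familyName", "person.givenName", "organization.legalName"]

author = concatenate(person, fields)
print(author)  # Jones, Jane
```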
When using a different entities model, the new model has to be configured and loaded into your repository.
First step: identify the entity types
Second step: identify the relationship types
Third step: visualize your model
Configure the model in relationship-types.xml
Determining the metadata fields to use
Configure the submission forms
The original Entities design document is available in Google Docs at: https://docs.google.com/document/d/1wEmHirFzrY3qgGtRr2YBQwGOvH1IuTVGmxDIdnqvwxM/edit
We are working on pulling that information into this Wiki space as a final home, but currently some technical details exist only in that document.
A talk on Configurable Entities was also presented at DSpace 7 at OR2021.
DSpace entities fully support versioning. For the most part, this works like any other item. For example, when creating a new version of an item, a new item is created and all metadata values of the preceding item are copied over to the new item. Special care was taken to version relationships between entities.
To understand how versioning between entities with relationships works, let's walk through the following example:
Consider Volume 1.1 (left side) and Issue 1.1 (right side). Both are archived and both are the first version. Note that on the arrow, representing the relation between the volume and the issue, two booleans and two numbers are indicated.
- The boolean (v) is true if and only if volume 1.1 is the latest version that is relevant to issue 1.1 (even though it may be possible that volume 1.2, the second version of volume 1, exists). This means that on the item page of issue 1.1, a link to the item page of volume 1.1 should be displayed. It also means that searching for (the uuid of) issue 1.1 should yield volume 1.1.
- The boolean (i) is true if and only if issue 1.1 is the latest version that is relevant to volume 1.1 (even though it may be possible that issue 1.2, the second version of issue 1, exists). This means that on the item page of volume 1.1, a link to the item page of issue 1.1 should be displayed. It also means that searching for (the uuid of) volume 1.1 should yield issue 1.1.
- The number (v) indicates the place at which the virtual metadata representing this relationship (if any) will appear on volume 1.1. E.g. using the out-of-the-box configuration in virtual-metadata.xml, metadata field publicationissue.issueNumber of issue 1.1 would appear as metadata field publicationissue.issueNumber on volume 1.1 at place 0 (i.e. as the first metadata value).
- The number (i) indicates the place at which the virtual metadata representing this relationship (if any) will appear on issue 1.1. E.g. using the out-of-the-box configuration in virtual-metadata.xml, metadata field publicationvolume.volumeNumber of volume 1.1 would appear as metadata field publicationvolume.volumeNumber on issue 1.1 at place 0 (i.e. as the first metadata value).
With the groundwork out of the way, let's see what happens when we create a new version of volume 1.1. The new version is not yet archived, because it still has to be edited in the submission UI.
At this moment, when viewing the item page of issue 1.1, the user should only see volume 1.1 (as volume 1.2 is not yet archived). When viewing the item page of volume 1.1, nothing has changed: only a link to issue 1.1 will appear. When viewing the item page of volume 1.2 (e.g. as an admin), a link to issue 1.1 will appear as well.
As soon as volume 1.2 is deposited (archived), the "latest status" of both volume 1.1 and volume 1.2 are updated. When viewing the item page of issue 1.1, volume 1.2 should be visible. When viewing the item pages of the volumes, nothing has changed.
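The hand-over of "latest status" when volume 1.2 is archived can be modelled with a small Python sketch. It is a simplified illustration of the behaviour described above, not DSpace's versioning code. The Relationship class, its field names and the helper functions are hypothetical, and readable names stand in for item uuids.

```python
from dataclasses import dataclass


@dataclass
class Relationship:
    volume: str
    issue: str
    volume_is_latest: bool  # the volume side has "latest status"
    issue_is_latest: bool   # the issue side has "latest status"


def volumes_seen_by(issue, relationships):
    """Volumes linked from the item page of the given issue."""
    return [r.volume for r in relationships
            if r.issue == issue and r.volume_is_latest]


def issues_seen_by(volume, relationships):
    """Issues linked from the item page of the given volume."""
    return [r.issue for r in relationships
            if r.volume == volume and r.issue_is_latest]


def archive_volume(new_volume, old_volume, relationships):
    """Move "latest status" on the volume side from the old version to the new one."""
    for r in relationships:
        if r.volume == old_volume:
            r.volume_is_latest = False
        elif r.volume == new_volume:
            r.volume_is_latest = True


relationships = [
    Relationship("volume 1.1", "issue 1.1", True, True),
    # Copied to the new, unarchived version: not yet "latest" on the volume side.
    Relationship("volume 1.2", "issue 1.1", False, True),
]

before = volumes_seen_by("issue 1.1", relationships)
archive_volume("volume 1.2", "volume 1.1", relationships)
after = volumes_seen_by("issue 1.1", relationships)
print(before, after)
```

Only the volume-side flag changes on archiving, which is why the item pages of both volumes keep showing issue 1.1.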
Let's create another version of the volume (not archived):
And after archiving volume 1.3:
What happens if we create a new version of issue 1.1?
Only the relationship with volume 1.3 is copied. For issue 1.1, no relationship was displayed with volume 1.1 and 1.2. (The relationships still exist in the database, but are not visible in the UI.) For volume 1.1, a relationship to issue 1.1 remains present, but it should not be updated to issue 1.2. For issue 1.2, these relationships are no longer relevant, so they are not copied.
On the item pages of volume 1.1, volume 1.2 and volume 1.3, you should see issue 1.1 (as 1.2 is not archived yet).
Because issue 1.2 is not yet archived, all volumes are still pointing to issue 1.1. Let's archive it:
Now on the item pages of volume 1.1 and volume 1.2, you should see issue 1.1; it's the latest issue at the time that those volumes were superseded by volume 1.3. On the item page of volume 1.3, you'll see issue 1.2. On the item page of issue 1.1 you'll still see volume 1.3 as well.
If you have a closer look at items with relationships, you'll notice two categories of metadata fields that are controlled by DSpace:
- relation.* fields, for example relation.isIssueOfJournalVolume on volume items
- relation.*.latestForDiscovery fields, for example relation.isIssueOfJournalVolume.latestForDiscovery on volume items
Metadata fields of the first category (relation.*) contain all uuids of related items that the current item can see. I.e. a relationship has to exist between the current item and the other item, and the other item needs to have "latest status" for that specific relationship.
As an example take the following state of the previous section:
Item issue 1.1 will contain metadata field relation.isJournalVolumeOfIssue with the uuid of volume 1.3 as its value. Volumes 1.1 and 1.2 are not included because they don't have "latest status" on the relevant relationships.
Metadata fields of the second category (relation.*.latestForDiscovery) contain all uuids of the items for which the current item is visible. I.e. a relationship has to exist between the current item and the other item, and the current item needs to have "latest status" for that specific relationship. These fields are particularly important for indexing and search, because they allow us to surface all the items that a particular item is referring to.
Continuing on the example above, issue 1.1 will have metadata field relation.isJournalVolumeOfIssue.latestForDiscovery containing the uuids of volume 1.1 and 1.2.
With issue 1.1 containing volume 1.1 and 1.2 in relation.isJournalVolumeOfIssue.latestForDiscovery, a search on the volume 1.1 page for all issues containing volume 1.1 will display issue 1.1 thanks to this setup.
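The two definitions can be compared side by side in a short Python sketch. It is an illustration only: the tuple layout and function names are hypothetical, readable names stand in for uuids, and the data assumes the state after volume 1.3 was archived and before issue 1.2 exists. Note that when the second definition is read literally, volume 1.3 also lands in the latestForDiscovery list, because issue 1.1 has "latest status" on that relationship too.

```python
# (volume, issue, volume_is_latest, issue_is_latest)
# Assumed state: volume 1.3 is archived, issue 1.2 does not exist yet.
RELATIONSHIPS = [
    ("volume 1.1", "issue 1.1", False, True),
    ("volume 1.2", "issue 1.1", False, True),
    ("volume 1.3", "issue 1.1", True, True),
]


def relation_field(issue, relationships):
    """relation.isJournalVolumeOfIssue: volumes the issue can see.

    The *other* item (the volume) must have "latest status".
    """
    return [v for v, i, v_latest, _ in relationships
            if i == issue and v_latest]


def latest_for_discovery_field(issue, relationships):
    """relation.isJournalVolumeOfIssue.latestForDiscovery: volumes that see the issue.

    The *current* item (the issue) must have "latest status".
    """
    return [v for v, i, _, i_latest in relationships
            if i == issue and i_latest]


visible = relation_field("issue 1.1", RELATIONSHIPS)
discoverable_from = latest_for_discovery_field("issue 1.1", RELATIONSHIPS)
print(visible)
print(discoverable_from)
```

A search from the volume 1.1 page would match issue 1.1 through the second list, even though issue 1.1 itself no longer links back to volume 1.1.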
DSpace contains several example entity types that support versioning out of the box. What follows is an overview of the requirements to make entity versioning work.
- The relation.* and relation.*.latestForDiscovery metadata fields must be registered in config/registries/relationship-formats.xml. E.g. relation.isAuthorOfPublication, relation.isAuthorOfPublication.latestForDiscovery, relation.isPublicationOfAuthor and relation.isPublicationOfAuthor.latestForDiscovery.
- A discovery configuration with the filter latestVersion:true in discovery.xml. This will be the default search, which ensures older versions are not shown.
- A discovery configuration without the filter latestVersion:true. This should be used for item pages displaying the related items to the current item using the discovery search.
- The out-of-the-box entity types use discovery config <entity-type> and discovery config <entity-type>Relationships for that purpose.
Note that versioning support is enabled by default, but can be turned off by setting versioning.enabled = false in versioning.cfg or local.cfg. For more details on item versioning, see: https://wiki.lyrasis.org/display/DSDOC7x/Item+Level+Versioning.