Info | ||
---|---|---|
| ||
ORCID Authority allows you to link up DSpace metadata fields (added during the submission process) to a person's ORCID identifier. The main use case for this feature is to allow you to link author metadata fields to their ORCID identifier. This is a very basic ORCID integration that has existed since DSpace 5.x. |
...
Enabling the ORCID authority control
Enabling ORCID authority control requires settings in local.cfg and an update to the registered beans in "orcid-authority-services.xml", as described below.
Settings to enable in local.cfg
To enable this feature, some changes are required in the local.cfg file. The first step is to activate the authority as a valid option for authority control. This is done by adding an additional plugin to the plugin.named.org.dspace.content.authority.ChoiceAuthority property. An example of this can be found below.
...
The feature relies on the following configuration parameters in authority.cfg, solrauthority.cfg and solrauthorityorcid.cfg. To activate the default settings, it suffices to remove the comment hashes ("#") from the following lines, or to copy them into your local.cfg. See the Configuration section at the bottom of this page for what these parameters mean and how you can tweak them.
Code Block |
---|
# This setting should already be specified in your solrauthority.cfg
solr.authority.server=${solr.server}/${solr.multicorePrefix}authority
# These settings can be found in your authority.cfg (or could be added to local.cfg)
choices.plugin.dc.contributor.author = SolrAuthorAuthority
choices.presentation.dc.contributor.author = authorLookup
authority.controlled.dc.contributor.author = true
authority.author.indexer.field.1=dc.contributor.author |
The final part of configuration is to add the authority consumer in front of the list of event consumers (in dspace.cfg or local.cfg). Add "authority" in front of the list as displayed below.
Code Block |
---|
event.dispatcher.default.consumers = authority, versioning, discovery, eperson |
Importing existing authors & keeping the index up to date
When first enabled, the authority index will be empty. To populate it, run the following script:
Code Block |
---|
[dspace]/bin/dspace index-authority |
This will iterate over every metadata value under authority control and create a record for it in the authority index. Metadata values without an authority key will each be updated with an auto-generated authority key. These will not be matched in any way with other existing records. Metadata values with an authority key that does not yet exist in the index will be indexed with that authority key. Metadata values with an authority key that already exists in the index will be re-indexed the same way. These records remain unchanged.
Different possible use cases for Index-authority script
Metadata value WITHOUT authority key in metadata
“Luyten, Bram” is present in the metadata without any authority key.
GOAL: “Luyten, Bram” gets added in the cache ONCE
All occurrences of “Luyten, Bram” in the DSpace item metadata will become linked with the same generated uid.
Metadata that already has an authority key from an external source (NOT auto-generated by DSpace)
“Snyers, Antoine” is present with authority key “u12345”
The old authority key needs to be preserved in the item metadata and duplicated in the cache.
“u12345” will be copied to the authority cache and used as the authority key there.
Metadata that already has a DSpace-generated uid authority key
Item metadata already contains an author with name “Haak, Danielle” and a uid in the authority field: 3dda2571-6be8-4102-a47b-5748531ae286
This uid is preserved and no new record is created in the authority index.
Processing on records in the authority cache
Running this script again will update the index and keep it clean. For example, if an author occurs in a single item and that item is deleted, the script must be run again to remove the author from the index. When run again, it removes all records that are no longer linked to existing authors in the database.
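The use cases above can be sketched as a small conceptual model. This is not DSpace code: the function name rebuild_authority_index and the dictionary layout are hypothetical, and the sketch assumes that the index is rebuilt from the item metadata on each run, which is one simple way to reproduce the described clean-up of stale records.

```python
import uuid


def rebuild_authority_index(metadata_values):
    """Toy model of an index-authority run (hypothetical, not the DSpace implementation).

    metadata_values: list of dicts with "value" and "authority" (a key or None).
    Returns a fresh index mapping authority key -> value.
    """
    index = {}
    generated_keys = {}  # name -> key generated during this run
    for entry in metadata_values:
        if entry.get("authority") is None:
            # Same name without a key: every occurrence shares one generated uid.
            key = generated_keys.setdefault(entry["value"], str(uuid.uuid4()))
            entry["authority"] = key
        # Existing keys (external or DSpace-generated) are preserved as-is.
        index[entry["authority"]] = entry["value"]
    return index


items = [
    {"value": "Luyten, Bram", "authority": None},
    {"value": "Luyten, Bram", "authority": None},
    {"value": "Snyers, Antoine", "authority": "u12345"},
    {"value": "Haak, Danielle", "authority": "3dda2571-6be8-4102-a47b-5748531ae286"},
]
index = rebuild_authority_index(items)
```

Calling the function again with a shorter list (for example after an item was deleted) yields an index without the record that is no longer referenced, which mirrors the clean-up behaviour described above.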
Submission of new DSpace items - Author lookup
The submission forms have not changed much. The only noticeable difference is an extra button next to the input fields for the author names. Next to the Add button, which is common to all repeatable fields, there is the Lookup & Add button.
Info |
---|
Note: the below screenshots are from DSpace 6.x. They have not yet been updated for 7.x or above. |
Clicking that button opens the Lookup User Interface. If an author name was filled in but not added yet, the Lookup User Interface will immediately perform a search for that name. Otherwise the search field remains empty and a list of known authors is displayed. The list of authors is updated as you type in the search box.
Authors that already appear somewhere in the repository are differentiated from the authors that have been retrieved from ORCID.
The authors retrieved from ORCID have their name italicized and they're listed after the authors that are found in the repository.
Click on one of these names to see more information about them. The message "There's no one selected" will vanish, making room for the author's information. The available information can vary: authors imported from ORCID have an ORCID iD, whereas the others do not. Authors that have been added without lookup only show their last name and first name.
To add an author from the Lookup User Interface, select the author in the list and then click on the "Add This Person" button.
To add an author without lookup, you don't go through the Lookup User Interface. Instead, you simply use the "Add" button in the submission forms.
Admin Edit Item
In the edit metadata page, under the values for the dc.contributor.author fields, an extra line shows the author ID together with a lock icon and a Lookup button. The author ID cannot be changed manually. However the Lookup button will help you change the author name and ID at the same time.
Clicking the Lookup button brings back the Lookup User Interface. This works just the same way as in the submission forms.
Editing existing items using Batch CSV Editing
Instructions on how to use the Batch CSV Editing are found on the Batch Metadata Editing documentation page.
ORCID Integration is provided through the Batch CSV Editing feature with an extra available header, "ORCID:dc.contributor.author". The usual CSV headers only contain the metadata fields, e.g. "dc.contributor.author". In addition to the traditional header, another dc.contributor.author header can be added with the "ORCID:" prefix. The values in this column are expected to be ORCID iDs.
For each of the ORCID authors, a lookup will be done and their names will be added to the metadata. All the non-ORCID authors will be added as well. The authority keys and Solr records are added when the reported changes are applied.
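As an illustration, the sketch below builds such a CSV and checks each value in the ORCID column before writing it. The checksum follows ORCID's published ISO 7064 MOD 11-2 rule for iDs. The function names, the item id and the use of a bare "id" column are assumptions made for this example (see the Batch Metadata Editing page for the authoritative CSV layout). The iD 0000-0002-1825-0097 is ORCID's well-known sample record.

```python
import csv
import io


def orcid_checksum_ok(orcid):
    """Return True if the iD has 16 characters and a valid MOD 11-2 check character."""
    chars = orcid.replace("-", "")
    if len(chars) != 16 or not chars[:15].isdigit():
        return False
    total = 0
    for digit in chars[:15]:
        total = (total + int(digit)) * 2
    result = (12 - total % 11) % 11
    expected = "X" if result == 10 else str(result)
    return chars[15].upper() == expected


def build_batch_csv(rows):
    """rows: iterable of (item_id, author_name, orcid) tuples. Returns CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["id", "dc.contributor.author", "ORCID:dc.contributor.author"])
    for item_id, author_name, orcid in rows:
        if not orcid_checksum_ok(orcid):
            raise ValueError(f"Not a valid ORCID iD: {orcid}")
        writer.writerow([item_id, author_name, orcid])
    return buffer.getvalue()


# Hypothetical item id; the author name column holds a non-ORCID author.
csv_text = build_batch_csv(
    [("00000000-0000-0000-0000-000000000001", "Luyten, Bram", "0000-0002-1825-0097")]
)
```

Validating the column up front avoids submitting a batch edit in which a mistyped iD would trigger a failed or wrong lookup.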
Storage of related metadata
ORCID authorities do more than link a digital identifier to a name. They group a range of related metadata, from alternative names and email addresses to keywords about the author's works and much more. The metadata is obtained by querying the ORCID web services. To avoid querying the ORCID web services every time, all this related metadata is gathered in a "metadata authority cache" that DSpace can access directly.
In practice the cache is provided by an Apache Solr server. When a lookup is made and an author is chosen that is not yet in the cache, a record is created from the ORCID profile and added to the cache with the list of related metadata. The value of the Dublin Core metadata is based on the first and last name as they are set in the ORCID profile. The authority key for this value links directly to the Solr document's id. DSpace does not provide a way to edit these records manually.
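The caching behaviour described above can be pictured with a minimal sketch. It is a conceptual model only: the class AuthorityCache, the fetch_profile callback and the "Last, First" value format are assumptions for illustration, and a plain dictionary stands in for the Solr core.

```python
class AuthorityCache:
    """Toy stand-in for the Solr-backed authority cache (hypothetical names)."""

    def __init__(self, fetch_profile):
        self._records = {}                   # ORCID iD -> cached record
        self._fetch_profile = fetch_profile  # callable that would query ORCID
        self.remote_calls = 0

    def get(self, orcid):
        # Only query the remote service when the author is not cached yet.
        if orcid not in self._records:
            self.remote_calls += 1
            profile = self._fetch_profile(orcid)
            self._records[orcid] = {
                "value": f"{profile['last_name']}, {profile['first_name']}",
                "orcid": orcid,
            }
        return self._records[orcid]


def fake_fetch(orcid):
    # Placeholder for a real ORCID web service call.
    return {"first_name": "Josiah", "last_name": "Carberry"}


cache = AuthorityCache(fake_fetch)
first = cache.get("0000-0002-1825-0097")
second = cache.get("0000-0002-1825-0097")  # served from the cache
```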
The information in the authority cache can be updated by running the following command line operation:
Command used: |
|
Arguments | description |
-i | update specific solr records with the given internal ids (comma-separated) |
-h | prints this help message |
This will iterate over every solr record currently in use (unless the -i argument is provided), query the ORCID web service for the latest data and update the information in the cache. If configured, the script will also update the metadata of the items in the repository where applicable.
The configuration property can be set in config/modules/solrauthority.cfg, or overridden in your local.cfg (see Configuration Reference).
Code Block |
---|
solrauthority.auto-update-items = false | true |
When this property is set to true and the script is run, any update to an authority record's information causes the whole repository to be scanned for that authority. Every metadata field with this authority key will be updated with the value of the updated authority record.
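The effect of the flag can be sketched as follows. This is a simplified model with hypothetical names (apply_authority_update and the item dictionaries are not part of DSpace). It only shows the propagation rule: values are rewritten for matching authority keys, and only when the flag is enabled.

```python
def apply_authority_update(items, authority_key, new_value, auto_update_items=False):
    """Propagate an updated authority value to item metadata.

    Returns the number of metadata values that were changed.
    """
    if not auto_update_items:
        return 0  # the cache record changes, item metadata is left alone
    changed = 0
    for item in items:
        for field in item["metadata"]:
            if field["authority"] == authority_key and field["value"] != new_value:
                field["value"] = new_value
                changed += 1
    return changed


repository = [
    {"metadata": [{"value": "Carberry, J.", "authority": "key-1"}]},
    {"metadata": [{"value": "Luyten, Bram", "authority": "key-2"}]},
]
skipped = apply_authority_update(repository, "key-1", "Carberry, Josiah")
updated = apply_authority_update(
    repository, "key-1", "Carberry, Josiah", auto_update_items=True
)
```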
Configuration
In the Enabling the ORCID authority control section, you have been told to add this block of configuration.
Info | ||
---|---|---|
| ||
For all of the configuration options described below, you can use either dspace.cfg or local.cfg. Either will work. It is possible that, when you compile your code with Maven, and you have tests enabled, your build will fail. DSpace unit tests utilize parts of dspace.cfg, and the configuration options you will utilize below are known to cause unit test errors. The easiest way to avoid this situation is to use the local.cfg file. |
Code Block |
---|
solr.authority.server=${solr.server}/authority
choices.plugin.dc.contributor.author = SolrAuthorAuthority
choices.presentation.dc.contributor.author = authorLookup
authority.controlled.dc.contributor.author = true
authority.author.indexer.field.1=dc.contributor.author
# These ORCID settings are now required for ORCID Authority
orcid.domain-url = https://orcid.org
orcid.api-url = https://pub.orcid.org/v3.0
# You do NOT need to pay for a Member API ID to use ORCID Authority.
# Instead, you just need a Public API ID from a free ORCID account.
# https://info.orcid.org/documentation/features/public-api/
orcid.application-client-id = MYID
orcid.application-client-secret = MYSECRET |
The ORCID Integration feature is an extension of the authority control in DSpace. Most of these properties are explained in detail on the Authority Control of Metadata Values documentation page. These will be revisited, but first we cover the properties that have been newly added.
- The solr.authority.server is the URL to the Solr core. Usually this would be on the solr.server next to the oai, search and statistics cores.
- authority.author.indexer.field.1 and the subsequent increments configure which fields will be indexed in the authority cache. However, before adding extra fields into the Solr cache, please read the section about Adding additional fields under ORCID.
That's it for the novelties. Moving on to the generic authority control properties:
- With the authority.controlled property, every metadata field that needs to be authority controlled is configured. This involves every type of authority control, not only the fields for ORCID integration.
- The choices.plugin should be configured for each metadata field under authority control. Setting the value to SolrAuthorAuthority tells DSpace to use the Solr authority cache for this metadata field (see Storage of related metadata).
- The choices.presentation should be configured for each metadata field as well. The traditional values for this property are select|suggest|lookup. A new value, authorLookup, has been added to be used in combination with the SolrAuthorAuthority choices plugin. While the other values can still be used, authorLookup provides a richer user interface in the form of a popup on the submission page.
- The browse indexes need to point to the new authority-controlled index: webui.browse.index.2 = author:metadata:dc.contributor.*,dc.creator:text should become webui.browse.index.2 = author:metadataAuthority:dc.contributor.author:authority
- More existing configuration properties are available, but their values are independent of this feature and their default values are usually fine: choices.closed, authority.required, authority.minconfidence
For the cache update script, one property can be set in config/modules/solrauthority.cfg:
Code Block |
---|
solrauthority.auto-update-items = false | true |
The default value for when the property is missing is false.
You must also uncomment the ORCID beans in config/spring/api/orcid-authority-services.xml. Leave their values as-is, since they pull their data from orcid.cfg or local.cfg.
Beginning with DSpace 7, you must specify which ORCID API you wish to use. A Client ID/Secret is also required, but can be obtained for free for the Public API: https://info.orcid.org/documentation/features/public-api/
If you are an ORCID Member Institution, you can use the Member API instead. The Member API is required for additional ORCID Integration features, but is NOT required for this basic ORCID Authority feature.
Code Block |
---|
# These ORCID settings are now required for ORCID Authority
# They can be found in your orcid.cfg (or can be added to local.cfg)
orcid.domain-url = https://orcid.org
# You can use either the Public API or Member API in this next setting
orcid.api-url = https://pub.orcid.org/v3.0
# You do NOT need to pay for a Member API ID to use ORCID Authority.
# Instead, you just need a Public API ID from a free ORCID account.
orcid.application-client-id = MYID
orcid.application-client-secret = MYSECRET |
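Before restarting DSpace, it can help to confirm that the Client ID and Secret are accepted by ORCID. The sketch below builds the client-credentials token request that ORCID documents for the Public API (scope /read-public). The function name is hypothetical, the default token URL is an assumption based on the orcid.domain-url value shown above, and the request is only constructed here, not sent.

```python
import urllib.parse
import urllib.request


def build_token_request(client_id, client_secret,
                        token_url="https://orcid.org/oauth/token"):
    """Build (but do not send) a client-credentials token request for ORCID."""
    form = urllib.parse.urlencode({
        "client_id": client_id,
        "client_secret": client_secret,
        "grant_type": "client_credentials",
        "scope": "/read-public",
    }).encode("ascii")
    return urllib.request.Request(
        token_url,
        data=form,
        headers={"Accept": "application/json"},
        method="POST",
    )


# MYID / MYSECRET are the placeholders from the configuration block above.
request = build_token_request("MYID", "MYSECRET")
# To actually send it: urllib.request.urlopen(request)
```

With valid credentials, sending the request should return a JSON document containing an access token. An HTTP error response usually points to a wrong ID, secret or URL.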
Enabling the ORCID beans
You must also uncomment the ORCID beans in config/spring/api/orcid-authority-services.xml. These are commented out by default, as they require the "orcid.*" settings described above.
Simply uncomment these settings as-is and restart Tomcat. They will pull their configuration from orcid.cfg or your local.cfg.
Code Block | ||
---|---|---|
| ||
<!-- This bean & alias are commented out by default. Simply uncomment them -->
<alias name="OrcidSource" alias="AuthoritySource"/>
<bean name="OrcidSource" class="org.dspace.authority.orcid.Orcidv3SolrAuthorityImpl">
<property name="clientId" value="${orcid.application-client-id}" />
<property name="clientSecret" value="${orcid.application-client-secret}" />
<property name="OAUTHUrl" value="${orcid.token-url}" />
<property name="orcidRestConnector" ref="orcidRestConnector"/>
</bean>
<!-- Also uncomment Orcidv3AuthorityValue in the list of supported types below.
The other settings in this AuthorityTypes bean can be left as-is. -->
<bean name="AuthorityTypes" class="org.dspace.authority.AuthorityTypes">
<property name="types">
<list>
<bean class="org.dspace.authority.orcid.Orcidv3AuthorityValue"/>
<bean class="org.dspace.authority.PersonAuthorityValue"/>
</list>
</property>
...
</bean> |
Submission of new DSpace items - Author lookup
When ORCID Authority is enabled, the Author field can be used to search entries in ORCID. Simply type in an Author name to search your locally indexed authors and authors in ORCID.
Select an author entry from the list to add that author. The list of authors is updated as you type.
Authors that already appear somewhere in the repository are differentiated from the authors that have been retrieved from ORCID.
Admin Edit Item
Warning |
---|
Not yet available in DSpace 7.x. The below screenshots are from 6.x |
In the edit metadata page, under the values for the dc.contributor.author fields, an extra line shows the author ID together with a lock icon and a Lookup button. The author ID cannot be changed manually. However the Lookup button will help you change the author name and ID at the same time.
Clicking the Lookup button brings back the Lookup User Interface. This works just the same way as in the submission forms.