Updating the Qualified Dublin Core registry in DSpace

Historical JIRA references: JIRA-433 and JIRA-805.  

Description: 

Our goal was to tackle the task of "Updating the Qualified Dublin Core registry in DSpace to the latest standards of the DCMI", in coordination with the goal of "Standardizing the Default Namespace." Because the current DCMI Terms standard (which requires implementing range and domain properties) is incompatible with the DSpace data model, we are focusing on the more discrete task of evaluating the elements in the current DSpace "DC" registry and adding or updating them to comply with the DCMI QDC standards (themselves a loose set of standards). As a benefit, the updated QDC allows a fairly direct mapping to the flat DCTERMS, which may ultimately allow a neater migration to DCTERMS.
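To illustrate what a "fairly direct mapping" from QDC to the flat DCTERMS could look like, here is a minimal sketch. The function name to_dcterms and the qualifier table are assumptions made for this example; the table is a small illustrative subset and is neither the registry that ships with DSpace nor the content of the attached spreadsheet.

```python
# The 15 elements of the Dublin Core Metadata Element Set. Each has a
# same-named property in the DCTERMS namespace.
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}

# Illustrative subset (an assumption for this sketch): qualified fields
# whose qualifier corresponds to a flat DCTERMS property.
QUALIFIER_TO_DCTERMS = {
    ("title", "alternative"): "alternative",
    ("description", "abstract"): "abstract",
    ("date", "issued"): "issued",
    ("date", "created"): "created",
    ("coverage", "spatial"): "spatial",
    ("coverage", "temporal"): "temporal",
    ("format", "extent"): "extent",
    ("relation", "ispartof"): "isPartOf",
}


def to_dcterms(field):
    """Map 'dc.element[.qualifier]' to 'dcterms.term', or None if unmapped."""
    parts = field.split(".")
    if parts[0] != "dc" or len(parts) not in (2, 3):
        raise ValueError(f"not a dc.element[.qualifier] field: {field!r}")
    element = parts[1]
    if element not in DC_ELEMENTS:
        return None  # not a standard DC element, so no flat equivalent
    if len(parts) == 2:
        return f"dcterms.{element}"
    term = QUALIFIER_TO_DCTERMS.get((element, parts[2]))
    return f"dcterms.{term}" if term else None
```

Fields that return None (for example a locally added dc.subject.ddc) are the ones that would need a decision during any migration, since the flat DCTERMS has no direct slot for them.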

Our first attempt to evaluate the task and make recommendations is attached as an .xlsx file. Further recommendations hinge on: the community's sense of the importance of locking down the DC registry (and the desired level of lockdown, i.e., at the element or qualifier level); input on which elements are operational to DSpace; and feedback on the desirability of moving closer to the DC Terms standards currently recommended by DCMI.

Questions:

  1. We aimed to integrate our task with the goal of Standardizing the Default Namespace. Is it plausible, from a developer's perspective, to implement this task's "standardized default Dublin Core, un-editable (no changing or deleting fields) except to add fields with qualifiers to existing elements" in the metadata registry? Specifically, would it be possible to largely lock down the DC registry but allow additions to be made only to qualifiers (i.e., no one could add dc.personal.ddc, but they could add dc.subject.ddc)?
  2. Is there a clear preference for locking down the DC registry? 
  3. Given that the DC Terms data model is currently incompatible with DSpace, we aimed to update the QDC registry to the last version of DC that would also comply with the DSpace data model. Is anyone aware of a QDC document that supersedes this one: http://dublincore.org/documents/usageguide/qualifiers.shtml
  4. Has any previous work been done to identify those DSpace-specific fields that should be migrated to a separate registry? (Mentioned here) This may help guide our decision to make the QDC registry more "pure DC" and move the DSpace operational fields to another registry.
  5. For those more familiar with internationalization work: DSpace currently ships with dc.language.rfc3066. Should we add or suggest migrating to dc.language.rfc5646? What would this change or suggestion entail?
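To make the lock-down rule in question 1 concrete, here is a minimal sketch of the behaviour being asked about: shipped fields cannot be changed or deleted, new elements cannot be added, but new qualifiers can be added to existing elements. The class LockedRegistry and its methods are hypothetical names invented for this illustration; nothing here reflects actual DSpace code.

```python
class LockedRegistry:
    """Registry whose shipped fields are fixed; only qualifiers may be added.

    Fields are (element, qualifier) pairs; qualifier is None for an
    unqualified element such as dc.subject.
    """

    def __init__(self, shipped_fields):
        self._shipped = frozenset(shipped_fields)
        self._local = set()
        self._elements = frozenset(element for element, _ in self._shipped)

    def has(self, element, qualifier=None):
        key = (element, qualifier)
        return key in self._shipped or key in self._local

    def add(self, element, qualifier):
        # New elements are refused: dc.personal.ddc is not allowed.
        if element not in self._elements:
            raise PermissionError(f"cannot add new element: dc.{element}")
        if not qualifier:
            raise ValueError("a qualifier is required to extend an element")
        if self.has(element, qualifier):
            raise ValueError(f"field exists: dc.{element}.{qualifier}")
        # New qualifiers on existing elements are fine: dc.subject.ddc.
        self._local.add((element, qualifier))

    def remove(self, element, qualifier=None):
        # Shipped fields are locked; only locally added ones can go.
        if (element, qualifier) in self._shipped:
            raise PermissionError("shipped fields cannot be removed")
        self._local.remove((element, qualifier))
```

With a registry shipped as [("subject", None), ("language", "rfc3066")], calling add("subject", "ddc") succeeds, while add("personal", "ddc") and remove("subject") both raise PermissionError.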

7 Comments

  1. Summary of the DCAT discussion during the Sept 4, 2012 mtg (courtesy of Bram Luyten):

    1. Refining current DSpace DC according to 2005 QDC recommendations VS aiming for adding DCTERMS as a new schema

     The spreadsheet currently proposes the first approach. It's less intrusive and already helps to get closer to more uniformity. However, it seems that the DC community itself, with the DCTERMS namespace, is already a bit further down a different path.

     Adding a new schema and making it optimal for people to convert/migrate existing metadata to this new schema might in the end make it less intrusive than performing changes to the current standard DC which might affect everyone.

    2. Definition of "locked" schemas

    We started off with the question of whether a potential standard schema should be locked at the element level only, or as far as the qualifier level.

    Given that DC allows extensions, my fear is that if we don't allow people to add at least qualifiers, we're being too severe. Therefore, locking could mean that standard elements and their qualifiers can't simply be renamed or deleted from the user interface.

    At the element level, there seemed to be consensus that elements in a "locked" schema should not be added, removed or modified.

    3. Migration path

    The use case of someone starting with a new installation of DSpace with a shiny new standardized namespace seems very different from that of existing DSpace users, who might be confronted with a lot of migration effort & pain.

    If we can't come up with a concept, either with or without help from the developers, to make migration painless & risk-free, there might be very little uptake (assuming we would be able to get something into the code to start with).

    Therefore, it might be a good idea to investigate in detail how far the impact reaches when we start changing things here. A very quick list could be:

    Any process that creates new metadata in DSpace

    - submission forms

    - spreadsheet importer

    - command line import

    - SWORD

    - built in OAI-Harvester

    Any process that displays metadata in the web user interface

    - item pages

    - search, browse, DSpace discovery

    Any process that delivers the metadata (potentially via crosswalks) to other applications

    - OAI server

    - REST API

    - ..

  2. Thanks for these notes from the last meeting, Bram. Apologies for the delay, but here are some snippets I noted during September's discussion of the DC registry update:

     

    Overarching goals

    -We need to have a clearer idea of what we think the ideal is for DC in DSpace. What is our goal, and what steps do we need to take towards it?

     

    QDC v. DCTERMS

    -DCTERMS is less compatible with DSpace's schema, which does not offer authority control or rules (or enforcement). 

    -How do others enforce values in DCTERMS with range and domain? 

     

    Migration and implementation questions

    -Add DCTERMS as its own namespace and provide scripts to migrate to DCTERMS?

    -Lockdown at element or qualifier level? Concern about breaking compliance with DC. 

    -How to ensure backwards compatibility?

    -Ship with a mapping tool that will identify which elements are not compliant?

    -Staged implementation? 
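
    As a rough illustration of the mapping-tool idea above, here is a minimal sketch of a report that sorts a repository's fields into standard fields, locally added qualifiers on standard elements, and fields outside the standard. The function name compliance_report and the sample field lists are assumptions for this example only and do not describe an existing DSpace tool.

```python
def compliance_report(local_fields, standard_fields):
    """Classify 'dc.element[.qualifier]' fields against a standard list."""
    standard = set(standard_fields)
    # Elements that appear anywhere in the standard list.
    standard_elements = {
        f.split(".")[1] for f in standard if f.count(".") >= 1
    }
    report = {"standard": [], "local_qualifier": [], "non_compliant": []}
    for field in sorted(set(local_fields)):
        parts = field.split(".")
        if field in standard:
            report["standard"].append(field)
        elif len(parts) == 3 and parts[1] in standard_elements:
            # Extension of a standard element, e.g. dc.subject.ddc.
            report["local_qualifier"].append(field)
        else:
            # Unknown element or malformed name, e.g. dc.personal.ddc.
            report["non_compliant"].append(field)
    return report
```

    A staged implementation could start by only reporting the "non_compliant" group, leaving any actual migration for a later step.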

     

    Current schema management

    -Open question of how repositories are being used: are additional schemas being added? Are repositories adding elements to the DC schema that ships with DSpace? Or repurposing certain fields that are in the DC registry but not being used for their defined purpose? 

    -Some discussion of the need for more documentation and transparency around adding schemas to DSpace. 

    -Should DSpace ship with an "example" skeleton local schema to encourage the use of a local schema?

  3. At DCAT's request, we have compiled a report around this question, to be discussed during the December 18 DCAT conference call. The report is attached here: Report to DCAT_QDC metadata registry update (12 11 2012).docx

  4. In terms of locking down the standard definitions, here's an approach that might be worth considering:

    http://jena.apache.org/documentation/javadoc/jena/com/hp/hpl/jena/vocabulary/DCTerms.html

  5. Comments from Sarah Shreeves: 

    "I want to strongly urge the group to look at conforming with DCMI terms (http://dublincore.org/documents/dcmi-terms/) - even if we can't conform to the vocabulary, etc, this is the most up to date and current form of the namespace. If we use the dc qualifiers document we will be perpetuating the same problem, IMO. I think we can, as Tim suggests, have a graceful path forward.

    I will admit that a real part of my fear of just moving to DC Qualified is that DSpace--in terms of metadata--will continue to be seen as out of touch with where much of the metadata world is headed."

  6. This issue was discussed at some length in the 12/18/12 DCAT phone call: DCAT Meeting Notes December 18, 2012