...
This proposal is based on the premise that changes to DSpace metadata characteristics must be backward compatible and retain the same functionality as previously existed, to ease the transition for all existing users of the platform. So many functional areas of DSpace rely on existing metadata functionality that it is critical that any changes in functionality also have well-defined and scripted updates across releases. Thus another critical feature of this proposal is that the new Schema model should support the above features without any significant need to transform existing DSpace Item metadata or the registry itself.
...
This proposal seeks to extend the definition of the DSpace Metadata Schema to include support for these features, previously found only in the Submission input-forms.xml, formalizing a strategy for metadata validation in DSpace as a new core feature.
Content Models For DSpace (Extending on MetadataSchema and MetadataField to provide "Metadata Profiles")
Currently, MetadataSchema applies to the namespace of the metadata fields that are allowed by the entire repository. It is instead recommended that this table be repurposed and expanded to support the creation of "Named Profiles" that can be easily assigned to DSpaceObjects as Metadata or Content Models. In this case, typing would initially be based on:
...
The above solution can be easily encoded into the database schema, while the existing MetadataSchema, MetadataField and MetadataValue objects should be easily extensible to support new methods and business logic.
An Example Use-Case
Metadata Schema Registry
In the following example, an additional "dcterms" schema has been created to house the proper dcterms predicates, while the "dc" schema continues to hold the existing qualified dc for legacy purposes.
ID | Namespace | Name |
---|---|---|
1 | | dc |
2 | http://purl.org/dc/terms | dcterms |
Metadata Schema: "dcterms"
Here, "dcterms:xxx" refinements point to a new Schema in the repository that contains the fields required for the typical dcterms namespace. In the current case, with the "item" and "item2" schemas, this schema is not applied directly to Items; instead, its fields are inherited into the defined "item" fields through "refinement".
ID | Field | refines | encoding | default | required | Scope Note |
---|---|---|---|---|---|---|
15 | dcterms.date | rdf:Property | W3CDTF | ${now} | true | Date of publication or distribution. |
25 | dcterms.identifier | rdf:Property | URI | | true | Uniform Resource Identifier |
37 | dcterms.language | rdf:Property | RFC5646 | en | | Catch-all for non-ISO forms of the language of the item, accommodating harvested values. |
40 | dcterms.relation | rdf:Property | URI | | | Catch-all for references to other related items. |
57 | dcterms.subject | rdf:Property | Literal | | | Uncontrolled index term. |
64 | dcterms.title | rdf:Property | Literal | | true | Title statement/title proper. |
66 | dcterms.type | rdf:Property | Class | | | Nature or genre of content. |
... | ... | ... | ... | ... | ... | ... |
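
The extended field attributes in the table above could be carried by a small value type. The following is a minimal sketch only: `FieldDefinition` and `resolveDefault` are hypothetical names, not existing DSpace classes or methods, and reading the `${now}` placeholder as "today's date" is an assumption that the proposal does not spell out.

```java
import java.time.LocalDate;
import java.util.Optional;

public class FieldDefinitionSketch {

    /** Hypothetical extended field row; the components mirror the table columns. */
    record FieldDefinition(int id, String field, String refines, String encoding,
                           String defaultValue, boolean required, String scopeNote) {

        /**
         * Resolve the default value for a newly created item.
         * Assumption: "${now}" stands for the current date.
         */
        Optional<String> resolveDefault(LocalDate today) {
            if (defaultValue == null || defaultValue.isEmpty()) {
                return Optional.empty(); // no default configured for this field
            }
            if ("${now}".equals(defaultValue)) {
                // LocalDate prints as YYYY-MM-DD, which is one of the W3CDTF forms
                return Optional.of(today.toString());
            }
            return Optional.of(defaultValue); // literal default such as "en"
        }
    }

    public static void main(String[] args) {
        FieldDefinition date = new FieldDefinition(15, "dcterms.date", "rdf:Property",
                "W3CDTF", "${now}", true, "Date of publication or distribution.");
        FieldDefinition title = new FieldDefinition(64, "dcterms.title", "rdf:Property",
                "Literal", null, true, "Title statement/title proper.");

        System.out.println(date.field() + " default: " + date.resolveDefault(LocalDate.now()));
        System.out.println(title.field() + " default: " + title.resolveDefault(LocalDate.now()));
    }
}
```

Keeping the default as an unresolved string until an item is created means the registry row itself never needs rewriting, which fits the goal of not transforming existing metadata.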
Metadata Profile Registry
The profile registry defines fields that may be attached to a DSpace Item.
- A new "Profile" has been defined with its own namespace to be allowed on Collections A and B.
- Each custom "Profile" can be applied to a specific DSO type (in this case, Item) via an "Applies To" mapping to objects that are of its type (in the diagram above, this is the profile2dso mapping).
- Each custom "Profile" can be enabled in a specific Container (Community, Collection, Item) via an "Allowed In" mapping (in the diagram above, this is the profile2container mapping).
ID | Namespace | Name | Applies To | Allowed In |
---|---|---|---|---|
1 | | Generic Item | Item | All Collections |
2 | http://mydspace/schema/item2 | Simple Item | Item | Collection A, Collection B |
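
The "Applies To" and "Allowed In" mappings can be read as a lookup: given a DSO type and a container, which profiles are available? The sketch below illustrates that reading. All names (`Profile`, `DsoType`, `profilesFor`) are hypothetical, modeling "All Collections" as an empty set is an assumption, and the namespace of the first profile is left as `null` because the table does not give one.

```java
import java.util.List;
import java.util.Set;

public class ProfileRegistrySketch {

    enum DsoType { COMMUNITY, COLLECTION, ITEM }

    /** Hypothetical profile row; an empty allowedIn set stands for "All Collections". */
    record Profile(int id, String namespace, String name,
                   DsoType appliesTo, Set<String> allowedIn) {

        boolean isAvailableFor(DsoType type, String container) {
            boolean containerOk = allowedIn.isEmpty() || allowedIn.contains(container);
            return appliesTo == type && containerOk;
        }
    }

    // The two rows of the example registry; the first namespace is not given in the table.
    static final List<Profile> REGISTRY = List.of(
            new Profile(1, null, "Generic Item", DsoType.ITEM, Set.of()),
            new Profile(2, "http://mydspace/schema/item2", "Simple Item", DsoType.ITEM,
                    Set.of("Collection A", "Collection B")));

    /** Names of the profiles usable for the given DSO type inside the given container. */
    static List<String> profilesFor(DsoType type, String container) {
        return REGISTRY.stream()
                .filter(p -> p.isAvailableFor(type, container))
                .map(Profile::name)
                .toList();
    }

    public static void main(String[] args) {
        System.out.println(profilesFor(DsoType.ITEM, "Collection A"));
        System.out.println(profilesFor(DsoType.ITEM, "Collection C"));
    }
}
```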
Item Metadata Profile "Generic Item"
The following exemplifies a Profile for generic items that may have many optional fields attached to them.
element | refines | encoding | default | required | Scope Note |
---|---|---|---|---|---|
issued | dcterms:issued | W3CDTF | ${now} | true | Date of publication or distribution. |
date | dcterms:date | W3CDTF | ${now} | | Use qualified form if possible. |
uri | dcterms:identifier | URI | | true | Uniform Resource Identifier |
identifier | dcterms:identifier | Literal | | | Catch-all for unambiguous identifiers not defined by qualified form; use identifier.other for a known identifier common to a local collection instead of unqualified form. |
iso | dcterms:language | RFC5646 | en | | Current ISO standard for language of intellectual content, including country codes (e.g. "en_US"). |
language | dcterms:language | RFC5646 | en | | Catch-all for non-ISO forms of the language of the item, accommodating harvested values. |
haspart | dcterms:relation | URI | | | References physically or logically contained item. |
relation | dcterms:relation | URI | | | Catch-all for references to other related items. |
mesh | dcterms:subject | URI | | | Medical Subject Headings |
other | dcterms:subject | Literal | | | Local controlled vocabulary; global vocabularies will receive specific qualifier. |
subject | dcterms:subject | Literal | | | Uncontrolled index term. |
alternative | dcterms:title | Literal | | | Varying (or substitute) form of title proper appearing in item, e.g. abbreviation or translation |
title | dcterms:title | Literal | | true | Title statement/title proper. |
type | dcterms:type | Class | | | Nature or genre of content. |
... | ... | ... | ... | ... | ... |
Item Metadata Profile "Simple Item"
The second Item profile exemplifies a simple item with a smaller set of fields allowed, but with stricter requirements for populating those fields.
Field | refines | encoding | default | required | Scope Note |
---|---|---|---|---|---|
issued | dcterms:date | W3CDTF | ${now} | true | Date of publication or distribution. |
uri | dcterms:identifier | URI | | true | Uniform Resource Identifier |
language | dcterms:language | RFC5646 | en | true | Catch-all for non-ISO forms of the language of the item, accommodating harvested values. |
mesh | dcterms:subject | URI | | true | Medical Subject Headings |
title | dcterms:title | Literal | | true | Title statement/title proper. |
type | dcterms:type | Class | | true | Nature or genre of content. |
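
To make the "stricter requirements" concrete, the sketch below checks an item's metadata against a profile such as "Simple Item". It is illustrative only: `ProfileField`, `validate` and `matchesEncoding` are hypothetical names rather than DSpace APIs, only the URI and W3CDTF encodings are checked, and the W3CDTF check is a simplification that accepts just the date-only forms (YYYY, YYYY-MM, YYYY-MM-DD).

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class ProfileValidationSketch {

    /** Hypothetical profile row, reduced to the columns needed for validation. */
    record ProfileField(String element, String refines, String encoding, boolean required) {}

    /** Returns one message per violation; an empty list means the metadata conforms. */
    static List<String> validate(List<ProfileField> profile, Map<String, List<String>> metadata) {
        List<String> errors = new ArrayList<>();
        for (ProfileField field : profile) {
            List<String> values = metadata.getOrDefault(field.element(), List.of());
            if (field.required() && values.isEmpty()) {
                errors.add("missing required field: " + field.element());
            }
            for (String value : values) {
                if (!matchesEncoding(field.encoding(), value)) {
                    errors.add("invalid " + field.encoding() + " value for "
                            + field.element() + ": " + value);
                }
            }
        }
        return errors;
    }

    /** Simplified syntax checks; encodings not listed here are accepted unchecked. */
    static boolean matchesEncoding(String encoding, String value) {
        switch (encoding) {
            case "URI":
                try {
                    return new URI(value).isAbsolute(); // must carry a scheme
                } catch (URISyntaxException e) {
                    return false;
                }
            case "W3CDTF":
                return value.matches("\\d{4}(-\\d{2}(-\\d{2})?)?"); // date-only forms
            default:
                return true;
        }
    }

    public static void main(String[] args) {
        List<ProfileField> simpleItem = List.of(
                new ProfileField("issued", "dcterms:date", "W3CDTF", true),
                new ProfileField("uri", "dcterms:identifier", "URI", true),
                new ProfileField("title", "dcterms:title", "Literal", true),
                new ProfileField("type", "dcterms:type", "Class", true));

        Map<String, List<String>> metadata = Map.of(
                "issued", List.of("2013-05-01"),
                "uri", List.of("not a uri"),
                "title", List.of("An example title"));

        // Reports the malformed uri value and the missing type field.
        validate(simpleItem, metadata).forEach(System.out::println);
    }
}
```

Returning a list of messages rather than failing on the first error would let a Describe Step or ItemEdit interface show all problems at once.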
Steps To Getting There
- New fields and tables are added to the database.
- New attributes for the existing DC schema and the addition of the DCTerms Schema should be added to the Registry after it has been extended.
- Creation of several "Item Profiles" that can exemplify different types of Items in DSpace. Each should utilize the new DCTERMS Schema wherever possible.
- Update Tooling: update the DSpace build process to populate any necessary fields in the new MetadataField and Profile tables.
- Improve the user interface and DSO data model to return details pertaining to Profile types for informing the user interface.
- Creation of new Describe Step and ItemEdit interfaces that enforce validation requirements expressed in the Metadata Profile
- Creation of MetadataProfile Administrative Interfaces for managing Profiles.
Summary
The above proposal clarifies that new capabilities may emerge for "Typing", "Restriction" and "Validation" of DSpace objects through extension of the existing data model. The proposed strategy will support stronger typing of not only DSpaceObjects, but also the values of metadata fields, through validation rules such as syntax or vocabulary encodings, requiredness, and Dublin Core or other metadata schema types. DSpace should be able to utilize the new MetadataProfileRegistry as a means to replace large portions of the functionality found in the input-forms.xml file in future DSpace versions.
...