Expectations
This proposal is based on the premise that changes to DSpace metadata characteristics must be backward compatible and retain the same functionality as previously existed, to ease the transition for all existing users of the platform. So many functional areas of DSpace rely on existing metadata functionality that it is critical that any changes in functionality also have well-defined and scripted updates across releases.
The following are some basic features of the proposal:
- Metadata fields can include additional properties for
  - Validation rules such as syntax or vocabulary encodings
  - A flag to designate that the field is required
  - Form field types for input forms
  - Type fields to designate Dublin Core or other metadata schema types; types are initially hard coded, but the new schema registry is extensible
- MetadataField should be extended with methods to derive its "dc" type. In the absence of an assigned type, all fields default to dc.description typing.
- MetadataSchema will be repurposed and extended to support
  - Identification of the types of DSO it may be assigned to
- DSpaceObjects will be extended to allow them to have a specific "MetadataSchema" assigned. For example, different schema can be created for Publications, Theses, Multimedia, and so on, each having a different set of fields.
- DSpace will be able to use the new MetadataSchema registry, which will replace the majority of the input-forms.xml file.
- Additional table(s) will more than likely be required to designate the schema that can be used in a specific collection, and thus the input forms that may be enabled in that collection.
- Inheritance may be used in schema to reduce replication of fields. For example, a base schema could hold the required DSpace fields that are generated during the submission, workflow, and archive processes (title, issued, accessioned, available).
  - New schema may inherit from it to reduce replication of metadata fields.
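The features listed above can be pictured with a small model. The following is only a sketch under stated assumptions: the class names mirror the proposal, but every attribute and method name (`dc_type`, `derive_dc_type`, `all_fields`, `parent`) and the `local.degree.name` field are hypothetical and do not describe an existing DSpace API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class MetadataField:
    """Hypothetical registry entry; attribute names are illustrative only."""
    name: str                          # e.g. "dc.date.issued"
    dc_type: Optional[str] = None      # assigned Dublin Core type, if any
    required: bool = False
    encoding: Optional[str] = None     # e.g. "W3CDTF" or "URI"
    form_type: Optional[str] = None    # input form widget, e.g. "date"

    def derive_dc_type(self) -> str:
        # Per the proposal, fields without an assigned type default
        # to dc.description typing.
        return self.dc_type or "dc.description"


@dataclass
class MetadataSchema:
    """Hypothetical schema ("template") that may inherit from a parent."""
    name: str
    fields: List[MetadataField] = field(default_factory=list)
    parent: Optional["MetadataSchema"] = None

    def all_fields(self) -> Dict[str, MetadataField]:
        # Inherited fields come first; a child may redefine a field by name.
        merged = self.parent.all_fields() if self.parent else {}
        for f in self.fields:
            merged[f.name] = f
        return merged


# Base schema holding the fields generated during submission and archiving.
base = MetadataSchema("base", [
    MetadataField("dc.title", dc_type="dc.title", required=True),
    MetadataField("dc.date.issued", dc_type="dc.date", encoding="W3CDTF"),
    MetadataField("dc.date.accessioned", dc_type="dc.date", encoding="W3CDTF"),
    MetadataField("dc.date.available", dc_type="dc.date", encoding="W3CDTF"),
])

# A thesis schema only declares what it adds to the base schema.
thesis = MetadataSchema("thesis", [
    MetadataField("dc.contributor.advisor", dc_type="dc.contributor"),
    MetadataField("local.degree.name"),  # hypothetical untyped local field
], parent=base)
```

Here `thesis.all_fields()` contains the four base fields plus the two thesis fields, and the untyped local field falls back to `dc.description`. Letting a child redefine an inherited field by name is a design assumption, not something the proposal specifies.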
Primary Objective
The primary objective of this proposal is that the DSpace metadata registry be "naturally" extended to support a richer and more expressive "Metadata Schema". The technical objectives of the proposal are to provide the following features:
- Capability to Define "Metadata Profiles" for specific DSpace Objects and/or types of Objects.
- Capability to Define DCMI "subPropertyOf" relationships outside of the legacy ns.element.qualifier approach
- Capability to have "immutable" DC, DCTERMS and other "well established" namespaces
- Capability to Validate Existing DSpace Item Metadata based on a profile that is assigned either via the parent container or directly to the DSpace Item
- Capability to Apply these profiles similarly to DSpace Communities, Collection, Items, Bundles and Bitstreams.
Another critical feature of this proposal is that the new Schema model should support the above features without significant need to transform either the existing DSpace Item metadata or the registry itself.
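Validating existing item metadata against a profile, without transforming it, could look roughly like the sketch below. The function `validate_item`, the profile contents, and the sample item are all hypothetical; the point is only that validation reads the metadata and reports problems rather than rewriting anything.

```python
from typing import Dict, List, Set


def validate_item(metadata: Dict[str, List[str]],
                  required: Set[str],
                  allowed: Set[str]) -> List[str]:
    """Return a list of problems; the metadata itself is never modified."""
    problems = []
    for name in sorted(required):
        # A field counts as present only if it has a non-blank value.
        if not any(v.strip() for v in metadata.get(name, [])):
            problems.append(f"missing required field: {name}")
    for name in sorted(metadata):
        if name not in allowed:
            problems.append(f"field not in profile: {name}")
    return problems


# Hypothetical "Thesis Item Profile"; the field choices are illustrative only.
thesis_required = {"dc.title", "dc.date.issued"}
thesis_allowed = thesis_required | {"dc.contributor.author",
                                    "dc.contributor.advisor"}

# An existing item, stored as field name -> list of values.
legacy_item = {
    "dc.title": ["A Study of Repositories"],
    "dc.contributor.author": ["Doe, Jane"],
    "dc.subject.ddc": ["020"],
}

report = validate_item(legacy_item, thesis_required, thesis_allowed)
```

For this item the report flags the missing `dc.date.issued` and the out-of-profile `dc.subject.ddc`, while the stored metadata is left exactly as it was.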
Conceptual Definition of "Schema"
The DSpace MetadataSchema registry was designed based on an outdated concept of "Application Profiles" and "Qualified Dublin Core" that predated the current DCMI Abstract Model. Because of this, there are a number of significant shortcomings in the current implementation.
- Namespaces are not "Schema"
- Qualification does not effectively meet needs for use of alternative namespaces while still providing clear mappings to DC for exposing metadata in OAI_DC.
- The Schema and Fields defined are insufficient to support validation of DSpace metadata fields in relation to Item Submission or other methods of Deposit.
The current "DSpace Schema" does not meet the requirements that a schema traditionally serves. A schema traditionally defines a scaffolding or framework of rules against which actual content can be validated. While the current MetadataSchema/Field does restrict what can be assigned to any item in DSpace, it provides no support for validating these assignments, nor does it allow us to define the encoding of the metadata values or whether they are required. At this time, much of the validation, rules and encoding is instead assigned, poorly, at the UI/presentation level in the DSpace Submission input-forms.xml file, and is enforced only in the Describe step of the Submission workflow.
This proposal seeks to extend the definition of the DSpace Metadata Schema to include support for these features, previously found only in the Submission input-forms.xml, and to formalize a strategy for metadata validation as a new core feature of DSpace.
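To illustrate what moving encoding checks out of the input forms and into the core might involve, here is a minimal sketch of per-encoding validators. The encoding names W3CDTF, URI and Literal appear in the field table of this proposal; the `ENCODING_CHECKS` mapping and the function names are assumptions, and the checks are syntactic only.

```python
import re
from urllib.parse import urlparse

# W3CDTF allows YYYY, YYYY-MM, YYYY-MM-DD, and timestamps with a time zone.
_W3CDTF = re.compile(
    r"^\d{4}(-\d{2}(-\d{2}"
    r"(T\d{2}:\d{2}(:\d{2}(\.\d+)?)?(Z|[+-]\d{2}:\d{2}))?"
    r")?)?$"
)


def is_w3cdtf(value: str) -> bool:
    # Syntax check only; impossible dates such as month 13 are not rejected.
    return bool(_W3CDTF.match(value))


def is_uri(value: str) -> bool:
    # Require a scheme plus either an authority or a path.
    parts = urlparse(value)
    return bool(parts.scheme) and bool(parts.netloc or parts.path)


# Hypothetical mapping from a registry encoding name to its checker.
ENCODING_CHECKS = {
    "W3CDTF": is_w3cdtf,
    "URI": is_uri,
    "Literal": lambda value: True,
}


def check_encoding(encoding: str, value: str) -> bool:
    """Validate one metadata value against the encoding named in the registry."""
    return ENCODING_CHECKS[encoding](value)
```

Because the checks are keyed by encoding name rather than by form field, the same rule would apply to every deposit path, not only to the Describe step of the submission workflow.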
...
Repurposing of MetadataSchema and MetadataField as Custom Metadata Template
Rather than having MetadataSchema apply to the namespace of the metadata fields that are allowed by the repository, we recommend that this table be repurposed to embody "templates" of MetadataFields that should be used for specific types of DSpace Objects. Typing would be based on:
- DSpace Object Types (Site, Community, Collection, Item, Bundle, Bitstream)
- DCMI or Other Classes (Collection, Dataset, Event, Image, InteractiveResource, MovingImage, PhysicalObject, Service, Software, Sound, StillImage, Text)
- Custom Local Representation Types, of which the existing Qualified Dublin Core schema will initially be considered one.
The above types will be expressed through the addition of properties to the MetadataSchemaRegistry and MetadataFieldRegistry tables, providing the facility to expand on existing schema and add new ones. Some hypothetical examples of such schema would be:
- Community or Collection Profiles
  - Document Collection Profile
  - Journal Issue Profile
  - Image Gallery Profile
- Item Profiles
  - Scholarly Item Profile
  - Website Item Profile
  - Thesis Item Profile
  - Technical Report Item Profile
  - Journal Article Profile
  - Learning Object Item Profile
- Bitstream Profiles
  - Multimedia Profile
  - Streaming Video Profile
  - Image Profile
  - Document Profile
    - Article
    - Spreadsheet
    - Etc
- Custom
  - Custom Profile for any new type of DSpace content
The above profiles could be applied heterogeneously through metadata attached to any level of the DSpace object hierarchy.
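One way to read "attached to any level of the hierarchy" is that an object uses the profile assigned to it directly and otherwise falls back to the nearest container that has one. The sketch below assumes exactly that rule; the `DSO` class, its attributes, and the "nearest assignment wins" behaviour are hypothetical and are not stated in the proposal.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DSO:
    """Hypothetical stand-in for a DSpace object in the hierarchy."""
    name: str
    parent: Optional["DSO"] = None
    profile: Optional[str] = None   # profile assigned directly, if any


def effective_profile(obj: DSO) -> Optional[str]:
    # Walk towards the root and return the nearest assigned profile.
    node = obj
    while node is not None:
        if node.profile is not None:
            return node.profile
        node = node.parent
    return None


community = DSO("Engineering")
theses = DSO("Theses", parent=community, profile="Thesis Item Profile")
item_a = DSO("item-a", parent=theses)
item_b = DSO("item-b", parent=theses, profile="Technical Report Item Profile")
```

With this rule, `item_a` inherits the collection's Thesis Item Profile, `item_b` keeps its own Technical Report Item Profile, and the community, which has no assignment anywhere above it, resolves to no profile at all.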
Metadata Field Inheritance
Individual metadata fields, like DCMI metadata properties, will support subtyping or inheritance. For example, the DCMI website gives the following:
http://dublincore.org/documents/dcmi-terms/#terms-title
Term Name: | title |
---|---|
URI: | http://purl.org/dc/terms/title |
Label: | Title |
Definition: | A name given to the resource. |
Type of Term: | Property |
Refines: | http://purl.org/dc/elements/1.1/title |
Version: | http://dublincore.org/usage/terms/history/#titleT-002 |
Has Range: | http://www.w3.org/2000/01/rdf-schema#Literal |
A similar level of refinement can be supported for DSpace metadata through the addition of new MetadataFieldRegistry properties that are capable of storing this relationship.
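As a rough illustration of how a stored "refines" relationship could be used, for example to map a local field back to plain Dublin Core when exposing OAI_DC, the sketch below follows the chain until it reaches a property that refines nothing. The three `dc.*` mappings are taken from the field table in this proposal; `local.date.defended` and the function name `root_property` are hypothetical.

```python
from typing import Dict, Optional

# Each field points at the property it refines; None marks a root property.
REFINES: Dict[str, Optional[str]] = {
    "local.date.defended": "dc.date.issued",   # hypothetical local field
    "dc.date.issued": "dc:date",
    "dc.identifier.uri": "dc:identifier",
    "dc.title.alternative": "dc:title",
    "dc:date": None,
    "dc:identifier": None,
    "dc:title": None,
}


def root_property(field_name: str,
                  refines: Optional[Dict[str, Optional[str]]] = None) -> str:
    """Follow the refines chain to the property that refines nothing."""
    table = REFINES if refines is None else refines
    seen = set()
    current = field_name
    while table.get(current) is not None:
        if current in seen:
            # Guard against a misconfigured registry with a refinement loop.
            raise ValueError(f"cyclic refinement involving {current}")
        seen.add(current)
        current = table[current]
    return current
```

A field that is not in the table, or that has no refinement recorded, is returned unchanged, so existing fields would keep working without any registry changes.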
[Diagram: New Metadata Class Diagram]
In the case of DSpace, the extended field registry might look like this:
ID | Field | refines | encoding | default | required | Scope Note |
---|---|---|---|---|---|---|
15 | dc.date.issued | dc:date | W3CDTF | ${now} | true | Date of publication or distribution. |
10 | dc.date | dc:date | W3CDTF | ${now} | | Use qualified form if possible. |
25 | dc.identifier.uri | dc:identifier | URI | | true | Uniform Resource Identifier |
17 | dc.identifier | dc:identifier | Literal | | | |
38 | dc.language.iso | dc:language | RFC5646 | en | | Current ISO standard for language of intellectual content, including country codes (e.g. "en_US"). |
37 | dc.language | dc:language | RFC5646 | en | | Catch-all for non-ISO forms of the language of the item, accommodating harvested values. |
40 | dc.relation | dc:relation | URI | | | Catch-all for references to other related items. |
62 | dc.subject.mesh | dc:subject | URI | | | MEdical Subject Headings |
63 | dc.subject.other | dc:subject | Literal | | | Local controlled vocabulary; global vocabularies will receive specific qualifier. |
57 | dc.subject | dc:subject | Literal | | | Uncontrolled index term. |
65 | dc.title.alternative | dc:title | TEXT | | | Varying (or substitute) form of title proper appearing in item, e.g. abbreviation or translation |
64 | dc.title | dc:title | TEXT | | true | Title statement/title proper. |
66 | dc.type | dc:type | Class | | | Nature or genre of content. |

ID | Field | Scope Note |
---|---|---|
73 | dc.accessibility.imageequivalents | Boolean field, true if images have equivalents |
74 | dc.accessibility.imageequivalentspresentation | Indicates the way image equivalents are presented |
72 | dc.accessibility.imagespresent | Boolean accessibility field |
2 | dc.contributor.advisor | Use primarily for thesis advisor. |
3 | dc.contributor.author | |
4 | dc.contributor.editor | |
5 | dc.contributor.illustrator | |
6 | dc.contributor.other | |
82 | dc.contributor.sponsor | |
1 | dc.contributor | A person, organization, or service responsible for the content of the resource. Catch-all for unspecified contributors. |
7 | dc.coverage.spatial | Spatial characteristics of content. |
8 | dc.coverage.temporal | Temporal characteristics of content. |
9 | dc.creator | Do not use; only for harvested metadata. |
11 | dc.date.accessioned | Date DSpace takes possession of item. |
12 | dc.date.available | Date or date range item became available to the public. |
13 | dc.date.copyright | Date of copyright. |
14 | dc.date.created | Date of creation or manufacture of intellectual content if different from date.issued. |
77 | dc.date.embargountil | Date Embargo will be lifted. |
15 | dc.date.issued | Date of publication or distribution. |
16 | dc.date.submitted | Recommend for theses/dissertations. |
67 | dc.date.updated | The last time the item was updated via the SWORD interface |
10 | dc.date | Use qualified form if possible. |
27 | dc.description.abstract | Abstract or summary. |
76 | dc.description.embargoterms | Description of Embargo Terms |
28 | dc.description.provenance | The history of custody of the item since its creation, including any changes successive custodians made to it. |
29 | dc.description.sponsorship | Information about sponsoring agencies, individuals, or contractual arrangements for the item. |
30 | dc.description.statementofresponsibility | To preserve statement of responsibility from MARC records. |
31 | dc.description.tableofcontents | A table of contents for a given item. |
32 | dc.description.uri | Uniform Resource Identifier pointing to description of this item. |
68 | dc.description.version | The Peer Reviewed status of an item |
26 | dc.description | Catch-all for any description not defined by qualifiers. |
34 | dc.format.extent | Size or duration. |
35 | dc.format.medium | Physical medium. |
36 | dc.format.mimetype | Registered MIME type identifiers. |
33 | dc.format | Catch-all for any format information not defined by qualifiers. |
18 | dc.identifier.citation | Human-readable, standard bibliographic citation of non-DSpace format of this item |
19 | dc.identifier.govdoc | A government document number |
20 | dc.identifier.isbn | International Standard Book Number |
23 | dc.identifier.ismn | International Standard Music Number |
21 | dc.identifier.issn | International Standard Serial Number |
24 | dc.identifier.other | A known identifier type common to a local collection. |
22 | dc.identifier.sici | Serial Item and Contribution Identifier |
69 | dc.identifier.slug | a uri supplied via the sword slug header, as a suggested uri for the item |
25 | dc.identifier.uri | Uniform Resource Identifier |
17 | dc.identifier | Catch-all for unambiguous identifiers not defined by qualified form; use identifier.other for a known identifier common to a local collection instead of unqualified form. |
38 | dc.language.iso | Current ISO standard for language of intellectual content, including country codes (e.g. "en_US"). |
70 | dc.language.rfc3066 | the rfc3066 form of the language for the item |
37 | dc.language | Catch-all for non-ISO forms of the language of the item, accommodating harvested values. |
39 | dc.publisher | Entity responsible for publication, distribution, or imprint. |
44 | dc.relation.haspart | References physically or logically contained item. |
46 | dc.relation.hasversion | References later version. |
47 | dc.relation.isbasedon | References source. |
41 | dc.relation.isformatof | References additional physical form. |
42 | dc.relation.ispartof | References physically or logically containing item. |
43 | dc.relation.ispartofseries | Series name and number within that series, if available. |
48 | dc.relation.isreferencedby | Pointed to by referenced resource. |
51 | dc.relation.isreplacedby | References succeeding item. |
45 | dc.relation.isversionof | References earlier version. |
50 | dc.relation.replaces | References preceding item. |
49 | dc.relation.requires | Referenced resource is required to support function, delivery, or coherence of item. |
52 | dc.relation.uri | References Uniform Resource Identifier for related item. |
40 | dc.relation | Catch-all for references to other related items. |
71 | dc.rights.holder | The owner of the copyright |
54 | dc.rights.uri | References terms governing use and reproduction. |
53 | dc.rights | Terms governing use and reproduction. |
56 | dc.source.uri | Do not use; only for harvested metadata. |
55 | dc.source | Do not use; only for harvested metadata. |
58 | dc.subject.classification | Catch-all for value from local classification system; global classification systems will receive specific qualifier. |
59 | dc.subject.ddc | Dewey Decimal Classification Number |
60 | dc.subject.lcc | Library of Congress Classification Number |
61 | dc.subject.lcsh | Library of Congress Subject Headings |
62 | dc.subject.mesh | MEdical Subject Headings |
63 | dc.subject.other | Local controlled vocabulary; global vocabularies will receive specific qualifier. |
57 | dc.subject | Uncontrolled index term. |
65 | dc.title.alternative | Varying (or substitute) form of title proper appearing in item, e.g. abbreviation or translation |
64 | dc.title | Title statement/title proper. |
66 | dc.type | Nature or genre of content. |
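The `default` and `required` columns in the field table suggest how a submission could be pre-filled and checked. The sketch below uses four rows from that table; the function `apply_defaults` is hypothetical, and reading `${now}` as "the current date in W3CDTF form" is an assumption, since the proposal does not define the placeholder.

```python
from datetime import date
from typing import Dict, List, Optional, Tuple

# Selected rows from the field table: field -> (default, required).
REGISTRY: Dict[str, Tuple[Optional[str], bool]] = {
    "dc.date.issued": ("${now}", True),
    "dc.language.iso": ("en", False),
    "dc.identifier.uri": (None, True),
    "dc.title": (None, True),
}


def apply_defaults(metadata: Dict[str, List[str]],
                   today: date) -> Tuple[Dict[str, List[str]], List[str]]:
    """Return a filled-in copy plus the required fields that are still missing."""
    filled = {name: list(values) for name, values in metadata.items()}
    missing = []
    for name, (default, required) in REGISTRY.items():
        if filled.get(name):
            continue  # the submitter already supplied a value
        if default is not None:
            # Assumption: "${now}" expands to the current date (YYYY-MM-DD).
            value = today.isoformat() if default == "${now}" else default
            filled[name] = [value]
        elif required:
            missing.append(name)
    return filled, missing


filled, missing = apply_defaults({"dc.title": ["Annual Report"]},
                                 date(2011, 5, 17))
```

In this run the issue date and language are filled from their defaults, while `dc.identifier.uri` is reported as missing because it is required and has no default. Passing the date in explicitly keeps the function deterministic and easy to test.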