Comment: Migrated to Confluence 5.3

DS-COMPOSITE datastream

Content Models may contain this reserved datastream. It lists the datastreams that must exist in subscribing data objects, and the few requirements placed on them.

DSCompositeSchema

The old schema for the datastream can be seen below.

...

Code Block
<dsCompositeModel
        xmlns="info:fedora/fedora-system:def/dsCompositeModel#">
    <dsTypeModel ID="DC">
        <form MIME="text/xml"/>
    </dsTypeModel>
    <dsTypeModel ID="ORIGIN">
        <form MIME="text/xml"/>
    </dsTypeModel>
</dsCompositeModel>

Optional datastreams

First, we amend the schema so that datastream declarations support an "optional" attribute. The JIRA issue for this is http://fedora-commons.org/jira/browse/FCREPO-531

The semantic meaning of optional is:
1. The datastream can exist in the subscribing objects, but does not have to.
2. If the datastream exists, it must adhere to the specification in the content model.

This allows you to express optional datastreams like this:

Code Block

<dsCompositeModel
        xmlns="info:fedora/fedora-system:def/dsCompositeModel#">
    <dsTypeModel ID="DC">
        <form MIME="text/xml"/>
    </dsTypeModel>
    <dsTypeModel ID="ORIGIN" optional="true">
        <form MIME="text/xml"/>
    </dsTypeModel>
</dsCompositeModel>
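
The two rules for optional datastreams can be illustrated with a small check. This is only a sketch and not Fedora's own validation code: the function name check_datastreams and the input shape (a mapping from datastream ID to MIME type) are assumptions made for the example, and only the MIME attribute of each form is compared.

```python
import xml.etree.ElementTree as ET

DSCM_NS = "info:fedora/fedora-system:def/dsCompositeModel#"

# Example DS-COMPOSITE with one required and one optional datastream.
MODEL = """<dsCompositeModel xmlns="info:fedora/fedora-system:def/dsCompositeModel#">
    <dsTypeModel ID="DC">
        <form MIME="text/xml"/>
    </dsTypeModel>
    <dsTypeModel ID="ORIGIN" optional="true">
        <form MIME="text/xml"/>
    </dsTypeModel>
</dsCompositeModel>"""


def check_datastreams(ds_composite_xml, object_datastreams):
    """Return a list of problems for an object checked against a DS-COMPOSITE.

    object_datastreams maps datastream ID to MIME type (hypothetical input shape).
    """
    root = ET.fromstring(ds_composite_xml)
    problems = []
    for type_model in root.findall(f"{{{DSCM_NS}}}dsTypeModel"):
        ds_id = type_model.get("ID")
        # xsd:boolean accepts "true" and "1"; the attribute defaults to false.
        optional = type_model.get("optional", "false") in ("true", "1")
        if ds_id not in object_datastreams:
            # Rule 1: only a non-optional datastream has to exist.
            if not optional:
                problems.append(f"missing required datastream {ds_id}")
            continue
        # Rule 2: a datastream that exists must match a declared form.
        forms = type_model.findall(f"{{{DSCM_NS}}}form")
        allowed = [f.get("MIME") for f in forms if f.get("MIME") is not None]
        if allowed and object_datastreams[ds_id] not in allowed:
            problems.append(
                f"datastream {ds_id} has undeclared MIME type {object_datastreams[ds_id]}"
            )
    return problems


if __name__ == "__main__":
    # ORIGIN is absent, which is allowed because it is optional.
    print(check_datastreams(MODEL, {"DC": "text/xml"}))
    # DC is absent, which is reported because it is required.
    print(check_datastreams(MODEL, {"ORIGIN": "text/xml"}))
```

An optional datastream that is present is still checked against its declared forms, which is the second rule above.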

To allow this, the schema is now

Code Block

<xsd:schema
        targetNamespace="info:fedora/fedora-system:def/dsCompositeModel#"
        xmlns="info:fedora/fedora-system:def/dsCompositeModel#"
        xmlns:xsd="http://www.w3.org/2001/XMLSchema"
        elementFormDefault="qualified"
        attributeFormDefault="unqualified">
    <xsd:element name="dsCompositeModel">
        <xsd:complexType>
            <xsd:sequence>
                <xsd:element minOccurs="0" maxOccurs="unbounded" ref="dsTypeModel"/>
            </xsd:sequence>
        </xsd:complexType>
    </xsd:element>
    <xsd:element name="dsTypeModel">
        <xsd:complexType>
            <xsd:sequence>
                <xsd:element minOccurs="0" maxOccurs="unbounded" ref="form"/>
            </xsd:sequence>
            <xsd:attribute name="ID" use="required" type="xsd:NCName"/>
            <xsd:attribute name="optional" use="optional" type="xsd:boolean" default="false"/>
        </xsd:complexType>
    </xsd:element>
    <xsd:element name="form">
        <xsd:complexType>
            <xsd:attribute name="FORMAT_URI" use="optional" type="xsd:anyURI"/>
            <xsd:attribute name="MIME" use="optional"/>
        </xsd:complexType>
    </xsd:element>
</xsd:schema>


Allowing extensions in DS-COMPOSITE

Since Fedora already uses DS-COMPOSITE to declare the existence of datastreams, it is the natural location to specify restrictions on the contents of datastreams. Unfortunately, the schema for the DS-COMPOSITE datastream does not allow for any extra content. To remedy that, we have made a small change to the schema.

Code Block

<xsd:schema
        targetNamespace="info:fedora/fedora-system:def/dsCompositeModel#"
        xmlns="info:fedora/fedora-system:def/dsCompositeModel#"
        xmlns:xsd="http://www.w3.org/2001/XMLSchema"
        elementFormDefault="qualified"
        attributeFormDefault="unqualified">
    <xsd:element name="dsCompositeModel">
        <xsd:complexType>
            <xsd:sequence>
                <xsd:element minOccurs="0" maxOccurs="unbounded" ref="dsTypeModel"/>
            </xsd:sequence>
        </xsd:complexType>
    </xsd:element>
    <xsd:element name="dsTypeModel">
        <xsd:complexType>
            <xsd:sequence>
                <xsd:element minOccurs="0" maxOccurs="unbounded" ref="form"/>
                <xsd:element minOccurs="0" maxOccurs="unbounded" ref="extension"/>
            </xsd:sequence>
            <xsd:attribute name="ID" use="required"/>
            <xsd:attribute name="optional" use="optional" type="xsd:boolean" default="false"/>
        </xsd:complexType>
    </xsd:element>
    <xsd:element name="extension">
        <xsd:complexType>
            <xsd:sequence>
                <xsd:any namespace="##any" processContents="skip" minOccurs="0" maxOccurs="unbounded"/>
            </xsd:sequence>
            <xsd:attribute name="name" use="required"/>
        </xsd:complexType>
    </xsd:element>
    <xsd:element name="form">
        <xsd:complexType>
            <xsd:attribute name="FORMAT_URI" use="optional" type="xsd:anyURI"/>
            <xsd:attribute name="MIME" use="optional"/>
        </xsd:complexType>
    </xsd:element>
    <xsd:element name="reference">
        <xsd:complexType>
            <xsd:attribute name="type"/>
            <xsd:attribute name="value"/>
        </xsd:complexType>
    </xsd:element>
</xsd:schema>

With this changed schema, the contents could look like this:

Code Block

<dsCompositeModel
        xmlns="info:fedora/fedora-system:def/dsCompositeModel#">
    <dsTypeModel ID="DC">
        <form MIME="text/xml"/>
        <extension name="SCHEMA">

        </extension>
    </dsTypeModel>
    <dsTypeModel ID="ORIGIN">
        <form MIME="text/xml"/>
        <extension name="SCHEMA">

        </extension>
    </dsTypeModel>
</dsCompositeModel>

Note that the <dsTypeModel> and <form> elements are left unchanged. The Fedora code that works with DS-COMPOSITE only looks for these tags, so the new schema will not cause conflicts, and the extensions will be quietly ignored. This is exactly what we want: the change should not make our objects incompatible with an older Fedora.
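
To illustrate why a reader that only looks for <dsTypeModel> and <form> is unaffected by extensions, here is a minimal sketch. It is not Fedora's actual code: the function name declared_forms and the placeholder element inside the extension are hypothetical.

```python
import xml.etree.ElementTree as ET

DSCM_NS = "info:fedora/fedora-system:def/dsCompositeModel#"

# The same declaration, without and with an extension element.
PLAIN = """<dsCompositeModel xmlns="info:fedora/fedora-system:def/dsCompositeModel#">
    <dsTypeModel ID="DC">
        <form MIME="text/xml"/>
    </dsTypeModel>
</dsCompositeModel>"""

EXTENDED = """<dsCompositeModel xmlns="info:fedora/fedora-system:def/dsCompositeModel#">
    <dsTypeModel ID="DC">
        <form MIME="text/xml"/>
        <extension name="SCHEMA">
            <anything xmlns="urn:example:hypothetical"/>
        </extension>
    </dsTypeModel>
</dsCompositeModel>"""


def declared_forms(ds_composite_xml):
    """Map each datastream ID to its declared MIME types.

    Only dsTypeModel and form elements are inspected, so any
    extension children are never visited.
    """
    root = ET.fromstring(ds_composite_xml)
    forms = {}
    for type_model in root.findall(f"{{{DSCM_NS}}}dsTypeModel"):
        mime_types = [
            form.get("MIME")
            for form in type_model.findall(f"{{{DSCM_NS}}}form")
        ]
        forms[type_model.get("ID")] = mime_types
    return forms


if __name__ == "__main__":
    print(declared_forms(PLAIN))
    print(declared_forms(EXTENDED))
```

Both calls see the same declarations, because the extension element is simply never looked up.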

Reserved extensions in DS-COMPOSITE - the SCHEMA extension

We have reserved one extension name.

  • "SCHEMA" - used to specify that the datastream content must adhere to a schema

That leaves the question of where to place the schema itself. Currently, three ways are supported: inlining the schema in DS-COMPOSITE, referencing it by URL, and referencing a datastream in the content model.
Embedding it directly in DS-COMPOSITE makes for a very unreadable datastream. Alternatively, you could just specify a URL to the schema, but this approach has problems too: having the Content Model depend on schemas defined elsewhere, perhaps on remote servers, means that the content models could break through actions totally unrelated to the repository. The best way, we have found, is to embed the schema in a datastream in the content model.

Code Block

<dsCompositeModel
        xmlns="info:fedora/fedora-system:def/dsCompositeModel#"
        xmlns:schema="http://ecm.sourceforge.net/types/dscompositeschema/0/1/#">

    <!-- The DC datastream is declared. Its MIME type must be text/xml. It must adhere to the XML schema residing at the specified URL. -->
    <dsTypeModel ID="DC">
        <form MIME="text/xml"/>
        <extension name="SCHEMA">
            <reference type="url" value="http://www.openarchives.org/OAI/2.0/oai_dc.xsd"/>
        </extension>
    </dsTypeModel>


    <!-- The PBCORE datastream is declared. Its MIME type must be text/xml. It must adhere to the XML schema residing in the PDCORE_SCHEMA datastream in this content model. -->
    <dsTypeModel ID="PBCORE">
        <form MIME="text/xml"/>
        <extension name="SCHEMA">
            <reference type="datastream" value="PDCORE_SCHEMA"/>
        </extension>
    </dsTypeModel>

    <!-- The ORIGIN datastream is declared. Its MIME type must be text/xml. It must adhere to the XML schema inlined here. -->
    <dsTypeModel ID="ORIGIN">
        <form MIME="text/xml"/>
        <extension name="SCHEMA">
            <schema targetNamespace="originNamespace" xmlns="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified" attributeFormDefault="unqualified">
                <element name="origin" type="string"/>
            </schema>
        </extension>
    </dsTypeModel>

</dsCompositeModel>
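
As a rough illustration of how a consumer might tell the three placements apart, the sketch below reads the SCHEMA extension of each datastream declaration. The function name schema_sources and the returned (kind, value) tuples are assumptions made for this example; they are not part of Fedora or of the extension itself.

```python
import xml.etree.ElementTree as ET

DSCM_NS = "info:fedora/fedora-system:def/dsCompositeModel#"
XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Condensed version of the example above: URL, datastream and inline schema.
MODEL = """<dsCompositeModel xmlns="info:fedora/fedora-system:def/dsCompositeModel#">
    <dsTypeModel ID="DC">
        <form MIME="text/xml"/>
        <extension name="SCHEMA">
            <reference type="url" value="http://www.openarchives.org/OAI/2.0/oai_dc.xsd"/>
        </extension>
    </dsTypeModel>
    <dsTypeModel ID="PBCORE">
        <form MIME="text/xml"/>
        <extension name="SCHEMA">
            <reference type="datastream" value="PDCORE_SCHEMA"/>
        </extension>
    </dsTypeModel>
    <dsTypeModel ID="ORIGIN">
        <form MIME="text/xml"/>
        <extension name="SCHEMA">
            <schema targetNamespace="originNamespace" xmlns="http://www.w3.org/2001/XMLSchema">
                <element name="origin" type="string"/>
            </schema>
        </extension>
    </dsTypeModel>
</dsCompositeModel>"""


def schema_sources(ds_composite_xml):
    """Map datastream ID to (kind, value) for its SCHEMA extension, if any.

    kind is the reference type ("url" or "datastream") or "inline";
    for inline schemas the value is the schema's targetNamespace.
    """
    root = ET.fromstring(ds_composite_xml)
    sources = {}
    for type_model in root.findall(f"{{{DSCM_NS}}}dsTypeModel"):
        for extension in type_model.findall(f"{{{DSCM_NS}}}extension"):
            if extension.get("name") != "SCHEMA":
                continue  # other extension names are not interpreted here
            reference = extension.find(f"{{{DSCM_NS}}}reference")
            inline = extension.find(f"{{{XSD_NS}}}schema")
            if reference is not None:
                sources[type_model.get("ID")] = (
                    reference.get("type"),
                    reference.get("value"),
                )
            elif inline is not None:
                sources[type_model.get("ID")] = (
                    "inline",
                    inline.get("targetNamespace"),
                )
    return sources


if __name__ == "__main__":
    for ds_id, source in schema_sources(MODEL).items():
        print(ds_id, source)
```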

XML Schema has a few problems. The most apparent one here is that a schema can only declare a single target namespace. To use elements from several namespaces, other schemas must be imported. These cannot be inlined in the schema; they must reside at some other URI. In effect, this means that we can put one schema in a datastream, but it might still import schemas from remote locations. Without a way to tell the schema to import from other datastreams, storing it in a datastream would serve little purpose. To solve this, we have defined the $THIS$ keyword, meaning "this object". It is used like this:

Code Block

<schema targetNamespace="http://www.openarchives.org/OAI/2.0/oai_dc/"
        xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
        xmlns:dc="http://purl.org/dc/elements/1.1/"
        xmlns="http://www.w3.org/2001/XMLSchema"
        elementFormDefault="qualified" attributeFormDefault="unqualified">

    <annotation>
        <documentation>
            XML Schema 2002-03-18 by Pete Johnston.
            Adjusted for usage in the OAI-PMH.
            Schema imports the Dublin Core elements from the DCMI schema for unqualified Dublin Core.
            2002-12-19 updated to use simpledc20021212.xsd (instead of simpledc20020312.xsd)
        </documentation>
    </annotation>

    <import namespace="http://purl.org/dc/elements/1.1/"
            schemaLocation="$THIS$/SIMPLEDC-SCHEMA"/>

    <element name="dc" type="oai_dc:oai_dcType"/>

    <complexType name="oai_dcType">
        <choice minOccurs="0" maxOccurs="unbounded">
            <element ref="dc:title"/>
            <element ref="dc:creator"/>
            <element ref="dc:subject"/>
            <element ref="dc:description"/>
            <element ref="dc:publisher"/>
            <element ref="dc:contributor"/>
            <element ref="dc:date"/>
            <element ref="dc:type"/>
            <element ref="dc:format"/>
            <element ref="dc:identifier"/>
            <element ref="dc:source"/>
            <element ref="dc:language"/>
            <element ref="dc:relation"/>
            <element ref="dc:coverage"/>
            <element ref="dc:rights"/>
        </choice>
    </complexType>

</schema>

Notice that in the import tag, the schema location is given as "$THIS$/SIMPLEDC-SCHEMA". This means that the imported schema resides in the SIMPLEDC-SCHEMA datastream in this content model. This functionality only works in schemas, and only when validating with the built-in validate method. It works both for schemas inlined in DS-COMPOSITE-MODEL and for schemas stored in separate datastreams. You can have multiple schemas imported, in a hierarchical structure, but $THIS$ will always refer to the object containing the DS-COMPOSITE-MODEL in question.

The following setup would not work, or rather, it would attempt to get the SCHEMA2 datastream from content model 1, because $THIS$ refers to the object containing the DS-COMPOSITE-MODEL.

Code Block

Content model 1
  * DS-COMPOSITE-MODEL
    * SOMESTREAM
      * URL to http://localhost/fedora/objects/demo:contentmodel2/SOMESTREAM-SCHEMA

Content model 2
  * SOMESTREAM-SCHEMA
     * xml schema
       * import $THIS$/SCHEMA2
  * SCHEMA2
     * xml schema
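
The $THIS$ substitution, and the reason the setup above fails, can be sketched as follows. This is an illustration under assumptions, not the built-in validate method: the function names are hypothetical, the pid demo:contentmodel1 is made up to stand for content model 1, and the URL layout simply follows the example URL shown above.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"
THIS = "$THIS$"

# Assumed base URL of the object that holds the DS-COMPOSITE-MODEL (content model 1).
CONTENT_MODEL_1 = "http://localhost/fedora/objects/demo:contentmodel1"

# Schema stored in SOMESTREAM-SCHEMA of content model 2, as in the setup above.
SOMESTREAM_SCHEMA = """<schema xmlns="http://www.w3.org/2001/XMLSchema"
        targetNamespace="urn:example:somestream">
    <import namespace="urn:example:schema2" schemaLocation="$THIS$/SCHEMA2"/>
</schema>"""


def resolve_this(schema_xml, ds_composite_owner_url):
    """Replace $THIS$ in import locations with the URL of the object
    that contains the DS-COMPOSITE-MODEL being validated against."""
    root = ET.fromstring(schema_xml)
    for imported in root.iter(f"{{{XSD_NS}}}import"):
        location = imported.get("schemaLocation")
        if location and location.startswith(THIS):
            imported.set(
                "schemaLocation",
                ds_composite_owner_url + location[len(THIS):],
            )
    return root


def import_locations(schema_root):
    """List the schemaLocation of every import in a parsed schema."""
    return [
        imported.get("schemaLocation")
        for imported in schema_root.iter(f"{{{XSD_NS}}}import")
    ]


if __name__ == "__main__":
    resolved = resolve_this(SOMESTREAM_SCHEMA, CONTENT_MODEL_1)
    # The import points at content model 1, which has no SCHEMA2 datastream.
    print(import_locations(resolved))
```

Because the base is always the object holding the DS-COMPOSITE-MODEL, the schema's own location in content model 2 plays no part in the resolution.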