A system for specifying relations in Fedora, based on OWL Lite ontologies

In the following these shorthands will refer to the following namespaces

Fedora is a store for digital objects. The exact way they are stored is not important for this discussion. What is important is that the Fedora digital objects have RDF relations to each other. That is, the Fedora digital object repository can be modelled as an RDF graph.

There is one critical difference between Fedora digital objects and a normal RDF graph: each Fedora object contains its own local bit of the graph. You cannot change the number or nature of the relations from object A without editing object A.

An ontology for the Fedora repository should preserve this characteristic. The ontology must be spread out over the entire repository, and you should not be able to break the ontology in one place by changing it somewhere else. In other words, if you have two separate Fedora repositories, each described by its own ontology, and you transfer them to the same repository, the combined ontology should describe the combined set of objects. This leads to the first property of the ontology system:

1. The ontology must not make statements that are global for the entire repository, except for the declaration of the existence of a class or property.

Fedora provides a "class" of objects called Content Models. These represent the classes of data objects and, if specified, contain the description of the data objects. They are the natural place to put the local ontology bits. But here we reach the first problem. The Content models are the classes of data objects, but they are also objects themselves. Describing such a duality requires OWL Full, or something with similar expressive power. Since such expressive languages are difficult for automated systems to reason about, we chose to use a more restricted version, OWL Lite. This imposes the second property of the ontology system.

2. The ontology only describes the data objects. Classes that represent the content models are used, and these classes are distinct from the content model objects.

An ontology with implicit rules, properties or classes could cause problems. When part of the ontology is derived from the whole ontology, the effects of changes to the ontology become difficult to predict. In particular, the removal or introduction of a class could affect the nature of other classes. In effect, someone wanting to use the ontology would have to know the entire ontology in order to extrapolate anything implicit, which conflicts with property 1. To rule this out, the third property is introduced:

3. The ontology must be locally complete, so that every local bit provides the complete description of its local area.

Fedora RDF relations

Fedora does not allow the full RDF specification to be used in the repository. What it basically allows is that each object can have properties relating it to other objects (called relations), and literal properties. There can be no qualifiers on the properties.

There are a number of noteworthy issues about the way Fedora works with RDF. The first is that Fedora objects do not declare an rdf:type property. Instead they use a fedora-model:hasModel property to relate to a Content model. Unfortunately, OWL Lite regards the relations as instances of "owl:ObjectProperty", and "rdf:type" as an "rdf:Property". As these are different, you must use OWL Full to define: fedora-model:hasModel rdfs:subPropertyOf rdf:type
So, in OWL Lite, Content Models cannot be regarded as classes, in violation of property 2. But as this is all that prevents us from using OWL Lite for the ontology, there are hackish ways around it. Thus property 4 is defined.

4. In data objects, all "fedora-model:hasModel" relations are to be regarded as "rdf:type" relations to the class represented by the content model
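
Property 4 can be read as a rewriting step that is applied before any reasoning takes place. The following is a minimal Python sketch of that reading, not the system's actual implementation. The helper name `as_typed_triples` is hypothetical, and the URI used for fedora-model:hasModel is an assumption. The sketch keeps the Content model URI as the class URI, as the examples further down do.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
# Assumed expansion of fedora-model:hasModel.
HAS_MODEL = "info:fedora/fedora-system:def/model#hasModel"


def as_typed_triples(triples):
    """Return a copy of the triples with every hasModel predicate read as rdf:type."""
    return [
        (s, RDF_TYPE if p == HAS_MODEL else p, o)
        for (s, p, o) in triples
    ]


# RELS-EXT of Object_A1 as (subject, predicate, object) triples.
rels_ext = [
    ("info:fedora/demo:Object_A1", HAS_MODEL, "info:fedora/demo:CM_A"),
    ("info:fedora/demo:Object_A1",
     "http://www.statsbiblioteket.dk/demo-relations/#hasB",
     "info:fedora/demo:Object_B1"),
]
typed = as_typed_triples(rels_ext)
```

Only the predicate of the hasModel statement changes; all other relations pass through untouched.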

In Fedora, it is not required that the relations from an object actually refer to another resource. For an ontology this is problematic with regard to property 3. As the Content Model defines the class of an object, using a non-existing content model means that the class is implicitly defined. This leads to the fifth property.

5. All fedora-model:hasModel relations must refer to real Content models
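
A check for property 5 could look like the sketch below. It assumes that the set of existing Content model URIs is already known; how that set is obtained from the repository is not shown. The function name `dangling_models` and the `demo:CM_Missing` identifier are hypothetical, and the hasModel URI is an assumption.

```python
# Assumed expansion of fedora-model:hasModel.
HAS_MODEL = "info:fedora/fedora-system:def/model#hasModel"


def dangling_models(triples, content_models):
    """Return the hasModel targets that are not existing Content models."""
    return sorted({
        o for (s, p, o) in triples
        if p == HAS_MODEL and o not in content_models
    })


known = {"info:fedora/demo:CM_A", "info:fedora/demo:CM_B"}
triples = [
    ("info:fedora/demo:Object_A1", HAS_MODEL, "info:fedora/demo:CM_A"),
    ("info:fedora/demo:Object_X1", HAS_MODEL, "info:fedora/demo:CM_Missing"),
]
missing = dangling_models(triples, known)
```

An empty result means that every class used by a data object is explicitly defined somewhere in the repository.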

Defining ontologies by OWL LITE

Expressed in OWL, properties 2 and 5 become property 6.

6. All Content models must contain OWL that defines themselves as classes, and lists the defined relations for their subscribing objects

A Fedora object consists of a number of datastreams. One datastream, RELS-EXT, has been reserved for the Fedora RDF statements. We choose to reserve another datastream, ONTOLOGY, to contain the ontology definitions.

Just as Fedora only allows the "rdf:Description" tag in each object, we have chosen to restrict which OWL tags can reside in a Content model. In fact, there are just three allowed elements inside the rdf:RDF tag: "owl:Class", "owl:ObjectProperty" and "owl:DatatypeProperty".
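
The element restriction is easy to check mechanically. The sketch below uses Python's standard `xml.etree.ElementTree` and assumes the ONTOLOGY datastream is available as an RDF/XML string with rdf:RDF as its root. The function name `disallowed_elements` is hypothetical.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL = "http://www.w3.org/2002/07/owl#"

# The three element types allowed directly inside rdf:RDF.
ALLOWED = {
    f"{{{OWL}}}Class",
    f"{{{OWL}}}ObjectProperty",
    f"{{{OWL}}}DatatypeProperty",
}


def disallowed_elements(ontology_xml):
    """Return the tags of top-level elements that are not allowed."""
    root = ET.fromstring(ontology_xml)
    if root.tag != f"{{{RDF}}}RDF":
        raise ValueError("expected rdf:RDF as the root element")
    return [child.tag for child in root if child.tag not in ALLOWED]


good = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:owl="{OWL}">
  <owl:Class rdf:about="info:fedora/demo:CM_A"/>
  <owl:ObjectProperty
      rdf:about="http://www.statsbiblioteket.dk/demo-relations/#hasB"/>
</rdf:RDF>"""

bad = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:owl="{OWL}">
  <owl:Class rdf:about="info:fedora/demo:CM_A"/>
  <owl:Ontology rdf:about=""/>
</rdf:RDF>"""
```

Here `disallowed_elements(good)` returns an empty list, while the `owl:Ontology` element in `bad` is reported.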

Each Content model must contain exactly one "owl:Class" element, about the Content model itself, with the prefix "_class" on the Content Model pid (to distinguish the object from the class). In this element the ordinary OWL syntax can be used to place restrictions on the properties. The allowed restrictions are:

  • minCardinality (0-1)
  • maxCardinality (0-1)
  • cardinality (0-1)
  • someValuesFrom
  • allValuesFrom

You are not allowed to use the "rdfs:subClassOf" property to make the class a subclass of another Content model.
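
The rules for the "owl:Class" element can be sketched as a small checker. This is an illustration under assumptions, not the validator used by the system: it takes a single owl:Class element as an XML string, reads "(0-1)" as "the value must be 0 or 1", and treats any rdfs:subClassOf that does not wrap an owl:Restriction as a disallowed subclass statement. The name `class_problems` is hypothetical.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"

CARDINALITY = {f"{{{OWL}}}{n}"
               for n in ("minCardinality", "maxCardinality", "cardinality")}
VALUES = {f"{{{OWL}}}{n}" for n in ("someValuesFrom", "allValuesFrom")}


def class_problems(class_xml):
    """Return a list of rule violations found in one owl:Class element."""
    cls = ET.fromstring(class_xml)
    problems = []
    for sub in cls.findall(f"{{{RDFS}}}subClassOf"):
        restriction = sub.find(f"{{{OWL}}}Restriction")
        if restriction is None:
            problems.append("rdfs:subClassOf must wrap an owl:Restriction")
            continue
        for child in restriction:
            if child.tag == f"{{{OWL}}}onProperty" or child.tag in VALUES:
                continue
            if child.tag in CARDINALITY:
                if (child.text or "").strip() not in ("0", "1"):
                    problems.append("cardinality value must be 0 or 1")
            else:
                problems.append(f"restriction not allowed: {child.tag}")
    return problems


NS = f'xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}" xmlns:owl="{OWL}"'
HAS_B = "http://www.statsbiblioteket.dk/demo-relations/#hasB"

valid_class = f"""<owl:Class {NS} rdf:about="info:fedora/demo:CM_A">
  <rdfs:subClassOf>
    <owl:Restriction>
      <owl:onProperty rdf:resource="{HAS_B}"/>
      <owl:cardinality>1</owl:cardinality>
    </owl:Restriction>
  </rdfs:subClassOf>
</owl:Class>"""

invalid_class = f"""<owl:Class {NS} rdf:about="info:fedora/demo:CM_A">
  <rdfs:subClassOf rdf:resource="info:fedora/demo:CM_B"/>
  <rdfs:subClassOf>
    <owl:Restriction>
      <owl:onProperty rdf:resource="{HAS_B}"/>
      <owl:maxCardinality>2</owl:maxCardinality>
    </owl:Restriction>
  </rdfs:subClassOf>
</owl:Class>"""
```

The second sample breaks two rules at once: it subclasses another Content model and uses a cardinality above 1.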

All relations defined for a data object should be defined in at least one of its Content models, in the form of "owl:ObjectProperty".

A relation can be declared in multiple Content models, but if a Content model places restrictions on a relation, it must declare the relation itself. The reason behind this requirement is property 3. Even though all declarations of ObjectProperties are global for the repository, and thereby allowed for all objects in the repository, the demand is that each data object should be described by just its local Content models, i.e. those it relates to through the "fedora-model:hasModel" relation.
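
The requirement that a restricted relation is declared in the same Content model can be checked per ONTOLOGY datastream. The sketch below assumes the datastream is an RDF/XML string and that properties are declared with rdf:about and referenced with rdf:resource, as in the examples on this page. The function name and the `#hasC` relation are hypothetical.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
REL = "http://www.statsbiblioteket.dk/demo-relations/#"


def undeclared_restricted_properties(ontology_xml):
    """Return properties that are restricted but not declared locally."""
    root = ET.fromstring(ontology_xml)
    declared = {
        el.get(f"{{{RDF}}}about")
        for tag in ("ObjectProperty", "DatatypeProperty")
        for el in root.findall(f"{{{OWL}}}{tag}")
    }
    restricted = {
        el.get(f"{{{RDF}}}resource")
        for el in root.iter(f"{{{OWL}}}onProperty")
    }
    return sorted(restricted - declared)


# hasB is restricted and declared; hasC is restricted but never declared.
ontology = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}" xmlns:owl="{OWL}">
  <owl:Class rdf:about="info:fedora/demo:CM_A">
    <rdfs:subClassOf>
      <owl:Restriction>
        <owl:onProperty rdf:resource="{REL}hasB"/>
        <owl:allValuesFrom rdf:resource="info:fedora/demo:CM_B"/>
      </owl:Restriction>
    </rdfs:subClassOf>
    <rdfs:subClassOf>
      <owl:Restriction>
        <owl:onProperty rdf:resource="{REL}hasC"/>
        <owl:allValuesFrom rdf:resource="info:fedora/demo:CM_B"/>
      </owl:Restriction>
    </rdfs:subClassOf>
  </owl:Class>
  <owl:ObjectProperty rdf:about="{REL}hasB"/>
</rdf:RDF>"""

missing = undeclared_restricted_properties(ontology)
```

The check looks at one Content model in isolation, which is exactly what property 3 asks for.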

Fedora will also allow an object to have literal properties. Such properties are defined by the "owl:DatatypeProperty" tag.

Looking at property 1 in the context of "owl:ObjectProperty", it becomes clear that range and domain cannot be allowed. This prohibition is unfortunate but necessary. Since neither OWL nor Fedora provides a way to ensure that the same relation is not defined twice, it is entirely possible for two unrelated Content models in the repository to define the same property. Each part of the repository will be valid when viewed locally, but when the repository is regarded as a whole, the two different definitions will be combined. Having two domains for a property means that the source must be of both types, not either of them, and likewise for range, so the repository as a whole will be invalid. To prevent the risk of such errors, the use of domain and range is disallowed.

7. "rdfs:range" and "rdfs:domain" are not allowed on any properties.

Instead of "rdfs:range", one should use the "allValuesFrom" restriction. This restriction defines a range for the property, but only in the given class. As such, the restriction has no global effect. "rdfs:domain" is simply not necessary. Property 3 implies that the ONTOLOGY in a Content Model should describe the local area, i.e. the objects subscribing to that content model. The result is that the domain, so to speak, of a property will always be the Content Model in which it was defined. Again, this has no global effect: the same property defined somewhere else will have some other Content Model as its domain.
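
The difference between a global range and a per-class "allValuesFrom" restriction can be made concrete with a small model. The sketch below is an illustration only. It follows the reading used on this page, where several ranges on one property must all hold for every target, and it treats a failed check as an invalid repository. All names in it (`partOf`, `CM_Page`, `CM_Book`, `CM_Track`, `CM_Album` and the object names) are hypothetical.

```python
PART_OF = "http://example.org/relations#partOf"  # hypothetical relation

# Content models of each object after two repositories have been merged.
TYPES = {
    "page1": {"CM_Page"}, "book1": {"CM_Book"},
    "track1": {"CM_Track"}, "album1": {"CM_Album"},
}

# Global style: each repository declared its own range for partOf.
RANGES = [(PART_OF, "CM_Book"), (PART_OF, "CM_Album")]

# Local style: (restricting class, property, required target class).
ALL_VALUES_FROM = [
    ("CM_Page", PART_OF, "CM_Book"),
    ("CM_Track", PART_OF, "CM_Album"),
]


def global_range_ok(prop, target):
    """Global reading: the target must belong to every declared range."""
    required = {cls for (p, cls) in RANGES if p == prop}
    return required <= TYPES[target]


def local_all_values_ok(source, prop, target):
    """Local reading: only restrictions from the source's own classes apply."""
    required = {
        cls for (owner, p, cls) in ALL_VALUES_FROM
        if p == prop and owner in TYPES[source]
    }
    return required <= TYPES[target]


merged_range_holds = global_range_ok(PART_OF, "book1")
local_page_holds = local_all_values_ok("page1", PART_OF, "book1")
local_track_holds = local_all_values_ok("track1", PART_OF, "album1")
```

With the merged global ranges, `book1` would have to be both a book and an album, so the check fails. With local restrictions, each relation is judged only against the Content models of its own source object, and both repositories stay valid after the merge.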

Example of a simple ontology

    <!--RELS-EXT from Object_A1-->
    <rdf:Description rdf:about="info:fedora/demo:Object_A1">
        <fedora-model:hasModel rdf:resource="info:fedora/demo:CM_A"/>
        <demo-relations:hasB rdf:resource="info:fedora/demo:Object_B1"/>
    </rdf:Description>

    <!--ONTOLOGY from CM_A-->
    <owl:Class rdf:about="info:fedora/demo:CM_A">
        <rdfs:subClassOf>
            <owl:Restriction>
                <owl:onProperty
                        rdf:resource="http://www.statsbiblioteket.dk/demo-relations/#hasB"/>
                <owl:cardinality
                        rdf:datatype=
                                "http://www.w3.org/2001/XMLSchema#nonNegativeInteger">1</owl:cardinality>
            </owl:Restriction>
        </rdfs:subClassOf>
        <rdfs:subClassOf>
            <owl:Restriction>
                <owl:onProperty
                        rdf:resource="http://www.statsbiblioteket.dk/demo-relations/#hasB"/>
                <owl:allValuesFrom rdf:resource="info:fedora/demo:CM_B"/>
            </owl:Restriction>
        </rdfs:subClassOf>
    </owl:Class>
    <owl:ObjectProperty
            rdf:about="http://www.statsbiblioteket.dk/demo-relations/#hasB"/>

An object A1 and a Content model CM_A are defined. There is one defined relation for A1, the #hasB relation. Two restrictions are placed on this relation: there must be exactly one such relation in A1, and it must refer to an object of class/Content model CM_B. In fact, A1 has one such relation, and it refers to the object B1, which follows below.

    <!--RELS-EXT from Object_B1-->
    <rdf:Description rdf:about="info:fedora/demo:Object_B1">
        <fedora-model:hasModel rdf:resource="info:fedora/demo:CM_B"/>
    </rdf:Description>


    <!--ONTOLOGY from CM_B-->
    <owl:Class rdf:about="info:fedora/demo:CM_B">
        <rdfs:subClassOf>
            <owl:Restriction>
                <owl:onProperty
                        rdf:resource="http://www.statsbiblioteket.dk/demo-relations/#hasA"/>
                <owl:allValuesFrom rdf:resource="info:fedora/demo:CM_A"/>
            </owl:Restriction>
        </rdfs:subClassOf>
    </owl:Class>
    <owl:ObjectProperty
            rdf:about="http://www.statsbiblioteket.dk/demo-relations/#hasA"/>

Here is the object B1 and its Content model CM_B. There is one defined relation from B1, the #hasA relation. There is just one restriction on this relation: it must refer to something of class/Content model CM_A. No cardinality restriction is defined, so B1 does not need to have the relation, and in fact it does not have it.
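
The example can be replayed with a few lines of Python. The sketch below hard-codes the RELS-EXT statements and the restrictions from the two Content models as plain tuples instead of parsing the XML, and it is an illustration, not the validator used by the system. The helper names are hypothetical and the fedora-model:hasModel URI is an assumption.

```python
# Assumed expansion of fedora-model:hasModel.
HAS_MODEL = "info:fedora/fedora-system:def/model#hasModel"
REL = "http://www.statsbiblioteket.dk/demo-relations/#"
A1, B1 = "info:fedora/demo:Object_A1", "info:fedora/demo:Object_B1"
CM_A, CM_B = "info:fedora/demo:CM_A", "info:fedora/demo:CM_B"

# RELS-EXT statements of both objects as (subject, predicate, object).
TRIPLES = [
    (A1, HAS_MODEL, CM_A),
    (A1, REL + "hasB", B1),
    (B1, HAS_MODEL, CM_B),
]

# Restrictions per Content model: (kind, property, value).
RESTRICTIONS = {
    CM_A: [("cardinality", REL + "hasB", 1),
           ("allValuesFrom", REL + "hasB", CM_B)],
    CM_B: [("allValuesFrom", REL + "hasA", CM_A)],
}


def models_of(triples, subject):
    """Content models that the subject subscribes to."""
    return {o for (s, p, o) in triples if s == subject and p == HAS_MODEL}


def violations(triples, subject):
    """Check the subject against the restrictions of its own Content models."""
    found = []
    for model in sorted(models_of(triples, subject)):
        for kind, prop, value in RESTRICTIONS.get(model, []):
            targets = [o for (s, p, o) in triples
                       if s == subject and p == prop]
            if kind == "cardinality" and len(targets) != value:
                found.append(f"{prop}: expected {value}, found {len(targets)}")
            if kind == "allValuesFrom":
                for target in targets:
                    if value not in models_of(triples, target):
                        found.append(f"{prop}: {target} is not a {value}")
    return found


a1_violations = violations(TRIPLES, A1)
b1_violations = violations(TRIPLES, B1)
```

Both lists are empty for the objects above. Removing the #hasB statement from A1 makes the cardinality restriction fail, while B1 stays valid without any #hasA relation.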
