Anatomy of a Fedora OCFL Object

This design of Fedora OCFL objects is narrowly focused on the logical state of the OCFL object. Details such as content paths and how OCFL objects are physically represented on disk are OCFL implementation details and are not covered.

Definitions

This document shows a series of diagrams depicting the structure of the three varieties of Fedora resources compared with their non-Fedora OCFL representations. Following the diagrams are detailed descriptions of the Fedora naming conventions.
The three varieties of Fedora resources are:

  1. Atomic container: A Fedora container resource that maps to a single OCFL object
  2. Atomic binary: A Fedora binary resource that maps to a single OCFL object
  3. Archival group: A potentially hierarchical set of Fedora resources (binaries and/or containers) that maps to a single OCFL object. An archival group is composed of archival parts.
    1. Archival part: A Fedora resource (a binary or container) that is a constituent part of an archival group

    Finally, a detailed description is provided of the internal contents of the Fedora-specific files.

    Atomic Resources

    An atomic resource is either a single binary or a single container that maps to its own OCFL object. Below is a comparison of how Fedora atomic resources are represented in OCFL.

    Single-binary OCFL Object

    The following OCFL object has a single version, which holds a single binary, "image.tiff":

    No Format
    [object root]
        ├── 0=ocfl_object_1.0
        ├── inventory.json
        ├── inventory.json.sha512
        └── v1/
            ├── inventory.json
            ├── inventory.json.sha512
            └── content/
                └── image.tiff

    Fedora Atomic Resource - Binary

    The following is the same single-binary OCFL object with both the optional and required files used by Fedora.

    No Format
    [object root]
        ├── 0=ocfl_object_1.0
        ├── inventory.json
        ├── inventory.json.sha512
        └── v1/
            ├── inventory.json
            ├── inventory.json.sha512
            └── content/
                ├── .fcrepo/
                │   ├── image.tiff.json           <-- Required "header" file holding system metadata about the binary. See description below.
                │   ├── image.tiff~fcr-desc.json  <-- Required "header" file holding system metadata about the binary's description. See description below.
                │   └── image.tiff~fcr-acl.json   <-- Optional, only present if this Fedora resource has its own ACL.
                ├── image.tiff
                ├── image.tiff~fcr-desc.nt        <-- Required "binary description". See description below.
                └── image.tiff~fcr-acl.nt         <-- Optional, only present if this Fedora resource has its own ACL.

    Fedora stores system metadata about the binary (image.tiff) in a JSON file that contains elements such as the creation date and creator, its interaction model (i.e. type), and whether the object is an archival group. The full contents and format of the JSON file are documented later in this document.

    Basic OCFL Structure

    The following is an example of an OCFL object that contains one version, consisting of a hierarchy of files and directories. For more information on OCFL, see the specification.

    No Format
    [object root]
        ├── 0=ocfl_object_1.0
        ├── inventory.json
        ├── inventory.json.sha512
        └── v1/
            ├── inventory.json
            ├── inventory.json.sha512
            └── content/
                ├── empty.txt
                ├── foo/
                │   └── bar.xml
                └── image.tiff


    Basic Fedora OCFL Structure

    ...

    No Format
    [object root]
        ├── 0=ocfl_object_1.0
        ├── inventory.json
        ├── inventory.json.sha512
        └── v1/
            ├── inventory.json
            ├── inventory.json.sha512
            └── content/
                ├── .fcrepo/
                │   ├── fcr-root.json
                │   ├── empty.txt.json
                │   ├── foo/
                │   │   └── bar.xml.json
                │   ├── foo.json
                │   └── image.tiff.json
                ├── fcr-container.nt
                ├── empty.txt
                ├── foo/
                │   └── bar.xml
                └── image.tiff


    No Format
    ..diagrams:
    - atomic with two versions of a binary
    - atomic with two versions of a container
    - archival group with two versions of a nested container with a binary

    ...

    Outside of the .fcrepo directory, all other OCFL object files store the content of Fedora resources. Every Fedora resource is persisted to OCFL using exactly two files. The first is a header file located in .fcrepo and the second is a content file located in the root of the OCFL object. The naming and location of content files can vary depending on the type of resource they belong to. It is not necessarily possible to identify a resource's content file without first determining what type of resource it is by reading its header file.

    -------------------------------------------------------------------

    Atomic Resources

    An atomic resource is a Fedora resource that maps directly to an OCFL object that contains only that resource.

    For example, if you have a container info:fedora/example that contains a binary info:fedora/example/binary and they are atomic resources, then they are each stored in their own OCFL objects, info:fedora/example and info:fedora/example/binary respectively.

    Files

    Header file: .fcrepo/fcr-root.json

    Content file, based on type of resource: 

    • RDF: fcr-container.nt
    • Non-RDF: <LAST_PART>

    LAST_PART: The final "path" part of the Fedora resource ID, e.g. binary for info:fedora/example/binary

    Atomic resources may contain additional files; these will be covered later in the System Resource section.
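
    The file rules above can be sketched in a few lines of Python. This is only an illustration of the naming convention, not Fedora's implementation; the function name atomic_resource_files and the is_rdf flag are hypothetical names introduced here.

```python
def atomic_resource_files(resource_id: str, is_rdf: bool) -> dict:
    """Return the header and content file paths for an atomic resource.

    Hypothetical helper: paths are relative to the root of the OCFL
    object's content, following the naming rules described above.
    """
    # LAST_PART is the final "path" part of the Fedora resource ID.
    last_part = resource_id.rstrip("/").rsplit("/", 1)[-1]
    # RDF resources (containers) use a fixed content file name;
    # non-RDF resources (binaries) are named after LAST_PART.
    content = "fcr-container.nt" if is_rdf else last_part
    return {"header": ".fcrepo/fcr-root.json", "content": content}


if __name__ == "__main__":
    print(atomic_resource_files("info:fedora/foo", is_rdf=True))
    print(atomic_resource_files("info:fedora/foo/bar", is_rdf=False))
```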

    Examples

    The following is a container info:fedora/foo

    foo/
    ├── .fcrepo/
    │   └── fcr-root.json
    └── fcr-container.nt

    The following is a binary info:fedora/foo/bar

    bar/
    ├── bar
    └── .fcrepo/
        └── fcr-root.json

    Archival Group Resources

    An archival group resource is similar to an atomic resource, except that it must be a container and it must contain all of its children within the same OCFL object as archival part resources.

    For example, if you have an archival group info:fedora/example that contains a binary info:fedora/example/binary, then they are both stored in the same OCFL object, info:fedora/example.

    Because an archival group shares an OCFL object with its parts, an OCFL ID alone is not sufficient to identify the files related to a given Fedora resource that's contained in an archival group. This will be addressed in the Archival Part Resources section.
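
    As a rough sketch of the difference between atomic resources and archival groups, the following Python function resolves which OCFL object holds a given Fedora resource. It assumes the caller already knows the set of archival group IDs (in practice this would come from the header files); the function name ocfl_object_id is hypothetical, and system resources such as ACLs are not handled.

```python
def ocfl_object_id(resource_id: str, archival_group_ids: set) -> str:
    """Return the ID of the OCFL object that stores resource_id.

    Hypothetical helper: a resource inside an archival group is stored
    in the group's OCFL object; any other resource is atomic and is
    stored in an OCFL object of its own.
    """
    for group_id in archival_group_ids:
        prefix = group_id.rstrip("/") + "/"
        # The group itself and everything beneath it share one object.
        if resource_id == group_id or resource_id.startswith(prefix):
            return group_id
    return resource_id


if __name__ == "__main__":
    groups = {"info:fedora/example"}
    print(ocfl_object_id("info:fedora/example/binary", groups))
    print(ocfl_object_id("info:fedora/other/binary", groups))
```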

    Files

    Header file: .fcrepo/fcr-root.json

    Content file: fcr-container.nt

    As with atomic resources, archival groups may contain additional resources, which are detailed in the System Resources section.

    Examples

    The following is an empty archival group info:fedora/foo

    foo/
    ├── .fcrepo/
    │   └── fcr-root.json
    └── fcr-container.nt

    Archival Part Resources

    An archival part resource is a resource that is contained within the same OCFL object as the archival group that contains it. This includes not just binaries but also nested containers and the binaries within them. In this case, there are many Fedora resources that map to the same OCFL object. All of the archival part resources necessarily share a common prefix, the archival group resource ID, and so they are disambiguated based on their resource ID relative to the archival group resource ID.

    For example, if you have an archival group info:fedora/example that contains a container info:fedora/example/book that in turn contains info:fedora/example/book/page1, then all three of these resources are contained within the OCFL object, info:fedora/example, and the archival group relative IDs are book and book/page1.
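
    The relative ID described above is a simple prefix strip. The following Python sketch shows one way to compute it; the function name archival_relative_id is hypothetical, and returning an empty string for the archival group itself is an assumption made for this example.

```python
def archival_relative_id(group_id: str, resource_id: str) -> str:
    """Return resource_id relative to the archival group group_id.

    Hypothetical helper: raises ValueError if the resource is not
    contained in the archival group.
    """
    group = group_id.rstrip("/")
    if resource_id == group:
        # Assumption: the group itself has an empty relative ID.
        return ""
    prefix = group + "/"
    if not resource_id.startswith(prefix):
        raise ValueError(f"{resource_id} is not part of {group_id}")
    return resource_id[len(prefix):]


if __name__ == "__main__":
    group = "info:fedora/example"
    print(archival_relative_id(group, "info:fedora/example/book"))
    print(archival_relative_id(group, "info:fedora/example/book/page1"))
```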

    Files

    Archival part containers are represented as directories within the OCFL object. So, in the above example, the container info:fedora/example/book is represented as the directory book within the info:fedora/example OCFL object. Similarly, info:fedora/example/book/page1 is located at page1 within the book directory.

    The files stored for an archival part are the same as the other resource types; they only appear different because they are in a nested directory structure within the same object, instead of in a flat structure across multiple objects.

    Header file: .fcrepo/<RELATIVE_ID>.json

    Content file, based on type of resource:

    • RDF: <RELATIVE_ID>/fcr-container.nt
    • Non-RDF: <RELATIVE_ID>

    RELATIVE_ID: The Fedora resource ID relative to the containing archival group, e.g. book/page1

    Finally, archival part resources may also have System Resources, as described next.
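
    The archival part file rules can be illustrated with a short Python sketch. It takes an already computed RELATIVE_ID and only demonstrates the naming convention; the function name archival_part_files and the is_rdf flag are hypothetical.

```python
def archival_part_files(relative_id: str, is_rdf: bool) -> dict:
    """Return the header and content file paths for an archival part.

    Hypothetical helper: relative_id is the resource ID relative to the
    containing archival group, e.g. "book/page1". Paths are relative to
    the root of the OCFL object's content.
    """
    header = f".fcrepo/{relative_id}.json"
    if is_rdf:
        # Containers are directories holding a fixed-name RDF file.
        content = f"{relative_id}/fcr-container.nt"
    else:
        # Binaries are stored directly at their relative ID.
        content = relative_id
    return {"header": header, "content": content}


if __name__ == "__main__":
    print(archival_part_files("book", is_rdf=True))
    print(archival_part_files("book/page1", is_rdf=False))
```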

    Examples

    The following is an archival group info:fedora/foo that contains the containers info:fedora/foo/bar and info:fedora/foo/bar/baz and the binaries info:fedora/foo/f1, info:fedora/foo/bar/f2, and info:fedora/foo/bar/baz/f3.

    foo/
    ├── bar/
    │   ├── fcr-container.nt
    │   ├── baz/
    │   │   ├── fcr-container.nt
    │   │   └── f3
    │   └── f2
    ├── f1
    ├── .fcrepo/
    │   ├── bar/
    │   │   ├── baz/
    │   │   │   └── f3.json
    │   │   ├── baz.json
    │   │   └── f2.json
    │   ├── bar.json
    │   ├── f1.json
    │   └── fcr-root.json
    └── fcr-container.nt

    System Resources

    Fedora supports five system resource types that are handled differently from other resources: time maps (/fcr:versions), mementos (/fcr:versions/timestamp), tombstones (/fcr:tombstone), ACLs (/fcr:acl), and binary descriptions (/fcr:metadata). The first three are virtual, in that they are not represented by their own files on disk, and are not discussed here. ACLs and binary descriptions, on the other hand, are stored in files within OCFL objects.

    Any Fedora resource may have an ACL associated with it, and every binary resource has a binary description associated with it. These system resources are always stored in the same OCFL object as the Fedora resource they describe. This is true even for atomic resources, which normally only contain a single Fedora resource.

    For example, if you have an atomic binary resource info:fedora/example/binary, and it has an ACL associated with it, then the OCFL object info:fedora/example/binary contains all of the following Fedora resources: info:fedora/example/binary, info:fedora/example/binary/fcr:metadata, and info:fedora/example/binary/fcr:acl.

    Files

    The following files are associated with binary metadata resources:

    Header file: .fcrepo/<RELATIVE_ID>~fcr-desc.json

    Content file: <RELATIVE_ID>~fcr-desc.nt

    The following files are associated with ACL resources:

    Header file, based on type of resource:

    • Atomic or archival group: .fcrepo/fcr-root~fcr-acl.json 
    • Archival part: .fcrepo/<RELATIVE_ID>~fcr-acl.json 

    Content file, based on type of resource:

    • RDF: <RELATIVE_ID>/fcr-container~fcr-acl.nt
    • Non-RDF: <RELATIVE_ID>~fcr-acl.nt

    RELATIVE_ID: The Fedora resource ID relative to the containing archival group, e.g. book/page1

    The above file definitions apply to all Fedora resource types. However, in the case of atomic resources, RELATIVE_ID is always equal to LAST_PART and there should be no subdirectories.
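
    The system resource file rules can be sketched in Python as follows. The sketch applies the rules literally as listed above and is not Fedora's implementation; the function names description_files and acl_files and their flags are hypothetical. How the RDF content rule applies to the root container of an atomic resource or archival group is not spelled out here, so the usage shown is limited to archival parts and binaries.

```python
def description_files(relative_id: str) -> dict:
    """Header and content files for a binary description (fcr:metadata)."""
    return {
        "header": f".fcrepo/{relative_id}~fcr-desc.json",
        "content": f"{relative_id}~fcr-desc.nt",
    }


def acl_files(relative_id: str, is_rdf: bool, is_archival_part: bool) -> dict:
    """Header and content files for an ACL (fcr:acl).

    Hypothetical helper: is_rdf describes the resource the ACL belongs
    to, and is_archival_part selects which header rule applies.
    """
    if is_archival_part:
        header = f".fcrepo/{relative_id}~fcr-acl.json"
    else:
        # Atomic resources and archival groups use the root header name.
        header = ".fcrepo/fcr-root~fcr-acl.json"
    if is_rdf:
        content = f"{relative_id}/fcr-container~fcr-acl.nt"
    else:
        content = f"{relative_id}~fcr-acl.nt"
    return {"header": header, "content": content}


if __name__ == "__main__":
    print(description_files("book/page1"))
    print(acl_files("book/page1", is_rdf=False, is_archival_part=True))
    print(acl_files("book", is_rdf=True, is_archival_part=True))
```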

    Examples

    The following is an atomic binary resource info:fedora/foo/bar with an ACL (info:fedora/foo/bar/fcr:acl) and a description (info:fedora/foo/bar/fcr:metadata).

    bar/
    ├── bar
    ├── bar~fcr-acl.nt
    ├── bar~fcr-desc.nt
    └── .fcrepo/
        ├── bar~fcr-acl.json
        ├── bar~fcr-desc.json
        └── fcr-root.json

    Collisions

    One of the goals when mapping Fedora resources to OCFL objects and to files within OCFL objects is for the mapping to be as transparent as possible. Ideally there would be a one-to-one mapping between the two. However, a strict one-to-one mapping is not achievable, because Fedora must also write system files within the same OCFL objects and protect these files from being accidentally overwritten by users.

    To that end, the following restrictions apply to all Fedora resource names:

    1. Cannot equal .fcrepo
    2. Cannot equal fcr-root
    3. Cannot equal fcr-container.nt
    4. Cannot end in ~fcr-desc or ~fcr-desc.nt
    5. Cannot end in ~fcr-acl or ~fcr-acl.nt

    These restrictions are necessary to prevent server-managed files from being overwritten while avoiding a complete segregation of server-managed and user-managed files into separate directories within the OCFL object.
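
    The naming restrictions listed above can be expressed as a small check. The Python sketch below is illustrative only; the function name is_allowed_name is hypothetical, and it assumes the check is applied to a single resource name (one path segment) rather than to a full resource ID.

```python
# Names that may not be used at all.
RESERVED_NAMES = {".fcrepo", "fcr-root", "fcr-container.nt"}
# Suffixes reserved for binary descriptions and ACLs.
RESERVED_SUFFIXES = ("~fcr-desc", "~fcr-desc.nt", "~fcr-acl", "~fcr-acl.nt")


def is_allowed_name(name: str) -> bool:
    """Return True if name does not collide with a server-managed file."""
    if name in RESERVED_NAMES:
        return False
    # str.endswith accepts a tuple and matches any of its entries.
    return not name.endswith(RESERVED_SUFFIXES)


if __name__ == "__main__":
    for candidate in ("image.tiff", ".fcrepo", "page1~fcr-acl.nt"):
        print(candidate, is_allowed_name(candidate))
```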