Layout of a Fedora OCFL Object

This design of Fedora OCFL objects is narrowly focused on the logical state of the OCFL object. Details such as content paths and how OCFL objects are physically represented on disk are OCFL implementation details and are not covered.

Definitions

  1. Atomic resource
  2. Archival group
  3. Header file

Basic OCFL Structure

...

Basic Fedora OCFL Structure

No Format
..diagrams:
- atomic with two versions of a binary
- atomic with two versions of a container
- archival group with two versions of a nested container with a binary

General

Every Fedora OCFL object contains a .fcrepo directory in its root. This directory contains JSON files, called header files, that describe the resources contained in the OCFL object. The contents of the header files are not discussed here, only their naming conventions. Header files contain information that tells Fedora how to interact with a resource. Header files must be identifiable by Fedora without knowing anything about a resource other than its ID.

Outside of the .fcrepo directory, all other OCFL object files store the content of Fedora resources. Every Fedora resource is persisted to OCFL using exactly two files. The first is a header file located in .fcrepo and the second is a content file located in the root of the OCFL object. The naming and location of content files can vary depending on the type of resource they belong to. It is not necessarily possible to identify a resource's content file without first determining what type of resource it is by reading its header file.

This document shows a series of diagrams depicting the structure of the three varieties of Fedora resources compared with their non-Fedora OCFL representations. Following the diagrams are detailed descriptions of the Fedora naming conventions.

The three varieties of Fedora resources are:

  1. Atomic container: A Fedora container resource that maps to a single OCFL object
  2. Atomic binary: A Fedora binary resource that maps to a single OCFL object
  3. Archival group: A potentially hierarchical set of Fedora resources (binaries and/or containers) that maps to a single OCFL object
    1. Note: an archival group is composed of archival parts

Finally, a detailed description is provided of the internal contents of the Fedora-specific files.

For more information on OCFL, see the specification.

Note

This document is intended to provide a "logical" description of what may be found when exploring Fedora's OCFL storage root. What is actually on-disk may be different based on details of the OCFL specification. For example:

  • Identical files in the same object will only be persisted once, and referenced in the OCFL inventory.json as multiple files.
  • Each subsequent OCFL version directory will only contain the changed/added files, as opposed to the complete set of files for the OCFL object. This is the concept of "forward versioning".
  • Fedora installations that choose to disable auto-versioning will see "mutable head" directories in their OCFL objects.
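The first bullet above (identical files stored once but referenced several times) can be illustrated with a minimal sketch. It groups logical paths by content digest, which is roughly how an OCFL version "state" block relates digests to paths. The function name `build_state`, the sample file names, and the choice of SHA-512 are assumptions made for illustration; this is not Fedora's or any OCFL client's code.

```python
import hashlib
from collections import defaultdict


def build_state(files: dict) -> dict:
    """Group logical paths by the SHA-512 digest of their content.

    Hypothetical helper: identical content yields one digest entry that
    lists every logical path referring to it.
    """
    state = defaultdict(list)
    for logical_path, data in files.items():
        state[hashlib.sha512(data).hexdigest()].append(logical_path)
    return {digest: sorted(paths) for digest, paths in state.items()}


# Two of the three sample files have identical content.
files = {"a.txt": b"same", "copy/a.txt": b"same", "b.txt": b"different"}
state = build_state(files)
for digest, paths in state.items():
    print(digest[:12], paths)
```

Only two digest entries result from the three logical files, which is why the on-disk view can contain fewer files than the logical view.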

To provide the most intuitive view of your OCFL storage root, it is recommended that an OCFL client be used to explore the files. 

General

In the examples below, it can be seen that OCFL objects created by Fedora consist of three types of files:

  1. The user-content files
  2. RDF files containing user-properties related to the OCFL object and/or user's content files
  3. Fedora-specific JSON "header" files containing system-properties related to the OCFL object and/or user's content files

The first two categories of files (user's content files and Fedora-specific RDF files) are located in the "content/" section of the OCFL object. The last category of files (Fedora-specific JSON "header" files) is located in a sub-directory of the "content/" section of the OCFL object, named ".fcrepo/".

The full contents and format of the JSON "header" files and Fedora-specific RDF files are detailed in the Specification of Fedora-specific OCFL Files section of this document.

Atomic Resources

An atomic resource is a Fedora resource, either a single binary or a single container, that maps directly to an OCFL object that contains only that resource.

For example, if you have a container info:fedora/example that contains a binary info:fedora/example/binary and they are atomic resources, then they are each stored in their own OCFL objects, info:fedora/example and info:fedora/example/binary respectively.

Files

Header file: .fcrepo/fcr-root.json

Content file, based on type of resource: 

  • RDF: fcr-container.nt
  • Non-RDF: <LAST_PART>

LAST_PART: The final "path" part of the Fedora resource ID, e.g. binary

Atomic resources may contain additional files; these will be covered later in the System Resource section.
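The naming rules above can be sketched as a small helper. This is an illustration only: the function name `atomic_files` and its `is_rdf` flag are hypothetical and are not part of Fedora's API.

```python
from typing import Tuple


def atomic_files(resource_id: str, is_rdf: bool) -> Tuple[str, str]:
    """Return (header_path, content_path) for an atomic resource.

    Hypothetical helper based only on the naming conventions described
    in this document.
    """
    header = ".fcrepo/fcr-root.json"
    if is_rdf:
        content = "fcr-container.nt"
    else:
        # LAST_PART: the final path segment of the Fedora resource ID
        content = resource_id.rstrip("/").rsplit("/", 1)[-1]
    return header, content


print(atomic_files("info:fedora/foo", is_rdf=True))
print(atomic_files("info:fedora/foo/bar", is_rdf=False))
```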

Examples

The following is a container info:fedora/foo

foo/
├── .fcrepo/
│   └── fcr-root.json
└── fcr-container.nt

The following is a binary info:fedora/foo/bar

bar/
├── bar
└── .fcrepo/
    └── fcr-root.json

Archival Group Resources

An archival group resource is similar to an atomic resource, except that it must be a container and it must contain all of its children within the same OCFL object as archival part resources.

For example, if you have an archival group info:fedora/example that contains a binary info:fedora/example/binary, then they are both stored in the same OCFL object, info:fedora/example.

Because an archival group shares an OCFL object with its parts, an OCFL ID alone is not sufficient to identify the files related to a given Fedora resource that's contained in an archival group. This will be addressed in the Archival Part Resources section.

Files

Header file: .fcrepo/fcr-root.json

Content file: fcr-container.nt

As with atomic resources, archival groups may contain additional resources, which are detailed in the System Resources section.

Examples

The following is an empty archival group info:fedora/foo

foo/
├── .fcrepo/
│   └── fcr-root.json
└── fcr-container.nt

Archival Part Resources

An archival part resource is a resource that is contained within the same OCFL object as the archival group that contains it. This includes not only binaries but also nested containers and the binaries they contain. In this case, there are many Fedora resources that map to the same OCFL object. All of the archival part resources necessarily share a common prefix, the archival group resource ID, and so they are disambiguated based on their resource ID relative to the archival group resource ID.

For example, if you have an archival group info:fedora/example that contains a container info:fedora/example/book that in turn contains info:fedora/example/book/page1, then all three of these resources are contained within the OCFL object, info:fedora/example, and the archival group relative IDs are book and book/page1.

Files

Archival part containers are represented as directories within the OCFL object. So, in the above example, the container info:fedora/example/book is represented as the directory book within the info:fedora/example OCFL object. Similarly, info:fedora/example/book/page1 is located at page1 within the book directory.

The files stored for an archival part are the same as the other resource types; they only appear different because they are in a nested directory structure within the same object, instead of in a flat structure across multiple objects.

Header file: .fcrepo/<RELATIVE_ID>.json

Content file, based on type of resource:

  • RDF: <RELATIVE_ID>/fcr-container.nt
  • Non-RDF: <RELATIVE_ID>

RELATIVE_ID: The Fedora resource ID relative to the containing archival group, e.g. book/page1
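As a rough sketch of the rules above, the following helper derives RELATIVE_ID from the archival group ID and then builds the header and content paths. The function name `archival_part_files` and its parameters are hypothetical, chosen for illustration only.

```python
from typing import Tuple


def archival_part_files(group_id: str, resource_id: str, is_rdf: bool) -> Tuple[str, str]:
    """Return (header_path, content_path) for an archival part.

    Hypothetical helper: RELATIVE_ID is the resource ID with the
    archival group ID prefix removed.
    """
    prefix = group_id.rstrip("/") + "/"
    if not resource_id.startswith(prefix):
        raise ValueError("resource is not inside the archival group")
    relative_id = resource_id[len(prefix):].strip("/")
    if not relative_id:
        raise ValueError("resource ID has no part below the archival group")
    header = f".fcrepo/{relative_id}.json"
    # Containers store their RDF inside a directory named after the part.
    content = f"{relative_id}/fcr-container.nt" if is_rdf else relative_id
    return header, content


group = "info:fedora/example"
print(archival_part_files(group, "info:fedora/example/book", is_rdf=True))
print(archival_part_files(group, "info:fedora/example/book/page1", is_rdf=False))
```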

Finally, archival part resources may also have System Resources, as described next.

Examples

The following is an AG info:fedora/foo that contains the containers info:fedora/foo/bar and info:fedora/foo/bar/baz and the binaries info:fedora/foo/f1, info:fedora/foo/bar/f2, and info:fedora/foo/bar/baz/f3.

foo/
├── bar/
│   ├── fcr-container.nt
│   ├── baz/
│   │   ├── fcr-container.nt
│   │   └── f3
│   └── f2
├── f1
├── .fcrepo/
│   ├── bar/
│   │   ├── baz/
│   │   │   └── f3.json
│   │   ├── baz.json
│   │   └── f2.json
│   ├── bar.json
│   ├── f1.json
│   └── fcr-root.json
└── fcr-container.nt

System Resources

Fedora supports five system resource types that are handled differently from other resources: time maps (/fcr:versions), mementos (/fcr:versions/timestamp), tombstones (/fcr:tombstone), ACLs (/fcr:acl), and binary descriptions (/fcr:metadata). The first three of these system resources are virtual, in that they are not represented in their own files on disk, and are not discussed here. ACLs and binary descriptions, on the other hand, are stored in files within OCFL objects.

Any Fedora resource may have an ACL associated with it, and every binary resource has a binary description associated with it. These system resources are always stored in the same OCFL object as the Fedora resource they describe. This is true even for atomic resources, which normally only contain a single Fedora resource.

For example, if you have an atomic resource binary info:fedora/example/binary, and it has an ACL associated to it, then the OCFL object info:fedora/example/binary contains all of the following Fedora resources: info:fedora/example/binary, info:fedora/example/binary/fcr:metadata, and info:fedora/example/binary/fcr:acl.

Files

The following files are associated with binary metadata resources:

Header file: .fcrepo/<RELATIVE_ID>~fcr-desc.json

Content file: <RELATIVE_ID>~fcr-desc.nt

The following files are associated with ACL resources:

Header file, based on type of resource:

  • Atomic or archival group: .fcrepo/fcr-root~fcr-acl.json 
  • Archival part: .fcrepo/<RELATIVE_ID>~fcr-acl.json 

Content file, based on type of resource:

  • RDF: <RELATIVE_ID>/fcr-container~fcr-acl.nt
  • Non-RDF: <RELATIVE_ID>~fcr-acl.nt

RELATIVE_ID: The Fedora resource ID relative to the containing archival group, e.g. book/page1

The above file definitions apply to all Fedora resource types. However, in the case of atomic resources, RELATIVE_ID is always equal to LAST_PART and there should be no subdirectories.
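The file definitions above can be sketched as two small helpers. The function names are hypothetical, and the ACL helper deliberately covers archival parts only; atomic resources and archival group roots use the fcr-root header name instead and are not handled by this sketch.

```python
from typing import Tuple


def description_files(relative_id: str) -> Tuple[str, str]:
    """Return (header_path, content_path) for a binary description.

    Hypothetical helper based on the naming conventions above.
    """
    return (f".fcrepo/{relative_id}~fcr-desc.json", f"{relative_id}~fcr-desc.nt")


def acl_files_for_part(relative_id: str, is_rdf: bool) -> Tuple[str, str]:
    """Return (header_path, content_path) for the ACL of an archival part.

    Hypothetical helper; root-level resources are out of scope here.
    """
    header = f".fcrepo/{relative_id}~fcr-acl.json"
    if is_rdf:
        content = f"{relative_id}/fcr-container~fcr-acl.nt"
    else:
        content = f"{relative_id}~fcr-acl.nt"
    return header, content


print(description_files("book/page1"))
print(acl_files_for_part("book", is_rdf=True))
print(acl_files_for_part("book/page1", is_rdf=False))
```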

Examples

The following is an atomic binary resource info:fedora/foo/bar with an ACL (info:fedora/foo/bar/fcr:acl) and a description (info:fedora/foo/bar/fcr:metadata).

bar/
├── bar
├── bar~fcr-acl.nt
├── bar~fcr-desc.nt
└── .fcrepo/
    ├── bar~fcr-acl.json
    ├── bar~fcr-desc.json
    └── fcr-root.json

Collisions

One of the goals when mapping Fedora resources to OCFL objects and files within OCFL objects is for the mapping to be as transparent as possible. As much as possible there should be one-to-one mappings between the two. However, this is an impossible goal because Fedora must also write system files within the same OCFL objects and protect these files from being accidentally overwritten by users.

To that end, the following restrictions apply to all Fedora resource names:

  1. Cannot equal .fcrepo
  2. Cannot equal fcr-root
  3. Cannot equal fcr-container.nt
  4. Cannot end in ~fcr-desc or ~fcr-desc.nt
  5. Cannot end in ~fcr-acl or ~fcr-acl.nt
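The restrictions above could be checked with a validator along the following lines. This is a minimal sketch: the function name `is_allowed_name` is hypothetical, and it assumes that the restrictions apply to a single path segment of a resource ID.

```python
# Names and suffixes reserved for Fedora's own files, per the list above.
RESERVED_NAMES = {".fcrepo", "fcr-root", "fcr-container.nt"}
RESERVED_SUFFIXES = ("~fcr-desc", "~fcr-desc.nt", "~fcr-acl", "~fcr-acl.nt")


def is_allowed_name(name: str) -> bool:
    """Return True if a resource name avoids every reserved name and suffix.

    Hypothetical helper; assumes `name` is one path segment.
    """
    if name in RESERVED_NAMES:
        return False
    return not name.endswith(RESERVED_SUFFIXES)


for candidate in ["image.tiff", ".fcrepo", "fcr-root", "notes~fcr-acl.nt"]:
    print(candidate, is_allowed_name(candidate))
```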

An atomic resource is a single Fedora container or binary that maps to its own OCFL object. Below is a comparison between a plain OCFL representation of a single item object and Fedora's atomic resource representation.

Single-binary OCFL Object

The following OCFL Object has a single version, which holds a single binary: "image.tiff":

No Format
[object root]
    ├── 0=ocfl_object_1.0
    ├── inventory.json
    ├── inventory.json.sha512
    └── v1/
        ├── inventory.json
        ├── inventory.json.sha512
        └── content/
            └── image.tiff

Fedora Atomic Resource - Binary

The following is the same single-binary OCFL object with both optional and required files used by Fedora.

Note: the ACL resources depicted in this and the following diagrams may be omitted if the given resource chooses to inherit its ACL from the repository's default ACL.

No Format
[object root]
    ├── 0=ocfl_object_1.0
    ├── inventory.json
    ├── inventory.json.sha512
    └── v1/
        ├── inventory.json
        ├── inventory.json.sha512
        └── content/
            ├── .fcrepo/
            │   ├── fcr-root.json           <-- Required "header" file holding system metadata about the binary. See description below.
            │   ├── fcr-root~fcr-desc.json  <-- Required "header" file holding system metadata about the binary's description. See description below.
            │   └── fcr-root~fcr-acl.json   <-- Optional, only present if this Fedora resource has its own ACL.
            ├── image.tiff
            ├── image.tiff~fcr-desc.nt      <-- Required "binary description". See description below.
            └── image.tiff~fcr-acl.nt       <-- Optional, only present if this Fedora resource has its own ACL.

Fedora stores system metadata about the binary (image.tiff) and the binary's description (currently, https://host/rest/image.tiff/fcr:metadata) in respective JSON files that contain elements such as the creation date and creator, as well as its interaction model (i.e. type), whether the object is an archival group, etc. Additionally, Fedora resources can optionally have their own ACLs, which if present also have their own JSON header file. 

Naming conventions

System content (located in the ".fcrepo/" directory)

  • The binary's header file: the filename is always "fcr-root.json". This allows Fedora to find the header file without knowing if the resource is a binary, container or archival group.
  • The binary description's header file: the filename is always "fcr-root~fcr-desc.json".
  • The binary ACL's header file: the filename is always "fcr-root~fcr-acl.json".

User content

  • The binary: the filename provided during ingest of the binary is retained in the OCFL persistence. ("image.tiff", in this example)
  • The binary's description: the filename is the name of the binary with the addition of the suffix, "~fcr-desc.nt". ("image.tiff~fcr-desc.nt", in this example)
  • The binary's ACL: the filename is the name of the binary with the addition of the suffix, "~fcr-acl.nt". ("image.tiff~fcr-acl.nt", in this example)

Fedora Atomic Resource - Container

A Fedora container resource is effectively an empty directory with LDP-defined behaviors depending on its "interaction model" (i.e. type: basic, direct, indirect), and associated metadata, along with an optional ACL. Since OCFL does not support empty directories as valid objects, there is no directly analogous OCFL object for a Fedora container atomic resource.

Note: the ACL resources depicted in this and the following diagrams may be omitted if the given resource chooses to inherit its ACL from the repository's default ACL.

No Format
[object root]
    ├── 0=ocfl_object_1.0
    ├── inventory.json
    ├── inventory.json.sha512
    └── v1/
        ├── inventory.json
        ├── inventory.json.sha512
        └── content/
            ├── .fcrepo/
            │   ├── fcr-root.json          <-- Required "header" file holding system metadata about the container. See description below.
            │   └── fcr-root~fcr-acl.json  <-- Optional, only present if this Fedora resource has its own ACL.
            ├── fcr-container.nt           <-- Required file for holding user-properties describing the container. See description below.
            └── fcr-container~fcr-acl.nt   <-- Optional, only present if this Fedora resource has its own ACL.

Naming conventions

System content (located in the ".fcrepo/" directory)

  • The container's header file: the filename is always "fcr-root.json". This allows Fedora to find the header file without knowing if the resource is a binary, container or archival group.
  • The container ACL's header file: the filename is always "fcr-root~fcr-acl.json".

User content

  • The container user-properties: the filename is always "fcr-container.nt".
  • The container's ACL: the filename is always "fcr-container~fcr-acl.nt".

Archival Group Resources

A Fedora archival group resource is a container that contains a potentially nested hierarchy of zero or more child containers and/or binaries. Below is a comparison between a plain OCFL representation of a compound object and Fedora's archival group resource representation.

Compound OCFL Object

The following is an example of an OCFL object that contains one version, consisting of two binaries, one of which is nested within a child container.

No Format
[object root]
    ├── 0=ocfl_object_1.0
    ├── inventory.json
    ├── inventory.json.sha512
    └── v1/
        ├── inventory.json
        ├── inventory.json.sha512
        └── content/
            ├── image.tiff
            └── foo/
                └── bar.xml

Fedora Archival Group

Archival groups are a way to collect several Fedora resources into a single OCFL object. The constituent Fedora resources (archival parts) can be any combination of containers and binaries.

Note: the ACL resources depicted in this diagram may be omitted if the given resource chooses to inherit its ACL from an ancestor resource in the nested hierarchy or even from the repository's default ACL.

No Format
[object root]
    ├── 0=ocfl_object_1.0
    ├── inventory.json
    ├── inventory.json.sha512
    └── v1/
        ├── inventory.json
        ├── inventory.json.sha512
        └── content/
            ├── .fcrepo/
            │   ├── fcr-root.json              <-- Required "header" file holding system metadata about the archival group.
            │   ├── fcr-root~fcr-acl.json      <-- Optional, only present if this Fedora resource has its own ACL.
            │   ├── image.tiff.json            <-- Required "header" file holding system metadata about the binary.
            │   ├── image.tiff~fcr-desc.json   <-- Required "header" file holding system metadata about the binary's description.
            │   ├── image.tiff~fcr-acl.json    <-- Optional, only present if this Fedora resource has its own ACL.
            │   ├── foo.json                   <-- Required "header" file holding system metadata about the nested container.
            │   ├── foo~fcr-acl.json           <-- Optional, only present if this Fedora resource has its own ACL.
            │   └── foo/                       <-- Required nested structure within .fcrepo/ mirrors content structure
            │       ├── bar.xml.json           <-- Required "header" file holding system metadata about the binary.
            │       ├── bar.xml~fcr-desc.json  <-- Required "header" file holding system metadata about the binary's description.
            │       └── bar.xml~fcr-acl.json   <-- Optional, only present if this Fedora resource has its own ACL.
            ├── fcr-container.nt               <-- Required file for holding user-properties describing the archival group container.
            ├── fcr-container~fcr-acl.nt       <-- Optional, only present if this Fedora resource has its own ACL.
            ├── image.tiff
            ├── image.tiff~fcr-desc.nt         <-- Required "binary description".
            ├── image.tiff~fcr-acl.nt          <-- Optional, only present if this Fedora resource has its own ACL.
            └── foo/
                ├── fcr-container.nt           <-- Required file for holding user-properties describing the archival part container.
                ├── fcr-container~fcr-acl.nt   <-- Optional, only present if this Fedora resource has its own ACL.
                ├── bar.xml
                ├── bar.xml~fcr-desc.nt        <-- Required "binary description".
                └── bar.xml~fcr-acl.nt         <-- Optional, only present if this Fedora resource has its own ACL.

Naming conventions

System content (located in the ".fcrepo/" directory)

  • The archival group container's header file: the filename is always "fcr-root.json". This allows Fedora to find the header file without knowing if the resource is a binary, container or archival group.
  • The archival group container ACL's header file: the filename is always "fcr-root~fcr-acl.json".
  • The archival part container's header file: the filename is the name of the archival part container with the addition of the ".json" extension. ("foo.json" in this example)
  • The archival part container ACL's header file: the filename is the name of the archival part container with the addition of the suffix, "~fcr-acl.json". ("foo~fcr-acl.json" in this example)
  • The binary's header file: the filename is the name of the binary with the addition of the ".json" extension. ("image.tiff.json" and "foo/bar.xml.json" in this example)
  • The binary description's header file: the filename is the name of the binary with the addition of the suffix, "~fcr-desc.json". ("image.tiff~fcr-desc.json" and "foo/bar.xml~fcr-desc.json" in this example)
  • The binary ACL's header file: the filename is the name of the binary with the addition of the suffix, "~fcr-acl.json". ("image.tiff~fcr-acl.json" and "foo/bar.xml~fcr-acl.json" in this example)

User content

  • The archival group container user-properties and sub-container user-properties: the filename is always "fcr-container.nt".
  • The archival group container's ACL and sub-container's ACL: the filename is always "fcr-container~fcr-acl.nt".
  • The binary: the filename provided during ingest of the archival part binary is retained in the OCFL persistence. ("image.tiff" and "foo/bar.xml" in this example)
  • The binary's description: the filename is the name of the binary with the addition of the suffix, "~fcr-desc.nt". ("image.tiff~fcr-desc.nt" and "foo/bar.xml~fcr-desc.nt" in this example)
  • The binary's ACL: the filename is the name of the binary with the addition of the suffix, "~fcr-acl.nt". ("image.tiff~fcr-acl.nt" and "foo/bar.xml~fcr-acl.nt" in this example)
  • The sub-container(s): the filename provided during ingest of the archival part container is retained in the OCFL persistence. ("foo/" in this example)

Storage Hierarchy Layout

The content that Fedora persists (user and system files) is a complete representation of the repository. These files are written to OCFL objects within a top-level storage hierarchy. This section describes the storage hierarchy layout and the algorithm for mapping an OCFL object identifier to its storage hierarchy path.

Unless configured otherwise, Fedora uses the default configuration of the "OCFL Community Extension 0003: Hashed Truncated N-tuple Trees for OCFL Storage Hierarchies" (0003-hashed-n-tuple-trees). That configuration is:

No Format
{
    "digestAlgorithm": "sha256",
    "caseMapping": "toLower",
    "tupleSize": 3,
    "numberOfTuples": 3,
    "shortObjectRoot": false
}

The SHA-256 hash of a Fedora resource's ID is calculated, converted to lowercase, then split into three-character segments. The first three segments are used in creating the first three levels of directories below the OCFL storage root. The OCFL object is persisted within a fourth-level directory, which is the entire, lowercase, SHA-256 value previously calculated. The extension definition provides an example.
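The mapping described above can be sketched in a few lines. This is an illustration under the default configuration shown earlier, not Fedora's implementation; the function name `storage_path` is hypothetical, and the resulting path is relative to the OCFL storage root.

```python
import hashlib


def storage_path(object_id: str, tuple_size: int = 3, number_of_tuples: int = 3) -> str:
    """Map an OCFL object ID to its path below the storage root.

    Hypothetical helper following the default hashed n-tuple settings:
    sha256, lowercase, three tuples of three characters, full digest as
    the object root directory.
    """
    digest = hashlib.sha256(object_id.encode("utf-8")).hexdigest().lower()
    tuples = [
        digest[i * tuple_size:(i + 1) * tuple_size]
        for i in range(number_of_tuples)
    ]
    # The final directory is the complete digest (shortObjectRoot is false).
    return "/".join(tuples + [digest])


print(storage_path("info:fedora/foo"))
```

The result has four path segments: three 3-character directories taken from the start of the digest, followed by the full 64-character digest.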

Specification of Fedora-specific OCFL Files

Every Fedora OCFL object contains a ".fcrepo/" directory in the root of its "content/" directory. This directory contains JSON "header" files that describe the resources contained in the OCFL object. They contain information that enables Fedora to know how it is supposed to interact with a resource. Header files must be identifiable by Fedora without knowing anything about a resource other than its ID.

Outside of the ".fcrepo/" directory, all other OCFL object files store the content of Fedora resources. Every Fedora resource is persisted to OCFL using exactly two files. The first is a header file located in ".fcrepo/" and the second is a content file located in the root of the OCFL object's "content/" directory. The naming and location of content files can vary depending on the type of resource they belong to. It is not necessarily possible to identify a resource's content file without first determining what type of resource it is by reading its header file.

RDF Files

Fedora creates RDF files for three purposes:

  1. For persisting WebACLs associated with a given Fedora resource
  2. For persisting user-provided descriptions of binary resources
  3. For persisting user-provided properties associated with container resources

ACLs

The RDF content of WebACLs is user-provided, and must conform to the Web Access Control specifications. WebACL files are optional, and will only exist if the user creates an ACL on a Fedora resource.

Binary Descriptions

The RDF content of binary descriptions is user-provided, and can be any valid RDF. By convention, the content of the RDF should be used to describe the associated binary resource. Binary description files must exist for each binary resource and are auto-created by Fedora as empty files when a binary resource is created.

Container Properties

The RDF content of container resources is user-provided, and can be any valid RDF. Container property files must exist for each container resource and are auto-created by Fedora as empty files when a container resource is created.

Header Files

For a full description of the contents of Fedora header files, see Fedora Header Files.

The resource naming restrictions listed under Collisions are necessary to prevent server-managed files from being overwritten while avoiding a complete segregation of server-managed and user-managed files into separate directories within the OCFL object.