...

  1. Minimize change to the user via the API
  2. Retain URLs of migrated Fedora resources
  3. Compliance with OCFL
  4. Do not allow OCFL-isms to bleed into the Fedora API
  5. Rebuildability
  6. Performance
  7. Reduce complexity of implementation

Issues being addressed

Based on feedback from users of Fedora 4 and 5, the design for the next major release of Fedora will address the following issues:

Preservation persistence

The notions of "completeness" and "transparency" are important when it comes to how a preservation repository persists its resources (metadata and binaries) to storage. The resources in storage should be "complete" in the sense that Fedora should be able to rebuild its indexes based solely on what is stored on disk as files. Because the layout of those persisted files is documented and transparent, other applications can also consume those resources. See section below: OCFL persistence.
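As a rough illustration of "completeness", the sketch below rebuilds a parent-to-children index using nothing but files on disk. The `*.headers.json` sidecar files and their `id` and `parent` fields are hypothetical names chosen for this example; they are not taken from the Fedora design or from OCFL.

```python
import json
from pathlib import Path


def rebuild_index(storage_root):
    """Rebuild a parent -> [child ids] index purely from files on disk.

    Assumption: every resource has a hypothetical "<name>.headers.json"
    sidecar file holding at least {"id": ..., "parent": ...}.
    """
    index = {}
    # Sort the paths so the rebuilt index is identical on every run.
    for header_file in sorted(Path(storage_root).rglob("*.headers.json")):
        headers = json.loads(header_file.read_text())
        index.setdefault(headers["parent"], []).append(headers["id"])
    return index
```

Since the index is derived entirely from the stored files, it can be discarded and regenerated at any time without loss of information.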

Query service

The ability to query Fedora for basic information regarding the contents of the repository has been a missing feature in Fedora 4 and 5. This design will include a simple query service for inspecting all of the Fedora resources or resources based on specific attributes. See section below: Query service.

OCFL persistence

Architecture

...

  1. Provide new implementation of fcrepo-kernel-api that interacts with OCFL persistence
  2. Interactions with OCFL persistence should initially take advantage of the JHU OCFL client
  3. For pre-existing OCFL storage hierarchies, Fedora imposes the following constraint:
    1. The OCFL storage hierarchy must have a single, consistent "ocfl_layout" (i.e. the storage path mapping algorithm must be deterministic)
  4. Many members: performance should improve significantly since the list of members will be supplied by a database index (which should support a degree of in-memory caching)
  5. Deleting tombstone of OCFL Object purges the Object

  6. Deleting tombstone of "constituent part" is not supported (405)
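To illustrate what a deterministic storage path mapping means in practice, here is a minimal sketch that derives a path from an object identifier by hashing it. The function name, the SHA-256 choice, and the 3x3 segment scheme are assumptions made for this example; they do not describe the layout used by the JHU OCFL client or mandated by Fedora.

```python
import hashlib


def storage_path(object_id, segments=3, width=3):
    """Map an object id to a storage path in a deterministic way.

    Hypothetical scheme: hash the id, use the leading characters of the
    digest as nested directories, and use the full digest as the leaf.
    """
    digest = hashlib.sha256(object_id.encode("utf-8")).hexdigest()
    parts = [digest[i * width:(i + 1) * width] for i in range(segments)]
    return "/".join(parts + [digest])
```

The same identifier always yields the same path, so an object can be located from its identifier alone, without consulting an external lookup table.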

Prototyping proposal

  1. Expose JHU OCFL client functionality with minimal HTTP endpoints
    1. Such an endpoint should implement minimal LDP interactions
  2. Use HTTP over OCFL to test:
    1. Performance bottlenecks
    2. Scale viability (e.g. NLM migration)
    3. User expectations, ergonomics

...

  1. Requirements:
    1. Check fixity of binary resource(s) by comparing computed value with stored value
    2. Check fixity of binary resource(s) given a specific set of Fedora object rdf:types
    3. Persist results of fixity check
      1. In log file?
      2. In database?
      3. In Fedora?
  2. Scheduled fixity service:
    1. Probably not part of the core
    2. Run as a separate service (see: Riprap)
    3. Scheduling potentially implemented as a circular queue of Fedora resources, ordered by the "last fixity check" date property on each Fedora resource
  3. Retain "fedora:hasFixityService" triple or header
  4. At the OCFL level, there is interest in providing fixity checks over an OCFL storage hierarchy
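The first requirement and the queue ordering above can be sketched as follows. This is a minimal illustration, assuming SHA-512 digests and a hypothetical `last_fixity_check` key holding an ISO 8601 timestamp; none of these names come from Fedora or Riprap.

```python
import hashlib
from datetime import datetime, timezone


def compute_digest(path, algorithm="sha512", chunk_size=1 << 20):
    """Stream the file in chunks so large binaries are never fully in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def check_fixity(path, stored_digest, algorithm="sha512"):
    """Compare the computed digest with the stored one and return a result record."""
    computed = compute_digest(path, algorithm)
    return {
        "path": str(path),
        "algorithm": algorithm,
        "computed": computed,
        "stored": stored_digest,
        "ok": computed == stored_digest.lower(),
        "checked_at": datetime.now(timezone.utc).isoformat(),
    }


def next_to_check(resources):
    """Order resources so the least recently checked come first.

    Resources that were never checked (missing or empty timestamp) sort first.
    """
    return sorted(resources, key=lambda r: r.get("last_fixity_check") or "")
```

The record returned by `check_fixity` is a plain dictionary, so it could be written to a log file, a database, or Fedora itself, whichever of the open persistence options is chosen.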

Query service

  1. Should also consider this "Query Service Specification"
  2. Proposal: Query service / endpoint should support the following queries:
    1. List all resources
    2. List resources by mimetype
    3. List resources by parent
    4. List resources by mimetype, parent, and modified date (<, >, =)
    5. List resources where modified date is < or > a given date x
  3. Open questions around scope of resources to be searchable
    1. Fedora resources?
    2. Resources defined in RDF documents within the repository?
    3. Hash URIs?
  4. Open questions around properties to support
    1. Server-managed triples?
    2. All properties?
  5. Triplestore not necessarily required
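The proposed queries map naturally onto a small relational index, which is one way to see why a triplestore is not necessarily required. The sketch below uses an in-memory SQLite table; the table name, column names, and function signatures are assumptions made for illustration and do not represent a proposed Fedora schema or endpoint.

```python
import sqlite3

ALLOWED_OPS = {"<", ">", "<=", ">=", "="}


def build_index(rows):
    """Create an in-memory index from (id, parent, mimetype, modified) tuples."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE resources ("
        "id TEXT PRIMARY KEY, parent TEXT, mimetype TEXT, modified TEXT)"
    )
    conn.executemany("INSERT INTO resources VALUES (?, ?, ?, ?)", rows)
    return conn


def list_resources(conn, mimetype=None, parent=None, modified_op=None, modified=None):
    """List resource ids, optionally filtered by mimetype, parent, and modified date."""
    clauses, params = [], []
    if mimetype is not None:
        clauses.append("mimetype = ?")
        params.append(mimetype)
    if parent is not None:
        clauses.append("parent = ?")
        params.append(parent)
    if modified_op is not None:
        # Only whitelisted operators are ever interpolated into the SQL text.
        if modified_op not in ALLOWED_OPS:
            raise ValueError("unsupported operator: " + modified_op)
        clauses.append("modified " + modified_op + " ?")
        params.append(modified)
    sql = "SELECT id FROM resources"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    sql += " ORDER BY id"
    return [row[0] for row in conn.execute(sql, params)]
```

Dates are stored as ISO 8601 strings in a single consistent format, so plain string comparison gives the correct chronological ordering.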

...

  1. Proposal: no change to the Fedora API spec in 6
  2. We will either:
    1. align code with the (as-yet-to-be-ratified) side-car specification
    2. leave HTTP API unchanged while introducing the possibility of auto-versioning on transaction completion
  3. Potentially store updates made within a transaction in a "txn/" directory that sits as a sibling of the OCFL version directories
  4. Support actions on multiple OCFL objects within a single transaction
  5. Deleting tombstone of OCFL Object purges the Object

  6. Deleting tombstone of "constituent part" is not supported (405)
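The "txn/" staging idea can be sketched as below. This is a deliberately simplified illustration: the function names are hypothetical, and the sketch ignores everything a real OCFL version requires (the inventory, digests, and the content directory). It only shows updates accumulating in a sibling directory and becoming the next version when the transaction completes.

```python
import os
from pathlib import Path


def stage(object_root, name, data):
    """Write an update into the object's txn/ staging directory."""
    txn = Path(object_root) / "txn"
    txn.mkdir(parents=True, exist_ok=True)
    (txn / name).write_bytes(data)


def commit(object_root):
    """Promote txn/ to the next vN directory and return its name."""
    root = Path(object_root)
    txn = root / "txn"
    if not txn.is_dir():
        raise FileNotFoundError("no open transaction")
    # Existing versions are directories named v1, v2, ...
    versions = [
        int(p.name[1:])
        for p in root.iterdir()
        if p.is_dir() and p.name.startswith("v") and p.name[1:].isdigit()
    ]
    next_version = root / ("v" + str(max(versions, default=0) + 1))
    # Renaming within one directory keeps the promotion a single filesystem step.
    os.rename(txn, next_version)
    return next_version.name
```

Under this arrangement, abandoning a transaction amounts to deleting the "txn/" directory, leaving the existing version directories untouched.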

Raw notes

  1. General VA Beach Meeting notes
  2. Design summary notes
  3. Migration notes
  4. Object modeling notes
  5. Versioning notes
  6. Fixity notes
  7. Bulk ingest notes
  8. Query service notes