    • Are immediate updates required?
    • We should version the API independently
      1. This offers multiple backend implementations/optimizations
      2. A. Soroka: I think this requires a stronger definition of the API than currently exists in the form of user documentation. I suggest defining the API as ontology extensions to LDP.
      3. Clarifying and publicizing (formally and informally) the relationship between the Fedora API and LDP.
  2. Performance
    1. Reads
    2. Writes
      1. Many small files
      2. Large files
      3. High throughput
    3. Scalable serialization to disk
      • Need to measure scale of load that async serialization can meet
      • Need to clarify async approaches: messaging and sequencers
    4. Replication of objects to another repository instance
    5. Full re-indexing
    6. Full integrity checks
  3. Multi-node / Clustered configurations
    1. High availability
    2. Bulk ingest
    3. High read loads
    • Note: generally need to define what clustering provides
  4. ModeShape
    1. Assess persistence approach (i.e. bit-level object and datastream persistence)
      1. Some backup/restore details: Backup and Restore
  5. Evolution-capability - The system permits graceful (incremental) changes without having to perform replacement of large parts of the system in one step
    1. The software permits the graceful replacement of old technology with new technology
    2. The software permits the integration of new technology gracefully
    3. New content formats can be added easily, and the system permits gracefully delivering new representations for existing content
    4. New capabilities can be added or old ones replaced gracefully
    5. Underlying hardware and software infrastructures can be replaced gracefully, and the system can use advances in technology or special characteristics of its technical infrastructure without changing the core Fedora software
  6. Ability to use in various integration patterns
    1. Inbound and outbound transformation
    2. Content Enrichment pattern, at least for ingest
    3. Internal and external event-driven (notification) patterns (especially external notification that an asynchronous operation is complete)
    4. Idempotent receiver pattern
    5. Message Bridge pattern
  7. Storage Options
    1. Tiered-storage
      1. Support having all or part of the content on low-performance storage, including copies in near-offline storage
      2. Support having all or part of the content on offline storage (like tape - where items are not available until after staging)
      3. Support having meta information stored in offline or near-offline storage
    2. Support storage other than file systems and using that storage's special features
      1. Object stores like S3 or Isilon
      2. Streaming stores for low latency, low dropout functions such as audio and video delivery
      3. Tape
    3. Support having specialized indices, particularly for locating copies, metadata or discovery data, and for reducing latency
      1. Direct queries to an appropriate index
      2. Marshal results from multiple indices
  8. Preservation-worthiness
    1. These comments are based on the assumption that the only form we currently know how to preserve is a serialized form; some features also overlap
    2. Permit copies to be made, maintained and validated at one or more geographically remote locations
    3. All archivally significant data is, at some point, stored in a serialized form
    4. No notification that results in the destruction of the original source materials is issued until all steps of the preservation policy are executed and verified
      1. e.g. content progresses from a (possibly) non-serialized form to a serialized form and n copies are made, followed by a check of essential characteristics
      2. There is some definition of the essential characteristics of the representations that can be delivered for the unit of preservation
      3. There is some definition of the unit of preservation
    5. Bitstream level fixity of "preserved" representations can be verified
    6. Fixity of meta information can be verified
    7. Some approach to authenticity is selected and used, including at least lifecycle records (one kind of audit record)
    8. Records of system operations including configuration changes are kept (a second kind of audit record)
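
The bitstream-level fixity and "n copies" items under Preservation-worthiness could be illustrated with a minimal sketch like the one below. It is not Fedora's implementation. It assumes that SHA-256 is the chosen digest, that each copy is reachable as a local file path, and that a digest was recorded at ingest. The function names (sha256_of, verify_copies, safe_to_release_source) are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 so large bitstreams are never loaded whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as stream:
        for chunk in iter(lambda: stream.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_copies(recorded_digest, copy_paths):
    """Map each copy path to True if the file exists and matches the recorded digest."""
    results = {}
    for path in copy_paths:
        p = Path(path)
        results[str(p)] = p.is_file() and sha256_of(p) == recorded_digest
    return results


def safe_to_release_source(recorded_digest, copy_paths, required_copies):
    """Assumed policy: allow the 'source may be destroyed' notification only
    once at least required_copies copies have verified fixity."""
    results = verify_copies(recorded_digest, copy_paths)
    return sum(results.values()) >= required_copies
```

A caller would run safe_to_release_source with the digest recorded at ingest and the locations of the remote copies, and issue the notification only when it returns True. Verifying the fixity of meta information would follow the same shape, applied to the serialized metadata.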
