...
- REST API
- Are immediate updates required?
- We should version the API independently
- Independent API versioning allows for multiple backend implementations/optimizations
- A. Soroka: I think this requires a stronger definition of the API than currently exists in the form of user documentation. I suggest defining the API as ontology extensions to LDP.
- Clarifying and publicizing (formally and informally) the relationship between the Fedora API and LDP.
- Performance
- Reads
- Writes
- Many small files
- Large files
- High throughput
- Scalable serialization to disk
- Need to measure scale of load that async serialization can meet
- Need to clarify async approaches: messaging and sequencers
- Replication of objects to another repository instance
- Full re-indexing
- Full integrity checks
- Multi-node / Clustered configurations
- High availability
- Bulk ingest
- High read loads
- Note: generally need to define what clustering provides
- ModeShape
- Assess persistence approach (i.e. bit-level object and datastream persistence)
- Some backup/restore details: Backup and Restore
- Evolution-capability - The system permits graceful (incremental) changes without having to perform replacement of large parts of the system in one step
- The software permits the graceful replacement of old technology with new technology
- The software permits the integration of new technology gracefully
- New content formats can be added easily, and the system permits gracefully delivering new representations for existing content
- New capabilities can be added or old ones replaced gracefully
- Underlying hardware and software infrastructures can be replaced gracefully, and the system can use advances in technology or special characteristics of its technical infrastructure without changing the core Fedora software
- Ability to use in various integration patterns
- Inbound and outbound transformation
- Content Enrichment pattern for ingest at least
- Internal and external event driven (notification) patterns (especially external notification that an asynchronous operation is complete)
- Idempotent receiver pattern
- Message Bridge pattern
- Storage Options
- Tiered-storage
- Support having all or part of the content on low-performance storage, including copies in near-offline storage
- Support having all or part of the content on offline storage (like tape - where items are not available until after staging)
- Support having meta information stored in offline or near-offline storage
- Support storage other than file systems and using that storage's special features
- Object stores like S3 or Isilon
- Streaming stores for low latency, low dropout functions such as audio and video delivery
- Tape
- Support specialized indices, particularly for locating copies, metadata, or discovery data, and for reducing latency
- Direct queries to an appropriate index
- Marshal results from multiple indices
- Preservation-worthiness
- These comments assume that the only form we currently know how to preserve is a serialized form; some features overlap
- Permit copies to be made, maintained and validated at one or more geographically remote locations
- All archivally significant data is, at some point, stored in a serialized form
- No notification that results in the destruction of the original source materials is issued until all steps of the preservation policy are executed and verified
- e.g. content progresses from a (possibly) non-serialized form to a serialized form, n copies are made, and the essential characteristics are then checked
- There is some definition of the essential characteristics of the representations that can be delivered for the unit of preservation
- There is some definition of the unit of preservation
- Bitstream level fixity of "preserved" representations can be verified
- Fixity of meta information can be verified
- Some approach to authenticity is selected and used, including at least lifecycle records (one kind of audit record)
- Records of system operations including configuration changes are kept (a second kind of audit record)
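As an illustration of the bitstream-level fixity item above, here is a minimal sketch of how a stored representation might be checked against a previously recorded digest. It is not Fedora's implementation: the function names `compute_digest` and `verify_fixity`, the choice of SHA-256, and the assumption that the bitstream is reachable as a local file are all hypothetical and chosen only for illustration.

```python
import hashlib


def compute_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of the file at `path`.

    The file is read in chunks so that large bitstreams do not have to
    fit in memory. `algorithm` is any name accepted by hashlib.new().
    """
    digest = hashlib.new(algorithm)
    with open(path, "rb") as stream:
        for chunk in iter(lambda: stream.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_fixity(path, recorded_digest, algorithm="sha256"):
    """Return True if the stored bitstream still matches its recorded digest.

    `recorded_digest` is assumed to be a hex string captured at ingest
    time (hypothetical; where it is kept is outside this sketch).
    """
    return compute_digest(path, algorithm) == recorded_digest.lower()
```

The same check could in principle be applied to a serialized form of the meta information, and its outcome recorded as one of the audit records mentioned above.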
...