Date: Thursday February 4, 9am PST
Attendees
- Mark Bussey
- Adam Wead
- Longshou Situ
- Vivian Chu
- Tom Johnson
- Rob Sanderson
- Esme Cowles
Agenda
- Goals of API (and SPI) work
- Defining an API with a clear spec, versioned independent of the implementation, etc.
- Including HTTP API, messaging
- Having a spec opens up the possibility of multiple implementations with different priorities
- We could use the existing (4.5.0) API as the baseline of the spec, and be thoughtful about changes going forward?
- Don't expect dramatic API change, but do expect some changes
- Maybe we should codify existing API, and also plan a new version that improves parts of the API, and have a predictable process for moving from one to the other
- Would like more predictability of API changes
- There are release candidates available for 2-4 weeks, and testing against them would help identify breaking changes earlier
- DCE supports multiple projects on multiple Fedora releases, and needs to manage changes
- How much of the API is stable? How do people know about upcoming changes?
- Some changes (e.g. removing JCR types) are known about long in advance, so communication and predictability could be improved
- The weekly Fedora committers call is a good way to know about changes, but too high an overhead for many people to participate in
- Roughly quarterly meetings (HydraConnect, LDCX, etc.) would be more convenient
- In favor of frequent releases, but not breaking changes
- Would like breaking changes to be less frequent and better communicated, to make it easier to test and adapt to them
- Discussion of proposed services, in the context of Hydra
- CRUD
- Aligned with LDP, so already specified
- Fedora's HTTP API docs also cover the particular implementation choices (e.g., Prefer headers supported)
- Fedora complies with the LDP spec and wants to keep compliant
- Fixity checking
- On upload, you can provide a checksum and it will be verified
- Hydra doesn't support this now, but it could
- May want to have a slightly different approach: upload and checksum at the same time, and then compare checksums
- On demand, you can check that the resource on disk matches the recorded checksum
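The "upload and checksum at the same time, then compare" approach mentioned under fixity checking could look roughly like the sketch below. It is only an illustration using Python's standard library: the function names `copy_with_digest` and `verify_upload`, the choice of SHA-1, and the in-memory streams are assumptions for the example, not part of Fedora or Hydra.

```python
import hashlib
import io


def copy_with_digest(source, destination, algorithm="sha1", chunk_size=64 * 1024):
    """Copy a stream to its destination, hashing the bytes in the same pass."""
    digest = hashlib.new(algorithm)
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        digest.update(chunk)        # checksum while uploading
        destination.write(chunk)    # stand-in for the actual upload
    return digest.hexdigest()


def verify_upload(source, destination, expected_hex, algorithm="sha1"):
    """Return True when the digest computed during the copy matches the supplied one."""
    actual = copy_with_digest(source, destination, algorithm)
    return actual == expected_hex.lower()


# Hypothetical payload and client-supplied checksum, for illustration only.
payload = b"example binary content"
claimed = hashlib.sha1(payload).hexdigest()
stored = io.BytesIO()
ok = verify_upload(io.BytesIO(payload), stored, claimed)
```

The same digest function could also serve the on-demand case by re-reading the stored bytes and comparing against the recorded checksum.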
- Versioning
- Existing versioning API is Fedora-specific
- The implementation is efficient and full-featured
- Implementing it might complicate other implementations
- The API spec should specify how an implementation that didn't support versioning would behave
- Or the API spec could require versioning, since many storage backends support versioning
- Would like to use the Memento API for version retrieval
- But there is no Memento spec for how to create versions
- Marmotta's Memento implementation isn't LDP-aligned, it just auto-versions triples
- Fedora could auto-version metadata to avoid needing to create them explicitly
- Non-versioning backends could just report the current version following the Memento spec
- But Fedora would need to have explicit versioning for binaries because of storage concerns
- Fedora also has an API to restore versions
- But that could be a COPY from the old version URI to the current URI
- ActiveFedora has limited support for versioning (files only), so it would need to add support for metadata versioning and subtree versioning
- Now would be a good time to change the API, since Hydra isn't really using it now
- Would be good to include the broader LDP community in the versioning API discussion to encourage an LDP-wide versioning approach
- Wouldn't mind having auto-versioning, but would still like to be able to tag/label specific versions
- Don't want lots of extra versions of files just because I version the metadata that links to them
- ActiveFedora can control this and decide when to create versions and/or label versions
- ACTION: Esme: Check whether creating a version of a tree also creates distinct versions of unchanged files
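As a rough illustration of the Memento points above, the sketch below builds the `Accept-Datetime` request header a client would send and the headers a non-versioning backend might return if it simply reports the current state as the only version. The header names come from the Memento specification (RFC 7089); the function names, the example URI, and the idea of reusing the last-modified time as `Memento-Datetime` are assumptions for illustration, not an agreed design.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def accept_datetime_header(when):
    """Build the request header asking for the state of a resource at `when` (UTC)."""
    return {"Accept-Datetime": format_datetime(when, usegmt=True)}


def current_version_headers(uri, last_modified):
    """Headers a non-versioning backend might send: the current state is the only version.

    Assumption: the resource acts as its own original, and its last-modified
    time stands in for the version datetime.
    """
    return {
        "Memento-Datetime": format_datetime(last_modified, usegmt=True),
        "Link": f'<{uri}>; rel="original"',
    }


# Hypothetical values for illustration only.
requested = datetime(2007, 5, 31, 20, 35, tzinfo=timezone.utc)
request_headers = accept_datetime_header(requested)
response_headers = current_version_headers(
    "http://localhost:8080/rest/works/1",
    datetime(2016, 2, 4, 17, 0, tzinfo=timezone.utc),
)
reported = parsedate_to_datetime(response_headers["Memento-Datetime"])
```

A client written against these headers would not need to know whether the backend keeps a real version history.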
- Transactions
- Would like to consider all the changes in a transaction as a version
- Can do this now by opening a transaction, making changes, creating a version, and then committing the transaction
- Somewhat awkward for a RESTful API, so there is probably not an existing standard
- The current API is a good strawperson
- Haven't heard any complaints about the API, non-Hydra clients are using it
- Current discussion about what aspects of ACID Fedora supports
- Definitely Atomicity and Durability
- Atomicity might require all items to happen at the same time – would be hard to support in a distributed environment
- Want to make it as easy as possible to support diverse backends and scalability requirements
- Consistency and Isolation might be limited
- Different implementations might have different levels (e.g. snapshot isolation vs. read-uncommitted), and implementations should advertise what they support
- ACID is a set of guarantees for all updates, not just transactions, so it's important to consider them more broadly
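The "transaction as a version" workflow described above (open a transaction, make changes, create a version, commit) can be written down as an ordered request plan. The sketch below only builds the plan and sends nothing. The endpoint paths (`fcr:tx`, `fcr:versions`, `fcr:tx/fcr:commit`) and the `Slug` header for the version label are assumptions modelled on the Fedora 4 REST API documentation and should be checked against the release in use; `Step` and `plan_versioned_transaction` are hypothetical names.

```python
from typing import Dict, List, NamedTuple


class Step(NamedTuple):
    method: str
    url: str
    headers: Dict[str, str]


def plan_versioned_transaction(base_url, tx_id, resource_path, label) -> List[Step]:
    """List the requests for one transaction whose changes end up as a single version.

    `tx_id` is a placeholder: in practice it would be taken from the response
    to the first request, which opens the transaction.
    """
    tx_root = f"{base_url}/{tx_id}"
    return [
        # 1. Open the transaction.
        Step("POST", f"{base_url}/fcr:tx", {}),
        # 2. Make changes inside the transaction.
        Step("PATCH", f"{tx_root}/{resource_path}",
             {"Content-Type": "application/sparql-update"}),
        # 3. Create a labelled version covering those changes.
        Step("POST", f"{tx_root}/{resource_path}/fcr:versions", {"Slug": label}),
        # 4. Commit the transaction.
        Step("POST", f"{tx_root}/fcr:tx/fcr:commit", {}),
    ]


# Hypothetical repository URL, transaction id, path, and label.
plan = plan_versioned_transaction(
    "http://localhost:8080/rest", "tx:123", "works/1", "v1"
)
for step in plan:
    print(step.method, step.url)
```

Keeping the version request inside the transaction is what ties the version to the whole set of changes; whether an implementation can guarantee that depends on the isolation level it supports.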
- Authorization
- Fedora provides authorization, but Hydra (for historical reasons) doesn't take advantage of it
- Hydra does use WebACLs, but the implementation is different from what Fedora expects, so they are not compatible
- We should align them so Fedora could enforce Hydra's WebACLs for other clients
- Hydra also currently cannot provide the user who is making a request, which would be needed to enforce the WebACLs
- ActiveFedora would need to be refactored to allow per-request identification of the end user making the request
- ACTION: Adam and Esme will compare Fedora and Hydra WebACLs to see where they differ
- Fedora authorization assumes either the client or the servlet container is handling authentication and group membership information
- If there is an IndirectContainer, I shouldn't be able to use it to add triples to resources I don't have permission to write to
- ACTION: Rob will create a ticket to investigate this
- Other API concerns
- Would like to have some kind of packaged version of all of the resources that make up a Work
- There is a Camel component that can sync updates to a triplestore, disk, etc. which might meet this need.
- An RDF import/export functionality (like the current JCR/XML import/export functionality) would also meet this need, and could be a useful bulk edit API to address other concerns about the performance of editing multiple related resources.
- Multi-resource CRUD
- PCDM and Hydra Works mean that many users who used to have a single Fedora 3 object now have many Fedora 4 resources.
- It would be great to have LDP community agreement on how this should work
- We can all join the LDP Next mailing list and discuss our approach, and then implement it in Fedora
Reference
Notes