...
- Danny Bernstein
- Andrew Woods
- David Wilcox
- Peter Winckles
- Ben Cail
- Aaron Birkland
- Ben Pennell
- Paul Cummins
- Bethany Seeger
- Jared Whiklo
- Yinlin Chen
Part 2:
- Danny Bernstein
- Andrew Woods
- Peter Winckles
- Ben Cail
- Aaron Birkland
- Ben Pennell
- Jared Whiklo
- Peter Eichman
Agenda
- Announcements
- #sprints channel
- #github channel?
- kubernetes support in Deployment Tooling
- #general channel community issue
- fcrepo-upgrade-utils
- Minimal 4 → 5 migration needs testing and code review: Kubernetes support
- Update on Fedora 6 Pilots (NLM, Docuteam, UWM)
- migration-utils PRs
- Java 11: When?
- Sprint Planning
- 6.0 Architecture Review
- Transaction Sidecar Spec Update
- OCFL community feedback on OCFL client with Database as authoritative metadata source
- https://github.com/pwinckles/ocfl-java-parent
- Problems requiring design
- Tombstone Support in 6.0.0
- Caching/indexing strategy
- What caches and indexes do we need (i.e., in what layer(s))?
- OCFL client
- Persistence Implementation (OCFL)
- Kernel Implementation
- Physical location of the cache (assuming we want to plan for horizontal scalability support)
- Cache per instance?
- synchronizing changes across instances
- 1 Global cache/index?
- Your topic here...
...
In Review
- DuraSpace JIRA filter 13100
Please squash a bug!
- DuraSpace JIRA filter 13122
Tickets resolved this week:
- DuraSpace JIRA filter 13111
Tickets created this week:
- DuraSpace JIRA filter 13029
Notes
- Announcements
...
- 5.1.0 release is out
- Includes bugfix that allows import/export to work correctly
- Testing of round-tripping is encouraged
- Michael Klein has put out a new Docker image
Fedora 6 Pilots
General updates
- Three pilot institutions
- DocuTeam
- Natl Library of Medicine
- Univ of Wisconsin - Madison
- It would be great to have additional "partners"
- Even if not as full "pilots"
- Collecting requirements from pilot partners
- Pilot effort will be running for the ~next year
- Pilot meetings will be scheduled as needed
- There is an open #fedora6-pilots Slack channel
OCFL Implementation
- Peter W has put something together recently
- Open questions around what the interface should look like
- Currently supports PUT/GET... and most of the OCFL spec
F6 Architecture
- Important for us to establish a shared understanding of transactions, locking, etc., and of how the architectural layers should interact
- https://wiki.duraspace.org/display/FF/Fedora+2019+Architectural+Diagrams
- HTTP Layer remains intact
- Kernel Layer
- Could potentially support swappable implementations
- Session manages transactions/locking
- Persistence Layer
- Some features from prior ModeShape layer will move here
- OCFL Layer
- State may be maintained at different layers
- Transactions and Locking
- Currently, txns are supported across resources
- Scoping txns to a single OCFL object may make sense
- If we want to scope txns across objects, that would have to be managed at the Fedora level... and the problem becomes more challenging
- Proposal: scope txns to a single OCFL object
- Primary use case: batch loader creating a complex resource
- Do we want to consider architecting for cross-object txns?
- Design the HTTP API to potentially support other scopes of txn, but implement for the single OCFL object case
- Archive Groups may be a door for broader scope for txns
- Would be nice if the txn-endpoint were advertised as a resource header... as opposed to a triple
- Ideally not coupling txn-model to ldp-model
- Bulk updates require cross-object txns
- Do we need to maintain session state across clients?
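The proposal to scope txns to a single OCFL object could look roughly like the sketch below. It is illustrative only: `SingleObjectTxn`, `ScopeError`, and `first_segment` are hypothetical names, the rule that the first path segment identifies the OCFL object is an assumption (an Archival Group root could play that role instead), and a plain dict stands in for the persistence layer.

```python
class ScopeError(Exception):
    """Raised when a write falls outside the transaction's OCFL object."""


def first_segment(resource_path):
    # Assumption: the first path segment names the OCFL object.
    return resource_path.strip("/").split("/")[0]


class SingleObjectTxn:
    """Hypothetical transaction that only accepts writes to one OCFL object."""

    def __init__(self, object_of=first_segment):
        self.object_of = object_of  # maps a resource path to an object id
        self.object_id = None       # fixed by the first write
        self.pending = {}           # resource path -> content, not yet visible

    def put(self, resource_path, content):
        object_id = self.object_of(resource_path)
        if self.object_id is None:
            self.object_id = object_id
        elif object_id != self.object_id:
            raise ScopeError(
                f"{resource_path} is outside OCFL object {self.object_id}"
            )
        self.pending[resource_path] = content

    def commit(self, store):
        # Apply all pending writes at once, then clear the transaction.
        store.update(self.pending)
        self.pending = {}

    def rollback(self):
        # Nothing has reached the store yet, so discarding is enough.
        self.pending = {}
```

A batch loader building one complex resource would call `put` for each part under the same object and then `commit` once; a write aimed at a second object fails early instead of requiring cross-object coordination.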
...The conversation will be continuing at 1pm ET
Horizontal scalability:
- Aaron Birkland: sticky sessions are probably as far as we want to go; distributed transaction management is going to be very complicated.
migration-utils
Mike Durbin created migration-utils to facilitate Fedora 3 to 4 migration. Andrew has extended that project to migrate from Fedora 3 to OCFL.
Expectations for migration-utils:
1) Akubra
2) Legacy FS
3) Archival Export of Fedora 3
- Mike's code iterates the tree and migrates the resources.
- Andrew Woods has implemented an OCFL writer using Aaron Birkland's OCFL Go client.
- Results were promising using Peter Winckles's data from UW-Madison
- Question: should we consider using Peter Winckles's Java client instead of, or in addition to, the Go client?
It would be good to fix the travis.yml file to install Go and the ocfl client.
Meeting part II
Transactions
Action on transactions? Seems like we don't want to preclude arbitrary resources in transactions
Also may need to look at the transactions API. The existing transaction API isn't part of the Fedora spec. There is a draft of a new API, but it has no standing
Action: Peter Eichman and Ben Pennell to gather opinions on transactions and APIs
Question: How much do we have to plan about distributed/multiple Fedora instances?
To what extent do we have to plan for this pathway?
Ben Cail: Not interested in multiple Fedora instances. Don't anticipate needing to scale, don't want to worry about concurrency. Would look at it if it came for free
Maybe that's a question that should go out to the community?
Don't want to get into transaction management.
State/locking
How much can we guarantee?
If we do a transaction involving bits and pieces of multiple OCFL objects, is it even possible to roll that back without direct support for these sorts of transactions in the OCFL client? Probably not.
Strawman: What if an OCFL client supported such transactions like this? How could that be implemented? Is it even desirable?
- Write file content to staging places as usual
- If a failure/rollback happens here, we just have un-referenced staged content we can garbage collect later (GC)
- Maintain a database table that authoritatively stores OCFL metadata (that which gets written to inventory files)
- Upon "commit", copy files to the right location in OCFL
- If this fails, then these copied files can be GC'd later. They aren't referenced by any inventory files, so as far as other OCFL clients are concerned, the incompletely-copied files are invisible. That being said, the OCFL objects they've been copied to are technically invalid until these files are removed.
- Commit (or roll back) that db table. Report transaction success if that succeeds
- Asynchronously, start writing the inventory files to OCFL that reference the copied content.
- If there's a failure here, that's OK. The authoritative DB still has the information in it to re-try until all the inventory files on the FS agree with the db.
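The strawman steps above can be sketched as follows. This is a toy model under stated assumptions, not an OCFL client: `StrawmanClient` and its methods are hypothetical, SQLite stands in for the authoritative database, object ids are assumed to be safe directory names, and the `inventory.json` written here is a simplified stand-in that omits digests, versions, and everything else a real OCFL inventory requires.

```python
import json
import os
import shutil
import sqlite3
import tempfile


class StrawmanClient:
    """Hypothetical client: the database is authoritative, inventories lag behind."""

    def __init__(self, root):
        self.staging = os.path.join(root, "staging")
        self.storage = os.path.join(root, "storage")
        os.makedirs(self.staging, exist_ok=True)
        os.makedirs(self.storage, exist_ok=True)
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE entries (object_id TEXT, path TEXT, synced INTEGER DEFAULT 0)"
        )
        self.counter = 0

    def stage(self, txn, object_id, path, data):
        # Step 1: write content to staging. A failure leaves only
        # unreferenced staged files that can be garbage collected later.
        self.counter += 1
        staged = os.path.join(self.staging, f"{self.counter}.tmp")
        with open(staged, "wb") as fh:
            fh.write(data)
        txn.append((object_id, path, staged))

    def commit(self, txn):
        # Step 2: copy staged files into place. Until the database commit
        # below succeeds, nothing references these copies.
        for object_id, path, staged in txn:
            target = os.path.join(self.storage, object_id, path)
            os.makedirs(os.path.dirname(target), exist_ok=True)
            shutil.copyfile(staged, target)
        # Step 3: commit the authoritative metadata in one database
        # transaction (rolled back automatically if an exception is raised).
        with self.db:
            self.db.executemany(
                "INSERT INTO entries (object_id, path) VALUES (?, ?)",
                [(object_id, path) for object_id, path, _ in txn],
            )

    def sync_inventories(self):
        # Step 4: run asynchronously and retried until the inventory files
        # on disk agree with the database.
        pending = self.db.execute(
            "SELECT DISTINCT object_id FROM entries WHERE synced = 0"
        ).fetchall()
        for (object_id,) in pending:
            rows = self.db.execute(
                "SELECT path FROM entries WHERE object_id = ? ORDER BY path",
                (object_id,),
            )
            paths = [row[0] for row in rows]
            inventory = os.path.join(self.storage, object_id, "inventory.json")
            with open(inventory, "w") as fh:
                json.dump({"id": object_id, "files": paths}, fh)
            with self.db:
                self.db.execute(
                    "UPDATE entries SET synced = 1 WHERE object_id = ?",
                    (object_id,),
                )
```

In this sketch a transaction is just a list passed to `stage` and then `commit`; because the single database commit covers every object touched, the transaction can span multiple OCFL objects, while `sync_inventories` can fail and be re-run without losing information.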
...
- sprints channel open & participants invited
- #github - Slack integration with github for PRs, ... - Jared will create the new #github channel.
- What should we do with GitHub issues?
- fcrepo4-docker - has kubernetes support now: Deployment Tooling
- #general channel - issue with simple file persistence error. Size is wrong.
- point Islandora to mysql or postgres for default configuration.
- look at updating documentation.
- The hyrax setup instructions (https://github.com/samvera/hyrax/wiki/Production-Installation-Overview)
links to Deploying Fedora 4 Complete Guide
which says to use https://github.com/fcrepo4/fcrepo4/blob/4.7-maintenance/fcrepo-configs/src/main/resources/config/file-simple/repository.json
which uses the simple file system persistence approach
- fcrepo-upgrade-utils
- should work for basic setups. People should try it out. Doesn't work with ACLs yet, but that's next.
- Fedora 6 pilots
- OCFL client (https://github.com/pwinckles/ocfl-java-parent) - implements most of the spec.
- API - put/get at object level, read/update for individual files. Automatically commits (as a single version) at the end of an object update.
- Doesn't handle multiple clients writing at the same time (& Go client doesn't currently support that, either).
- OCFL spec talks about deposit directory for staging content, but defines content at rest, not content in motion.
- Aaron suggested OCFL RFCs repo - implementation patterns, definitions, clarifications, ...
- What functionality should an OCFL client have? It should be written up.
- Java 11
- Peter's code uses 11. Should make a decision at some point and switch everything.
- Ideal scenario: all Fedora ecosystem tooling (fcrepo4, fcrepo-camel-toolbox, fcrepo-import-export, fcrepo-upgrade-utils) will compile Java 11 source.
- Bytecode compatibility? Should that go to 11 as well?
- Draft JVM policy re-wording https://docs.google.com/document/d/1ulXLOF2dL-vEUKhUFjkL8UG97wjXJ-9dbmytKVwzDMM/edit
Current Oracle LTS is Java 11
We should likely move to Java 11 this calendar year
How hard is it for the community to move from Java 8 to Java 11?
Andrew to email to community list
What is our plan?
LTS releases: 4.7
Fedora 6 to use openjdk-11
needs Java 11 to run
- Sprint planning
- OCFL implementations
- OCFL Community meeting next week, Wednesday
Conversation to continue on Slack
- Request for Aaron B to document thoughts
- Separation of concerns: what does Fedora need to know, what is internal to the OCFL client
- Danny to update diagram (include 'deposit' space and 'versions')
- Transaction sidecar spec
- https://github.com/fcrepo/fcrepo-specification-atomic-operations
- Peter E is working on this
- Interest in scoping transactions at the Archival-Group
- OCFL client with database as metadata store
- https://wiki.duraspace.org/display/FF/Fedora+2019+Architectural+Diagrams#Fedora2019ArchitecturalDiagrams-OCFLClientwithpersistentdatabaseasauthoritativemetadatasource
- Diagram needs to be re-worked
- Fedora transactions will need support from underlying OCFL client
- An OCFL client backed by a database may be able to support transactions that span multiple OCFL objects
- Diagram details an OCFL interaction with transactions scoped to a single OCFL object
- Todo: design how an OCFL client would support transactions over a single OCFL object
- Todo: design how an OCFL client would support transactions over multiple OCFL objects
Actions
- Danny Bernstein will reach out to Greg about DRASTIC test results
- Aaron Birkland to work with Andrew Woods to get the Go client working on travis.
- Aaron Birkland to explore the notion of an OCFL client with a database as the authoritative metadata source + asynchronous writing of the inventory.json file
- Peter Eichman and maybe Ben Pennell to make recommendations re transaction sidecar specification.
...