Time/Place
This meeting is a hybrid teleconference and Slack chat. Anyone is welcome to join. Here's the info:
- Time: 11:00am Eastern Daylight Time US (UTC-4)
- Audio/Video Conference Link: https://lyrasis.zoom.us/my/fedora
- Dial-in:
+1 408 638 0968
+1 646 876 9923
+1 669 900 6833
Meeting ID:
812 835 3771
- Slack:
Join fedora-project.slack.com on the "tech" channel
Attendees
Part 1:
- Danny Bernstein
- Peter Winckles
- Andrew Woods (out)
- David Wilcox
- Peter Eichman
- Joshua Westgard (out)
- Jared Whiklo
- Bethany Seeger
- Youn Noh
- Thomas Bernhart
- Ben Cail
- Rosie Le Faive
- Daniel Lamb
- Aaron Birkland
- Ben Pennell
Part 2:
Agenda
- Announcements
- Andrew, Danny and David will be out next week - volunteer to facilitate this meeting?
- OCFL and Fedora: inventory.json bloat and what to do about it. Is OCFL intended for a small number of versions? And if so, is that intention at odds with autoversioning in Fedora?
- Status on organizing a Fedora documentation review
- Applying a digital preservation framework (e.g. NDSA Levels of Digital Preservation) to Fedora 6
- Organizing Sprint work
- Review of Goals for Sprint 1
- Kick Off Meeting: Monday September 16 at 10am Eastern
- Tentative plan for who will focus on what:
- Major Areas of Work
- Design/Development
- Interface Definition
- Persistence API
- ?
- OCFL Client Development
- OCFL Java API
- OCFL Java Client Implementation
- Transactions
- Persistence API
- Interface Definition
- Documentation
- Matrix of all the pages a la 5.x Documentation Updates
- Review of docs, flagging pages that will need to be changed, deleted, or added
- Testing
- Performance Testing
- Import/Export/Migration
- ?
- Design/Development
- Sprint Planning
- 6.0 Architecture Review
- Coming to consensus on:
- Transaction Sidecar Spec Update
- Status
- API Test Suite PRs
- Minimal 4 → 5 migration needs testing and code review:
- API Test Suite PRs
- Your topic here...
Tickets
In Review
Please squash a bug!
Tickets resolved this week:
Tickets created this week:
Notes
Part I
- Sprint kickoff meeting is set for Monday, September 16 at 10:00 ET.
- Documentation review: Call for people who are interested in participating. Have heard back from 4 people so far. Next step is to connect with them and organize the sprint. Anyone who is interested can connect with David Wilcox.
- This first pass is just to review the documentation with an eye towards the usability, correctness and understandability of the docs. A second pass to align the software with Fedora 6 can occur later.
- Could we align Fedora with one or more preservation standards? These standards take a whole-system view of preservation, but the software usually needs to do certain things to support them. Is the technical team considering these standards?
- OCFL filesystem bloat:
- It was known that with a largish number of versions and a largish number of OCFL objects, the inventory.json can become quite large, because it holds a mapping between logical files and physical file paths, and this mapping must be recorded for each version.
- Writing/parsing is not a huge issue and is mitigated by caching of the parsed inventory.
- The issue is the huge amount of filesystem storage required by the inventory.json. There is a SHOULD in the spec which suggests that you store a copy of the inventory.json in each version directory, so you end up storing several copies of these larger inventory files.
- If Fedora creates an OCFL version for each change, this could cause quite large file storage requirements.
- Things we could do to reduce/mitigate this file storage issue:
- Not use SHA-512 (though recommended) and use SHA-256 instead for smaller hashes.
- Don't store additional inventory.json files in each version. This seems to be more of a soft requirement, intended for forensics in the case of a filesystem error.
- Are there any concrete actions we need to take or blockers for the upcoming sprint? None identified.
- Should bring these issues to the OCFL community (via Slack or meeting) to ensure they are aware of our concerns. Ben Cail will take the previous issue to the OCFL community for discussion amongst that group.
- Jared Whiklo asks whether, if the OCFL library is not ready at the start, we should be working towards a simple DB + filesystem backend to get the core Fedora ITs working and spend time on those things.
- Getting the PersistentStorage API fleshed out might help with some of the OCFL vs simple filesystem backend questions.
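To make the inventory.json storage concern from the notes above more concrete, here is a rough sketch that builds a simplified OCFL-style inventory and measures its serialized size. It is not taken from any OCFL client; the function names, the object id, the file names, the timestamp, and the "one file changes per version" pattern are all assumptions made for illustration, and the inventory omits optional fields.

```python
import hashlib
import json


def build_inventory(num_files, num_versions, algorithm="sha512"):
    """Build a simplified OCFL-style inventory for a hypothetical object.

    Assumes num_files >= 1. v1 adds every file; each later version
    rewrites exactly one file (an assumed autoversioning pattern).
    """
    def digest(text):
        return hashlib.new(algorithm, text.encode("utf-8")).hexdigest()

    manifest = {}   # digest -> content paths on disk
    versions = {}   # version name -> full logical state of that version
    state = {}      # logical path -> current digest

    for v in range(1, num_versions + 1):
        name = f"v{v}"
        if v == 1:
            changed = [f"file-{i}.txt" for i in range(num_files)]
        else:
            changed = [f"file-{(v - 2) % num_files}.txt"]
        for logical_path in changed:
            d = digest(f"{logical_path} as of {name}")  # fake file content
            manifest[d] = [f"{name}/content/{logical_path}"]
            state[logical_path] = d
        # Every version lists the whole logical state, not just the change,
        # which is why the inventory grows with each new version.
        version_state = {}
        for logical_path, d in state.items():
            version_state.setdefault(d, []).append(logical_path)
        versions[name] = {
            "created": "2019-01-01T00:00:00Z",  # placeholder timestamp
            "state": version_state,
        }

    return {
        "id": "info:fedora/example",  # hypothetical object id
        "type": "https://ocfl.io/1.0/spec/#inventory",
        "digestAlgorithm": algorithm,
        "head": f"v{num_versions}",
        "manifest": manifest,
        "versions": versions,
    }


def inventory_bytes(num_files, num_versions, algorithm):
    """Size in bytes of the serialized inventory."""
    inventory = build_inventory(num_files, num_versions, algorithm)
    return len(json.dumps(inventory).encode("utf-8"))


def total_with_per_version_copies(num_files, num_versions, algorithm):
    """Bytes used if each version directory keeps the inventory as of that version."""
    return sum(
        inventory_bytes(num_files, v, algorithm)
        for v in range(1, num_versions + 1)
    )
```

Comparing `inventory_bytes(n, v, "sha256")` with `inventory_bytes(n, v, "sha512")`, and either of those with `total_with_per_version_copies(n, v, ...)`, gives a feel for how much each of the two mitigations discussed could save under these assumptions. Real numbers depend on path lengths and on how many files actually change per version.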
Actions
- Aaron Birkland to explore the notion of an OCFL client with a database as the authoritative metadata source + asynchronous writing of the inventory.json file
- Peter Eichman and maybe Ben Pennell to make recommendations re transaction sidecar specification.
- Andrew Woods will look into the Java 11 transition
- David Wilcox will review the NDSA matrix and pull out the concrete technical requirements that could be considered during the Fedora 6 development.
- Jared Whiklo will try to do some work on the PersistentStorage Interface.
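As a starting point for the first action item, the idea of an authoritative metadata store with asynchronous inventory.json writing might look roughly like the sketch below. This is only an illustration of the general pattern, not a proposed design: the class `AsyncInventoryWriter` and its methods are hypothetical, an in-memory dict stands in for the database, and object ids are assumed to be usable directly as directory names.

```python
import json
import queue
import threading
from pathlib import Path


class AsyncInventoryWriter:
    """Hypothetical sketch: the store is authoritative, inventory.json is written later."""

    def __init__(self, object_root):
        self.object_root = Path(object_root)
        self.store = {}                    # stand-in for a database table
        self._lock = threading.Lock()
        self._pending = queue.Queue()      # object ids awaiting a file write
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def commit(self, object_id, inventory):
        """Record the inventory in the store and return without touching disk."""
        with self._lock:
            self.store[object_id] = inventory
        self._pending.put(object_id)

    def _drain(self):
        """Background loop: serialize the latest stored inventory to disk."""
        while True:
            object_id = self._pending.get()
            try:
                with self._lock:
                    snapshot = json.dumps(self.store[object_id], indent=2)
                target = self.object_root / object_id / "inventory.json"
                target.parent.mkdir(parents=True, exist_ok=True)
                # Write to a temp file first, then rename over the target,
                # so readers never see a half-written inventory.
                tmp = target.with_suffix(".json.tmp")
                tmp.write_text(snapshot, encoding="utf-8")
                tmp.replace(target)
            finally:
                self._pending.task_done()

    def flush(self):
        """Block until every queued inventory has been written."""
        self._pending.join()
```

A caller would invoke `commit()` on each change and `flush()` only when the on-disk inventory has to be current, for example before a fixity check. The open questions (crash recovery, and what happens when the store and the filesystem disagree) are exactly what the action item would need to explore.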