...

...

  1. Danny Bernstein   
  2. Andrew Woods 
  3. David Wilcox 
  4. Peter Winckles
  5. Ben Cail (star)
  6. Aaron Birkland
  7. Ben Pennell
  8. Paul Cummins
  9. Bethany Seeger
  10. Jared Whiklo
  11. Yinlin Chen

Part 2:

  1. Danny Bernstein   
  2. Andrew Woods  (star)
  3. Peter Winckles
  4. Ben Cail 
  5. Aaron Birkland
  6. Ben Pennell
  7. Jared Whiklo

Agenda

  1. Announcements
    1. #sprints channel 
    2. #github channel?
    3. kubernetes support in fcrepo4-docker in Deployment Tooling
    4. #general channel community issue
  2. fcrepo-upgrade-utils
    1. Minimal 4 → 5 migration needs testing and code review:
      1. https://github.com/fcrepo4-exts/fcrepo-upgrade-utils/pull/17
  3. Update on Fedora 6 Pilots (NLM, Docuteam, UWM)
    1. migration-utils PRs
  4. Java 11:  When? 
  5. Sprint Planning
    1. 6.0 Architecture Review
    2. Transaction Sidecar Spec Update
    3. OCFL community feedback on OCFL client with Database as authoritative metadata source
    4. https://github.com/pwinckles/ocfl-java-parent 
    5. Problems requiring design
      1. Tombstone Support in 6.0.0 
      2. Caching/indexing strategy 
        1. What caches and indexes do we need (i.e., in which layers)?
          1. OCFL client 
          2. Persistence Implementation (OCFL) 
          3. Kernel Implementation
        2. Physical location of the cache (assuming we want to plan for horizontal scalability support)
          1. Cache per instance? 
            1. synchronizing changes across instances
          2. 1 Global cache/index?  
  6. Your topic here...

...

  1. In Review

  2. Please squash a bug!

  3. Tickets resolved this week:

  4. Tickets created this week:

Notes

  1. Announcements

...

  1. 5.1.0 release is out
    • Includes bugfix that allows import/export to work correctly
    • Testing of round-tripping is encouraged
  2. Michael Klein has put out a new Docker image

Fedora 6 Pilots

General updates

  1. Three pilot institutions
    • DocuTeam
    • National Library of Medicine
    • University of Wisconsin-Madison
  2. It would be great to have additional "partners"
    • Even if not as full "pilots"
  3. Collecting requirements from pilot partners
  4. Pilot effort will run for roughly the next year
  5. Pilot meetings will be scheduled as needed
  6. There is an open #fedora6-pilots Slack channel

OCFL Implementation

  1. Peter W has put something together recently
  2. Open questions around what the interface should look like
  3. Currently supports PUT/GET and most of the OCFL spec

F6 Architecture

  1. Important for us to establish a shared understanding of transactions, locking, etc., and of how the architectural layers should interact
  2. https://wiki.duraspace.org/display/FF/Fedora+2019+Architectural+Diagrams
    • HTTP Layer remains intact
    • Kernel Layer
      • Could potentially support swappable implementations
      • Session manages transactions/locking
    • Persistence Layer
      • Some features from prior ModeShape layer will move here
    • OCFL Layer
      • State may be maintained at different layers
  3. Transactions and Locking
    • Currently, txns are supported across resources
    • Scoping txns to a single OCFL object may make sense
    • If we want to scope txns across objects, that would have to be managed at the Fedora level... and the problem becomes more challenging
    • Proposal: scope txns to a single OCFL object
      • Primary use case: batch loader creating a complex resource
    • Do we want to consider architecting for cross-object txns?
      • Design the HTTP API to potentially support other scopes of txn, but implement for the single OCFL object case
      • Archive Groups may be a door for broader scope for txns
    • Would be nice if the txn-endpoint were advertised as a resource header... as opposed to a triple
    • Ideally not coupling txn-model to ldp-model
      • Bulk updates require cross-object txns
    • Do we need to maintain session state across clients?
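
The proposal to scope txns to a single OCFL object can be illustrated with a minimal sketch. Everything here is hypothetical and for discussion only: the class names, the in-memory `store`, and the idea that a txn binds to the first object it touches are assumptions, not part of any Fedora or OCFL client API.

```python
class TransactionScopeError(Exception):
    """Raised when a write targets a second OCFL object within one txn."""


class SingleObjectTransaction:
    """Hypothetical txn that is bound to the first OCFL object it writes to."""

    def __init__(self):
        self.object_id = None  # set by the first put()
        self.pending = {}      # logical path -> content, not yet committed

    def put(self, object_id, path, content):
        if self.object_id is None:
            self.object_id = object_id
        elif object_id != self.object_id:
            # Cross-object txns are out of scope under this proposal.
            raise TransactionScopeError(
                f"txn is scoped to {self.object_id!r}, got {object_id!r}"
            )
        self.pending[path] = content

    def commit(self, store):
        """Write all pending changes as one new version; return its number.

        `store` is an assumed stand-in for the persistence layer:
        a dict mapping object id -> list of version states.
        """
        if self.object_id is None:
            return 0  # nothing was written
        versions = store.setdefault(self.object_id, [])
        state = dict(versions[-1]) if versions else {}
        state.update(self.pending)
        versions.append(state)
        self.pending = {}
        return len(versions)
```

A batch loader creating a complex resource would call `put` repeatedly for the same object and then `commit` once, producing a single new version. Supporting broader scopes later would mean relaxing the check in `put`, which is where the harder cross-object coordination would have to live.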

...The conversation will be continuing at 1pm ET

Horizontal scalability: 

  • Aaron Birkland: sticky sessions are probably as far as we want to go; distributed transaction management is going to be very complicated.

migration-utils 

Mike Durbin created migration-utils to facilitate migration from Fedora 3 to Fedora 4. Andrew has extended that project to migrate from Fedora 3 to OCFL.

Expectations for migration-utils:

1) Akubra

2) Legacy FS

3) Archival Export of Fedora 3

  • Mike's code iterates the tree and migrates the resources.
  • Andrew Woods has implemented an OCFL writer using Aaron Birkland's OCFL Go client.
  • Results were promising using Peter Winckles's data from UW-Madison.
  • Question: should we consider using Peter Winckles's Java client instead of, or in addition to, the Go client?

It would be good to fix the .travis.yml file to install Go and the OCFL client.

Meeting part II

Transactions

Action on transactions?  Seems like we don't want to preclude arbitrary resources in transactions

Also may need to look at the transactions API.  The existing transaction API isn't part of the Fedora spec.  There is a draft of a new API, but it has no standing

Action: Peter Eichman and Ben Pennell to gather opinions on transactions and APIs

Question:  How much do we have to plan about distributed/multiple Fedora instances?

To what extent do we have to plan for this pathway?

Ben Cail:  Not interested in multiple Fedora instances.  Don't anticipate needing to scale, don't want to worry about concurrency.  Would look at it if it came for free

Maybe that's a question that should go out to the community?

Don't want to get into transaction management.  

State/locking

How much can we guarantee?

If we do a transaction involving bits and pieces of multiple OCFL objects, is it even possible to roll that back without direct support for these sorts of transactions in the OCFL client?  Probably not.

Strawman:  What if an OCFL client supported such transactions like this?  How could that be implemented?  Is it even desirable?

  • Write file content to staging places as usual
    • If a failure/rollback happens here, we just have un-referenced staged content we can garbage collect later (GC)
  • Maintain a database table that authoritatively holds the OCFL metadata (that which gets written to inventory files)
  • Upon "commit", copy files to the right location in OCFL
    • If this fails, then these copied files can be GC'd later.  They aren't referenced by any inventory files, so as far as other OCFL clients are concerned, the incompletely-copied files are invisible.  That being said, the OCFL objects they've been copied to are technically invalid until these files are removed.
  • Commit (or roll back) that db table.  Report transaction success if that succeeds
  • Asynchronously, start writing the inventory files to OCFL that reference the copied content.
    • If there's a failure here, that's OK.  The authoritative DB still has the information in it to re-try until all the inventory files on the FS agree with the db.
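
The strawman steps above can be sketched as a small in-memory model. This is illustrative only: `StrawmanOcflClient` and all of its methods are hypothetical names, dicts stand in for the staging area, the OCFL storage root, the database table, and the on-disk inventory files, and a real implementation would rely on an actual database transaction for the atomic commit step.

```python
class StrawmanOcflClient:
    """In-memory model of the strawman: DB is authoritative, inventories lag."""

    def __init__(self):
        self.staging = {}      # (txn_id, object_id, path) -> bytes
        self.content = {}      # (object_id, path) -> bytes copied into OCFL
        self.db = {}           # object_id -> sorted list of paths (authoritative)
        self.inventories = {}  # object_id -> list of paths (inventory on disk)

    def stage(self, txn_id, object_id, path, data):
        # Step 1: write file content to staging as usual.
        self.staging[(txn_id, object_id, path)] = data

    def rollback(self, txn_id):
        # Staged content is referenced by nothing, so it can simply be dropped
        # (or garbage collected later).
        for key in [k for k in self.staging if k[0] == txn_id]:
            del self.staging[key]

    def commit(self, txn_id):
        staged = {k: v for k, v in self.staging.items() if k[0] == txn_id}
        # Step 3: copy files into place. Until an inventory references them,
        # other OCFL clients do not see them.
        for (_, object_id, path), data in staged.items():
            self.content[(object_id, path)] = data
        # Step 4: update the authoritative table in one step. Building a new
        # dict and swapping it in stands in for a real DB transaction.
        new_db = {oid: list(paths) for oid, paths in self.db.items()}
        for (_, object_id, path) in staged:
            paths = set(new_db.get(object_id, []))
            paths.add(path)
            new_db[object_id] = sorted(paths)
        self.db = new_db
        self.rollback(txn_id)  # staged copies are no longer needed

    def sync_inventories(self):
        # Step 5: run asynchronously and retry until the inventory files
        # agree with the DB. Safe to call repeatedly.
        for object_id, paths in self.db.items():
            if self.inventories.get(object_id) != paths:
                self.inventories[object_id] = list(paths)
```

In this model a transaction may span several OCFL objects, and success is reported as soon as `commit` returns, even though `inventories` is stale until `sync_inventories` runs. That window, in which the objects on disk do not yet match the database, is the trade-off the strawman asks the OCFL community to weigh.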

...


    1. #sprints channel open & participants invited
    2. #github - Slack integration with github for PRs, ... - Jared will create the new #github channel.
      1. What should we do with GitHub issues?
    3. fcrepo4-docker - has kubernetes support now: Deployment Tooling
    4. #general channel - issue with simple file persistence error. Size is wrong.
      1. point Islandora to mysql or postgres for default configuration.
      2. look at updating documentation.
        1. The hyrax setup instructions (https://github.com/samvera/hyrax/wiki/Production-Installation-Overview)
          links to Deploying Fedora 4 Complete Guide
          which says to use https://github.com/fcrepo4/fcrepo4/blob/4.7-maintenance/fcrepo-configs/src/main/resources/config/file-simple/repository.json
          which uses the simple file system persistence approach
  1. fcrepo-upgrade-utils
    1. should work for basic setups. People should try it out. Doesn't work with ACLs yet, but that's next.
  2. Fedora 6 pilots
    1. OCFL client (https://github.com/pwinckles/ocfl-java-parent) - implements most of the spec.
      1. API - put/get at object level, read/update for individual files. Automatically commits (as a single version) at the end of an object update.
      2. Doesn't handle multiple clients writing at the same time (& Go client doesn't currently support that, either).
        1. OCFL spec talks about deposit directory for staging content, but defines content at rest, not content in motion.
        2. Aaron suggested OCFL RFCs repo - implementation patterns, definitions, clarifications, ...
        3. What functionality should an OCFL client have? It should be written up.
  3. Java 11
    1. Peter's code uses 11. Should make a decision at some point and switch everything.
    2. Ideal scenario: all Fedora ecosystem tooling (fcrepo4, fcrepo-camel-toolbox, fcrepo-import-export, fcrepo-upgrade-utils) will compile Java 11 source.
      1. bytecode compatibility ?  Should that go to 11 as well? 
    3. Draft JVM policy re-wording https://docs.google.com/document/d/1ulXLOF2dL-vEUKhUFjkL8UG97wjXJ-9dbmytKVwzDMM/edit
    4. Current Oracle LTS is Java 11

    5. We should likely move to Java 11 this calendar year

    6. How hard is it for the community to move from Java 8 to Java 11?

      1. Andrew to email to community list

      2. What is our plan?

        1. LTS releases: 4.7

        2. Fedora 6 to use openjdk-11

        3. needs Java 11 to run

  4. Sprint planning
    1. OCFL implementations
      • OCFL Community meeting next week, Wednesday
    2. Conversation to continue on Slack

      • Request for Aaron B to document thoughts
        • Separation of concerns: what does Fedora need to know, what is internal to the OCFL client
      • Danny to update diagram (include 'deposit' space and 'versions')
    3. Transaction sidecar spec
    4. OCFL client with database as metadata store

Actions

  •  Danny Bernstein  will reach out to Greg about DRASTIC test results
  •  Aaron Birkland  to work with Andrew Woods to get the Go client working on travis.
  •  Aaron Birkland  to explore the notion of an OCFL client with a database as the authoritative metadata source + asynchronous writing of the inventory.json file
  •  Peter Eichman  and maybe Ben Pennell to make recommendations re transaction sidecar specification.

...