Attendees

Part 1:

  1. Danny Bernstein   
  2. Andrew Woods (star)
  3. David Wilcox
  4. Peter Winckles
  5. Ben Cail
  6. Aaron Birkland
  7. Ben Pennell
  8. Paul Cummins
  9. Peter Eichman

Part 2

  1. Danny Bernstein   (star)
  2. Andrew Woods 
  3. Ben Cail
  4. Aaron Birkland (star)
  5. Ben Pennell
  6. Peter Eichman

Agenda

  1. Announcements
    1. Docker options: Deployment Tooling
    2. Fedora 5.1.0 Release
  2. Update on Fedora 6 Pilots (NLM, Docuteam, UWM)
    1. migration-utils work
      1. Configure "fedora4Client" to new implementation of the Fedora4Client.java interface
      2. Writing to OCFL instead of Fedora4/5/6 API
  3. Sprint Planning
    1. 6.0 Architecture Review
    2. Problems requiring design
      1. Transaction and Lock Management
        1. Transactions scope:  what do we need to support?
          1. Individual Resources
          2. Groups of resources
          3. Containment hierarchies 
        2. A Fedora transaction may span multiple HTTP requests. Assuming that we do not want to commit anything to OCFL until the Fedora transaction is committed, how do we maintain the state of open OCFL sessions
          1. across requests?
          2. across instance reboots?
          3. across multiple horizontally scaled instances?
          4. How do we also ensure that two requests using the same transaction ID against the same OCFL object do not stomp on each other?
        3. Globally accessible state (for horizontal scalability)
      2. Caching/indexing strategy 
        1. What caches and indexes do we need (i.e., in what layer(s))?
          1. OCFL client 
          2. Persistence Implementation (OCFL) 
          3. Kernel Implementation
        2. Physical location of the cache (assuming we want to plan for horizontal scalability support)
          1. Cache per instance? 
            1. Synchronizing changes across instances
          2. One global cache/index?
  4. OCFL implementation updates
    1. migration-utils pull-request
    2. Peter Winckles work-in-progress
      1. https://github.com/pwinckles/ocfl-java-parent
      2. https://github.com/pwinckles/ocfl-java-api
      3. https://github.com/pwinckles/ocfl-java-core
      4. https://github.com/pwinckles/ocfl-java-filesystem
  5. Your topic here...

...The conversation will be continuing at 1pm ET


Horizontal scalability: 

  • Aaron Birkland: sticky sessions are probably as far as we want to go; distributed transaction management is going to be very complicated.

migration-utils 

Mike Durbin created migration-utils to facilitate Fedora 3 to Fedora 4 migration. Andrew has extended that project to migrate from Fedora 3 to OCFL.

  • Mike's code iterates over the tree and migrates the resources.
  • Andrew Woods has implemented an OCFL writer using Aaron Birkland's OCFL Go client.
  • Results were promising using Peter Winckles's data from UW-Madison.
  • Question: should we consider using Peter Winckles's Java client instead of, or in addition to, the Go client?

It would be good to fix the travis.yml file to install Go and the ocfl client.


Meeting part II


Transactions

Action on transactions?  Seems like we don't want to preclude arbitrary resources in transactions

Also may need to look at the transactions API.  The existing transaction API isn't part of the Fedora spec.  There is a draft of a new API, but it has no standing

Action: Peter Eichman and Ben Pennell to gather opinions on transactions and APIs

Question:  How much do we have to plan about distributed/multiple Fedora instances?

To what extent do we have to plan for this pathway?

Ben Cail:  Not interested in multiple Fedora instances.  Don't anticipate needing to scale, don't want to worry about concurrency.  Would look at it if it came for free

Maybe that's a question that should go out to the community?

Don't want to get into transaction management.  

State/locking

How much can we guarantee?

If we do a transaction involving bits and pieces of multiple OCFL objects, is it even possible to roll that back without direct support for these sorts of transactions in the OCFL client?  Probably not.

Strawman:  What if an OCFL client supported such transactions like this?  How could that be implemented?  Is it even desirable?

  • Write file content to staging places as usual
    • If a failure/rollback happens here, we just have un-referenced staged content we can garbage collect later (GC)
  • Maintain a database table that authoritatively holds the OCFL metadata (that which gets written to inventory files)
  • Upon "commit", copy files to the right location in OCFL
    • If this fails, then these copied files can be GC'd later.  They aren't referenced by any inventory files, so as far as other OCFL clients are concerned, the incompletely-copied files are invisible.  That being said, the OCFL objects they've been copied to are technically invalid until these files are removed.
  • Commit (or roll back) that db table.  Report transaction success if that succeeds
  • Asynchronously, start writing the inventory files to OCFL that reference the copied content.
    • If there's a failure here, that's OK.  The authoritative DB still has the information in it to re-try until all the inventory files on the FS agree with the db.
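
The steps above can be sketched in a few lines of Python. This is only a toy model under loud assumptions: the class `StagedOcflStore`, its method names, the `inventory` table, and the flat directory layout are all hypothetical, and the `inventory.json` it writes is a stand-in, not a valid OCFL inventory (no versions, digests, or manifest). It illustrates the ordering only: stage, copy, commit the database, then write inventories asynchronously and retry as needed.

```python
import json
import shutil
import sqlite3
import tempfile
from pathlib import Path


class StagedOcflStore:
    """Toy model of the strawman; NOT a real OCFL client or layout."""

    def __init__(self, root: Path):
        self.staging = root / "staging"
        self.objects = root / "ocfl"
        self.staging.mkdir(parents=True)
        self.objects.mkdir(parents=True)
        # Hypothetical authoritative metadata table.
        self.db = sqlite3.connect(root / "inventory.db")
        self.db.execute(
            "CREATE TABLE inventory (object_id TEXT, logical_path TEXT, "
            "synced INTEGER DEFAULT 0, PRIMARY KEY (object_id, logical_path))"
        )
        self.db.commit()

    def stage(self, tx_id: str, object_id: str, logical_path: str, data: bytes) -> None:
        # Write content to a per-transaction staging area.
        target = self.staging / tx_id / object_id / logical_path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(data)

    def rollback(self, tx_id: str) -> None:
        # Nothing references staged content, so discarding it is safe.
        shutil.rmtree(self.staging / tx_id, ignore_errors=True)

    def commit(self, tx_id: str) -> None:
        tx_dir = self.staging / tx_id
        rows = []
        # Copy staged files into place. No inventory references them yet,
        # so a failure here only leaves garbage to collect later.
        for src in sorted(p for p in tx_dir.rglob("*") if p.is_file()):
            object_id, *rest = src.relative_to(tx_dir).parts
            logical_path = "/".join(rest)
            dest = self.objects / object_id / logical_path
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.copyfile(src, dest)
            rows.append((object_id, logical_path))
        # One database transaction covers every object touched;
        # the Fedora transaction succeeds if and only if this commits.
        with self.db:
            self.db.executemany(
                "INSERT OR REPLACE INTO inventory (object_id, logical_path, synced) "
                "VALUES (?, ?, 0)",
                rows,
            )
        shutil.rmtree(tx_dir, ignore_errors=True)

    def sync_inventories(self) -> None:
        # Rewrite inventory files from the database. Safe to retry until
        # the files on disk agree with the database.
        pending = [r[0] for r in self.db.execute(
            "SELECT DISTINCT object_id FROM inventory WHERE synced = 0")]
        for object_id in pending:
            paths = [r[0] for r in self.db.execute(
                "SELECT logical_path FROM inventory WHERE object_id = ? "
                "ORDER BY logical_path", (object_id,))]
            inventory = {"id": object_id, "files": paths}
            (self.objects / object_id / "inventory.json").write_text(json.dumps(inventory))
            with self.db:
                self.db.execute(
                    "UPDATE inventory SET synced = 1 WHERE object_id = ?", (object_id,))


if __name__ == "__main__":
    store = StagedOcflStore(Path(tempfile.mkdtemp()))
    store.stage("tx1", "obj-a", "file1.txt", b"hello")
    store.stage("tx1", "obj-b", "dir/file2.txt", b"world")
    store.commit("tx1")          # spans two objects, one DB commit
    store.sync_inventories()     # would run in the background in practice
```

Between `commit` and `sync_inventories` the copied files exist without an inventory that references them, which is exactly the window in which the objects are technically invalid to other OCFL clients. Whether that window is acceptable is the open question for the OCFL community.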

TODO:  Aaron Birkland to flesh this out.  Maybe run it by the OCFL community.  Is an OCFL client that behaves that way "proper"?

Actions

  •  Danny Bernstein  will reach out to Greg about DRASTIC test results
  •  Aaron Birkland  to work with Andrew Woods to get the Go client working on travis.
  •  Aaron Birkland  to explore the notion of an OCFL client with a database as the authoritative metadata source plus asynchronous writing of the inventory.json file
  •  Peter Eichman  and maybe Ben Pennell to make recommendations re the transaction sidecar specification.