Time/Place
This meeting is a hybrid teleconference and IRC chat. Anyone is welcome to join; here's the info:
- Time: 11:00am Eastern Daylight Time US (UTC-4)
- U.S.A/Canada toll free: 866-740-1260, participant code: 2257295
- International toll free: http://www.readytalk.com/intl
- Use the above link and input 2257295 and the country you are calling from to get your country's toll-free dial-in number
- Once on the call, enter participant code 2257295
- IRC:
- Join the #duraspace-ff chat room via Freenode Web IRC (enter a unique nick)
- Or point your IRC client to #duraspace-ff on irc.freenode.net
Attendees
Agenda
- 4.0 Features, raise any comments/concerns now
- Development focus on reaching 4.0
- Single-node Transactions issue
- Cluster, how do we get over the hump?
- other...?
Minutes
- 4.0 Release features
- Andrew: walkthrough of feature groups: candidates for beta/4.0 release, then others (prioritized, unprioritized, parallel feature development in Hydra/Islandora)
- Timeline: 4.0 Beta by end of Q1 (April), 4.0 point release by end of Q2
- Short-term goal: communicate to community, stakeholders the concrete timeline and set of features for the 4.0 release
- Andrew: what do the developers think of the candidate feature list and timeline?
- Scott: are any features on the list in danger of slowing down the project or its release?
- Andrew: none of the features proposed cover 100% of the use cases associated with them, but they are all on track to deliver some of the requested functionality for the initial release. The only risky feature is clustering; we need to get it to work, and we need to get it performant (faster than single node, faster than Fedora 3, for CRUD operations)
- Stefano: timeline for releasing robust content modelling features?
- Andrew: Stefano is working on a branch for testing, developing complex CNDs and node types. Goals:
- Get a branch that builds
- Get others to test it out
- Stefano: goal is to have a working branch by end of Q1
- Scott: Also: documentation for how you create, manipulate CNDs and objects, once his branch is in working order
(Andrew asks Stefano to make sure he lets others know he's working on a branch when describing build problems on the tech list, to avoid confusion about the state of master)
- Stefano: Connector/sequencer features (Fedora/JMS connectors, Fedora/Modeshape filesystem connector)?
- Andrew: Implement the Fedora interface wrapper around the Modeshape filesystem connector interface: see Modeshape documentation, examples
- Frank, Osman: get clustering working
- Andrew: OK – we'll work on these listed features over the next month and a half.
- Single Node Transactions bug
- Kai, Mike Durbin, Adam working on it
- Background: overall Fedora 4 performance is slightly better than Fedora 3, but the session.save() method is costly in terms of performance. Goal of transactions: reduce the number of times we call session.save(), by bundling up several actions into a single transaction, then calling session.save() at the end of the transaction.
- Bug: when a principal is tied to a transaction, the HTTP session gets wiped out
- Scott: how are multiple sessions and transactions handled when there is a single shared principal (such as fedoraAdmin)?
- Andrew: same principal attached to the transactions
- Table this discussion, as none of those working on the problem are present
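As a rough illustration of the transaction goal described above (fewer session.save() calls), here is a minimal Python sketch. FakeSession and both ingest functions are hypothetical stand-ins, not Fedora or Modeshape APIs; the sketch only counts how many times save() runs under each approach.

```python
class FakeSession:
    """Hypothetical stand-in for a repository session; counts save() calls."""

    def __init__(self):
        self.pending = []     # changes made but not yet persisted
        self.saved = []       # changes persisted by save()
        self.save_calls = 0   # how many times the costly save() ran

    def add(self, item):
        self.pending.append(item)

    def save(self):
        # Persist everything pending in one go; this is the expensive step.
        self.saved.extend(self.pending)
        self.pending.clear()
        self.save_calls += 1


def ingest_one_by_one(session, items):
    """Save after every single action (one save() per item)."""
    for item in items:
        session.add(item)
        session.save()


def ingest_in_transaction(session, items):
    """Bundle all actions, then save once at the end of the 'transaction'."""
    for item in items:
        session.add(item)
    session.save()


items = [f"object-{i}" for i in range(5)]

unbatched = FakeSession()
ingest_one_by_one(unbatched, items)

batched = FakeSession()
ingest_in_transaction(batched, items)

print(unbatched.save_calls, batched.save_calls)  # prints: 5 1
```

Both sessions end up with the same saved content; only the number of save() calls differs.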
- Cluster work
- Frank, Scott, Greg working on clustering this sprint
- Andrew: goals are:
- Get clusters to work (no bugs, no problems)
- Get it performant
- Scott: current work: has nodes provisioned, managed by Puppet; working on deploying, managing Fedora cluster with a Puppet module
- Question: focus on getting something running first, then tooling?
- Andrew: balance between developing tools to make cluster deployment easy and fast, and just getting a cluster up and running; any tools you can develop to accomplish the second goal, great, but the priority is to iron out bugs and performance problems in clustered setups
- Scott: will focus on getting a Fedora cluster up and working, by end of Friday
- Frank: working on resolving two problems:
- Serialized processing: have to create parent objects, wait for them to replicate across nodes, then create their child objects (slow)
- Unsynchronized commits
- Frank: Goal is to get 10% speed improvement over Fedora 3 for ingests
- managing the cluster adds expensive overhead
- Denmark use case: ingesting large numbers of large binary objects (audio/video) – the higher the number of objects being ingested, the slower the cluster runs
- Andrew: can you create properties on the object at the same time you create the object? Frank agrees, will try that
- Scott: read tests? (argues that if reads are fast, then maybe slower ingests aren't so important)
- Frank: Hasn't done any thorough read tests yet. Agrees that most repositories will be mostly WORM-ish, but that if you can't get your stuff ingested in a timely manner, then repo admins will be frustrated – need to get ingests performant. Andrew nods head in agreement.
- Andrew: are you wrapping your ingests in transactions?
- Frank: not really: using the Java API directly to perform actions, within a single HTTP session
- Andrew: are you splitting up the ingests across the nodes into groups of ingests per node?
- Frank: using queues. Queries the cluster for info about nodes, creates a queue for each node, and feeds ingests into the queues.
- Andrew: using a load balancer?
- Frank: Set one up, then turned it off, to implement the queue-based processing directly against each node
- Andrew: Frank, Scott, Greg: work together on solving cluster issues (Scott nods head)
- Frank: Scott can help by getting his Fedora cluster minimally configured, and reproducing Frank's issues
Frank will push his scripts for configuring Fedora clusters up to GitHub, for Scott/Greg to adapt and use
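The queue-based ingest approach Frank describes above could be sketched roughly as follows. This is an illustration only, in Python: the node names, the helper functions, and the round-robin assignment are all assumptions; the minutes do not say how ingests are assigned to queues or how the cluster is queried for its nodes.

```python
from collections import deque
from itertools import cycle


def build_node_queues(nodes):
    """Create one empty work queue per cluster node (node list is assumed given)."""
    return {node: deque() for node in nodes}


def distribute(ingests, queues):
    """Feed ingests into the per-node queues, round-robin (an assumed policy)."""
    targets = cycle(queues)  # cycles over node names in insertion order
    for ingest, node in zip(ingests, targets):
        queues[node].append(ingest)
    return queues


# Hypothetical node names; a real script would get these from the cluster.
nodes = ["node-a", "node-b", "node-c"]
ingests = [f"object-{i}" for i in range(7)]

queues = distribute(ingests, build_node_queues(nodes))
for node, queue in queues.items():
    print(node, len(queue))
# prints:
# node-a 3
# node-b 2
# node-c 2
```

Each queue would then be drained directly against its own node, which matches working without a load balancer in front of the cluster.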
Meeting adjourned.
New Actions
Scott: Get a cluster similar to Frank's up and running by Friday, Feb. 14th
Frank: Put cluster config scripts up on GitHub for Scott, Greg to use
Stefano: continue work to get content modelling branch working, ready for testing by others