...
- Danny Bernstein
- David Wilcox
- Jared Whiklo
- Peter Winckles
- Daniel Lamb
- Ben Pennell
- Thomas Bernhart
- Calvin Xu
- Michael Ritter
Agenda
- Announcements
- 2021-04 Fedora 6 10 Camel Toolbox Sprint
- North American User Group Debrief
- Pilots / Testing
- Performance issues needing attention
- Pre release short list of bug fixes and improvements
- Short-lived transactions related issues
- FCREPO-3697, FCREPO-3695, FCREPO-3696
- Migration Utils slow-down
- Do we know the problem is neither memory nor IO bound?
- Are there JVM tunings that we haven't tried?
- Is a heap dump likely to help us?
- Status of instrumenting migration-utils with micrometer:
- FCREPO-3692
- Short-lived transactions related issues
- Pilots / Testing
- Pre release short list of bug fixes and improvements
- FCREPO-3620, FCREPO-3561, FCREPO-3638, FCREPO-3672, FCREPO-3690, FCREPO-3691
- others
- Marmotta is retired: fcrepo-camel-toolbox/fcrepo-ldpath depends on it. What's our plan?
- Volunteers to be the maintainer on Modeshape CI https://github.com/fcrepo-exts/modeshape/pull/3
- fcrepo-java-client question migration-utils
- Any other new tickets/issues to be considered for the release
- migration-utils still has performance issues after tuning the Java heap size and turning off checksum validation
Tickets
- In Review: Jira filter 13100
- Please squash a bug!: Jira filter 13122
- Tickets resolved this week: Jira filter 13111
- Tickets created this week: Jira filter 13029
Notes
- Announcements
- North American User Group
- Pilots / Testing
- Performance testing by Andrey, Peter and Jared
- Andrey and Jared were seeing fairly bad performance
- Andrey has performed the same tests on Fedora 4 and 5, and is seeing significantly worse performance on 6
- Will check to see what the specs of the machine running the tests were
- Peter and Danny have been unable to reproduce the poor performance
- Including search index commits in same DB commit as the other db commits seems to maintain normal performance
- Search queries are fast except when getting back RDF types. Danny is working on resolving this in a separate PR.
- Comparing POSTs and PUTs - Jared reported that PUTs were a little bit slower
- Single object vs Multi-object transactions
- At the moment all transactions assume they are multi-object
- It may reduce the database latency significantly if single object transaction commits were implemented, but at the cost of considerable complexity
- Open question: what percentage of the latency is due to the database? The answer determines how much of a concern this is.
- Calvin testing migration-util with larger heap, checksum turned off
- Initially faster, but slowed back down to 25k per day after about 1.5 million objects
- Previous runs used about 30% CPU, but now it is only using about 7%
- It seems to still be processing image files, which are not particularly large
- UVA originally ran their migration before alpha2, and saw significantly slower performance in their second migration after alpha2
- Ben C created a ticket to look into slow resumption of migration-util, which iterates back through all the Fedora 3 objects.
- Pre release short list of bug fixes and improvements
- Any new tickets/issues to be considered for the release
- Performance Issues
- Short-lived transactions
- More of an impact on reads than writes, e.g. getting the root resource of your repository with 1m+ items
- Ordering adds more latency
- Ongoing work for short-lived transactions
- Long-lived transactions perform a full join
- Short transactions perform a simpler query
- Could be expanded to writes
- Search Index
- Current work is for a synchronous update
- Question of how much of a performance impact there is when updating the index
- Possibility of adding config to update async through event bus from initial discussion
- Long running transactions in bad states
- If changes are already committed to the OCFL layer, they can't be rolled back
- Might be able to attempt to roll back, then mark the transaction as failed
- Can also prevent the transaction from being committed at all
- Migration utils
- For pid list: need to stop iterating the F3 repository once we've processed all items
- metrics: Try and capture bytes/sec
- fcrepo-java-client
- Need to follow through on the Java 11 update and the update of the integration tests.
- Thomas will take a look at the existing two tickets for updating it
- Modeshape
- Danny will look at the GitHub Actions PR
- Will update Seth's PR to run GitHub Actions to make sure tests pass
- We will review the unreleased updates to modeshape since the last release was cut, to determine if we want to sync up our fork
- ocfl-java database
- Peter will be checking on its usage of Postgres for inventory caching and distributed object locking, as there are additional costs incurred by Postgres's use of a separate process per connection that he had not realized
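The pid-list item under Migration utils (stop iterating the F3 repository once all listed items have been processed) could be sketched roughly as below. This is only an illustration of the early-exit idea, not the migration-utils code: the class `PidListScan`, the method `processRequested`, and the sample PIDs are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: none of these names come from migration-utils.
public class PidListScan {

    /**
     * Walks the source PIDs in order and returns the ones that were processed.
     * The loop ends as soon as every requested PID has been seen, so the
     * remainder of the source iterator is never visited.
     */
    public static List<String> processRequested(Iterator<String> sourcePids,
                                                Set<String> requested) {
        Set<String> remaining = new HashSet<>(requested);
        List<String> processed = new ArrayList<>();
        while (!remaining.isEmpty() && sourcePids.hasNext()) {
            String pid = sourcePids.next();
            if (remaining.remove(pid)) {
                processed.add(pid); // stand-in for the real migration step
            }
        }
        return processed;
    }

    public static void main(String[] args) {
        // Made-up PIDs standing in for a Fedora 3 object listing.
        Iterator<String> source =
                List.of("demo:1", "demo:2", "demo:3", "demo:4", "demo:5").iterator();
        List<String> done = processRequested(source, Set.of("demo:1", "demo:3"));
        System.out.println("processed: " + done);
        System.out.println("source has unvisited objects: " + source.hasNext());
    }
}
```

The point of the design is that the exit condition is checked on every step, so a short pid list costs a scan only up to the last listed item rather than a pass over the whole repository.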
Actions
- Look through migration util commits since December to see if there are any potential causes of performance drop off.
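For the "capture bytes/sec" metric mentioned in the notes, a minimal sketch using only the JDK might look like the following. It is an assumption about how such a counter could be shaped, not the planned Micrometer instrumentation; `ThroughputMeter` and its methods are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: a plain-JDK byte counter, not the Micrometer-based metric.
public class ThroughputMeter {
    private final AtomicLong bytes = new AtomicLong();
    private final long startNanos;

    /** startNanos is passed in (e.g. System.nanoTime()) so the meter is easy to test. */
    public ThroughputMeter(long startNanos) {
        this.startNanos = startNanos;
    }

    /** Adds the size of one migrated datastream or file to the running total. */
    public void record(long byteCount) {
        bytes.addAndGet(byteCount);
    }

    /** Average bytes per second between the start time and nowNanos. */
    public double bytesPerSecond(long nowNanos) {
        long elapsed = nowNanos - startNanos;
        if (elapsed <= 0) {
            return 0.0; // avoid dividing by zero before any time has passed
        }
        return bytes.get() * 1_000_000_000.0 / elapsed;
    }

    public static void main(String[] args) {
        ThroughputMeter meter = new ThroughputMeter(System.nanoTime());
        meter.record(4096);
        meter.record(8192);
        System.out.println("bytes/sec so far: " + meter.bytesPerSecond(System.nanoTime()));
    }
}
```

Logging this value periodically during a long run would show whether throughput in bytes falls along with the object rate, which helps separate "objects got bigger" from "processing got slower".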