Time/Place

Attendees 

Agenda

  1. Status of establishing performance benchmarks
    1. This is needed to pinpoint areas for improvement
  2. MySQL and/or PostgreSQL testing?
  3. ...

Minutes

Status of establishing performance benchmarks

  • Scott Prater has been working on this, but he is not on the call
    • Hit an issue (producing a stack trace) but was able to proceed
      • Issue may be due to LevelDB - need to retest with a different database
  • Running tests would reveal areas in need of improvement
    • Also produces benchmarks for comparisons down the road
  • Is anyone planning on executing more of these tests?
    • Difficult to get cycles to work on these tests
  • What is the state of the art right now?
    • Current production deployments have relatively small numbers of objects
    • DCE may know of some large implementations
    • We have tested with millions of objects but not with real data
    • NLM tested the migration tool and discovered memory issues at around 150,000 objects
      • F4 needs to support millions of objects in production
  • Do we have tickets for tests?
    • Not yet, but we could create tickets from the test plan
    • Nick will create tickets based on discussion on the call
  • The tests are meant to exercise real-world scenarios that would help us troubleshoot issues
    • If an institution ran into issues with ingest/migration, we could compare against our test scenarios to see where the differences lie
  • Ideally members of this group will participate in mailing list discussions and help others in the community with performance issues
  • Bill: Working on Hydra-in-a-box
    • Need to understand deployment scenarios under which Fedora will perform well (at a reasonable AWS cost)
    • Need to modify tests to try different databases and back-end stores
      • Running the same test with different configurations allows direct comparison
    • Automated tests would be great to have for trying different scenarios (see the first sketch after these minutes)
      • Bill can spend some time working on this
  • Completion scenarios
    • A test ends when a certain number of resources have been created, when performance requirements are no longer met, etc.
      • Long-running tests could be expensive from an AWS standpoint
      • Would it be possible to get useful information from a test without running it for days?
      • So far we have focused on scalability rather than other performance characteristics, e.g. ones that could be tested in minutes rather than days (the second sketch after these minutes illustrates one possible completion check)
  • NLM will continue testing migration-utils once the 4.5.1 release is available
  • Esme will run some JMeter tests
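
As a starting point for the automated-comparison idea above, here is a minimal sketch of a driver that runs the same JMeter test plan against several back-end configurations and keeps one results file per run. The JMeter flags (-n, -t, -l, -J) are standard non-GUI options; the test plan name, the fedora.baseurl and backend property names, and the config labels are hypothetical placeholders, not agreed-upon values.

  # Sketch: run one JMeter test plan against several hypothetical configurations.
  import subprocess
  from pathlib import Path

  TEST_PLAN = "fedora-ingest.jmx"                # hypothetical JMeter test plan
  CONFIGS = ["leveldb", "mysql", "postgresql"]   # back-end stores under comparison

  for config in CONFIGS:
      results = Path(f"results-{config}.jtl")
      # -n: non-GUI mode, -t: test plan, -l: results log, -J: user-defined property
      subprocess.run(
          [
              "jmeter", "-n",
              "-t", TEST_PLAN,
              "-l", str(results),
              "-Jfedora.baseurl=http://localhost:8080/rest",  # hypothetical property
              f"-Jbackend={config}",                          # hypothetical property
          ],
          check=True,
      )
      print(f"Finished run for {config}; results in {results}")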
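
The second sketch illustrates one possible completion check for the scenarios discussed above: stop either when a target number of resources has been created or when recent response times exceed a threshold, so a run can end in minutes rather than days. The endpoint URL, the target count, the latency limit, and the window size are illustrative assumptions.

  # Sketch: create resources until a count or latency threshold is reached.
  import time
  import requests

  FEDORA_ROOT = "http://localhost:8080/rest"   # assumed Fedora 4 REST endpoint
  TARGET_RESOURCES = 100_000                   # stop when this many resources exist
  MAX_AVG_SECONDS = 2.0                        # stop if recent requests get this slow

  recent = []
  for created in range(1, TARGET_RESOURCES + 1):
      start = time.time()
      response = requests.post(FEDORA_ROOT)    # POST to a container creates a child resource
      response.raise_for_status()
      recent.append(time.time() - start)
      recent = recent[-100:]                   # sliding window of the last 100 samples

      avg = sum(recent) / len(recent)
      if avg > MAX_AVG_SECONDS:
          print(f"Stopping early after {created} resources: avg response {avg:.2f}s")
          break
  else:
      print(f"Created {TARGET_RESOURCES} resources without exceeding the latency threshold")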