Regarding the performance test summary report: Nick Ruest was waiting on Colin Gross to get the R scripts together, but Colin has now generated the data. Analysis and a summary of the graphs are still needed. We should come up with some questions to contextualize the graphs, for example: How does performance change as resources increase? Then we could drill down into each kind of resource: size, type, etc. Long term, it would be helpful to create a template for the summary so that it can be generated automatically from the test results.
Questions / Advice for a State of Performance and Scale page:
Joshua Westgard would be interested in a wiki page on best practices for performance:
What are the different ways to set up a repository, and what are the performance characteristics and mitigation strategies for each type?
How can we protect against performance degradation as the repository grows?
Esmé Cowles seconded that thought.
Colin Gross is working on a mock file system that will enable testing of very large files. The file system path of an item is based on the SHA-1. The metadata cannot go into the mock system. Colin Gross's Python-based FUSE file system is not quite ready. He will let us know when it reaches 0.1.
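To illustrate what a SHA-1-based path can look like, here is a minimal sketch. It is not Colin's implementation: the function name sha1_path, the choice to hash the item's content, and the layout (three two-character directory levels followed by the full digest) are all assumptions made for illustration only.

```python
import hashlib
from pathlib import PurePosixPath


def sha1_path(content: bytes, levels: int = 3, width: int = 2) -> PurePosixPath:
    """Derive a relative storage path from the SHA-1 of some content.

    Hypothetical layout: the first `levels` chunks of `width` hex characters
    become nested directories, and the full digest is the file name.
    """
    digest = hashlib.sha1(content).hexdigest()
    # Split the leading hex characters into directory names, e.g. "da", "39", "a3".
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return PurePosixPath(*parts, digest)


if __name__ == "__main__":
    print(sha1_path(b"example item"))
```

Fanning the digest out into nested directories keeps any single directory from holding too many entries, which is one reason content-addressed layouts of this kind are common.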
Esmé Cowles has tested Danny's performance changes. He has not seen the performance improvements, but thinks this may have something to do with large numbers of hash-URIs stored on a node and possibly inefficiencies in Hydra updates. Developing...