Below are the results of performance tests of Fedora-based applications using real-world data.
This test ingested a large book of 1,000 TIFF images of 100 MB each, once with the Fedora 4.5.1 release (based on Modeshape 4) and once with the experimental Modeshape 5 branch. In both cases, Fedora was configured to use the PostgreSQL database object store. Durations are reported as HH:MM:SS for batches of 100 images loaded using Princeton's Hydra head, Plum.
|Batch||Duration (Modeshape 4)||Duration (Modeshape 5)||Improvement|
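How the Improvement column is derived is not stated on this page. The sketch below assumes it is the relative reduction in batch duration, computed from two HH:MM:SS strings. The helper names `to_seconds` and `improvement_percent` are hypothetical, and the durations in the usage line are made-up illustrations, not measured results.

```python
def to_seconds(hhmmss):
    """Convert an 'HH:MM:SS' duration string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hhmmss.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def improvement_percent(before, after):
    """Relative reduction in duration, as a percentage of the 'before' time.

    Assumption: 'improvement' means (before - after) / before.
    """
    b, a = to_seconds(before), to_seconds(after)
    return 100.0 * (b - a) / b


# Illustrative values only, not taken from the test runs.
print(f"{improvement_percent('00:10:00', '00:05:00'):.0f}%")
```

A positive value means the Modeshape 5 batch finished faster than the Modeshape 4 batch; a negative value would indicate a regression.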
Objects with a large number of links to repository objects are much slower to retrieve than objects with a large number of literal or URI properties. For example, an object with 10,000 properties whose objects are literals or non-repository URIs can be retrieved in 200 milliseconds, but an object with 10,000 properties whose objects are repository objects takes 7-36 seconds, depending on the settings, storage backend, etc.
There are also significant differences between LevelDB and PostgreSQL/MySQL backends, with LevelDB being much faster: 7-10 seconds as opposed to 30+ seconds for the object with 10,000 links to repository objects.
See test scripts.
Testing initially focused on:
However, those do not appear to significantly impact performance. The process of looking up which node a proxy points to and converting the node reference to a URI therefore seems to be the problem. The process is:
Each of these steps is reasonably fast (~1 ms), but the cost is paid for every member, so it grows linearly with the size of the collection. At 3 ms per member, a collection with 10,000 members would take 30 seconds.
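The arithmetic behind that estimate can be sketched as a simple linear cost model. The function name and its defaults are assumptions made for illustration: three lookup steps per member at roughly 1 ms each, matching the 3 ms per member figure above. Real timings will vary with settings and storage backend.

```python
def estimated_retrieval_seconds(members, steps=3, ms_per_step=1.0):
    """Linear cost model: every member pays for each lookup step.

    Assumption: steps run sequentially and take a constant time each.
    """
    return members * steps * ms_per_step / 1000.0


# Show how the estimate grows with collection size.
for n in (100, 1_000, 10_000):
    print(f"{n:>6} members: {estimated_retrieval_seconds(n):.1f} s")
```

Under this model, retrieval time drops only if the per-member cost shrinks or if fewer members are resolved per request.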
Some possible options for improving performance include: