Excerpt |
---|
Fedora 4 can handle at least 10 million objects, but performance degrades if a single node has too many children and/or grandchildren. |
...
4.85 million files were ingested using a three-level hierarchy (74 top-level nodes, 256 second-level nodes in each, 256 third-level nodes in each, and one 10KB datastream in each), taking 111 hours. After each batch, three REST API operations were timed: listing the top level of the repository ("toplist"), listing a second-level node ("dirlist"), and retrieving a file ("fileget"). Performance retrieving files and listing the second-level nodes did not degrade with larger numbers of objects. However, the time to list the top level of the repository grew roughly linearly as more objects were added, and became increasingly erratic.
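The timing procedure described above could be sketched as follows. This is a minimal illustration, not the script used in the test: the base URL and the three paths are hypothetical placeholders, and the helper names `http_get` and `time_operations` are introduced here only for the example.

```python
import time
import urllib.request

# Hypothetical base URL and node paths; the actual ones are not given in the source.
BASE = "http://localhost:8080/rest"
OPERATIONS = {
    "toplist": BASE + "/",                  # list the top level of the repository
    "dirlist": BASE + "/top00/second00",    # list a second-level node
    "fileget": BASE + "/top00/second00/third00/file",  # retrieve one file
}


def http_get(url):
    """Fetch a URL and return the response body as bytes."""
    with urllib.request.urlopen(url) as resp:
        return resp.read()


def time_operations(operations, fetch=http_get, clock=time.perf_counter):
    """Run one GET per operation and return {name: elapsed_seconds}."""
    timings = {}
    for name, url in operations.items():
        start = clock()
        fetch(url)
        timings[name] = clock() - start
    return timings
```

Calling `time_operations(OPERATIONS)` after each ingest batch and appending the result to a log would give one data point per operation per batch. The `fetch` and `clock` parameters are there so the function can be exercised without a running repository.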
...
10 million files in a 4-level hierarchy
10 million files were ingested into a test repository running Fedora 4.0-beta1 (lib-devsandbox1.ucsd.edu) using a four-level hierarchy (39 top-level nodes, 64 nodes under each parent at the second through fourth levels, and one 10KB datastream in each bottom-level node), taking 54 days (averaging 186K objects/day). After each batch, three REST API operations were timed: listing the top level of the repository ("toplist"), listing a third-level node ("dirlist"), and retrieving a file ("fileget"). Performance retrieving files did not degrade with larger numbers of objects. However, the time to list the top level of the repository grew roughly linearly as more objects were added, and the time to list a third-level node increased more rapidly, with increasing variability as more objects were created.
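As a quick sanity check on the figures quoted for the two tests, the number of bottom-level nodes in each hierarchy can be computed from the fan-out at each level. The sketch below assumes every node is fully populated and that each bottom-level node holds exactly one file; `leaf_count` is a helper name introduced only for this example.

```python
def leaf_count(top_level, fanout, lower_levels):
    """Bottom-level nodes in a tree with `top_level` roots and a uniform
    `fanout` at each of the `lower_levels` levels beneath them."""
    return top_level * fanout ** lower_levels


# Three-level test: 74 top-level nodes, 256 children at each of 2 lower levels.
three_level = leaf_count(74, 256, 2)   # 4,849,664, about 4.85 million

# Four-level test: 39 top-level nodes, 64 children at each of 3 lower levels.
four_level = leaf_count(39, 64, 3)     # 10,223,616, about 10.2 million

# Total implied by the reported duration and average rate.
implied_total = 54 * 186_000           # 10,044,000

print(three_level, four_level, implied_total)
```

Both capacities are consistent with the totals quoted (4.85 million and 10 million), and 54 days at about 186K objects/day likewise comes to roughly 10 million.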
...