...
Stanford has a collection of publications (Saltworks) consisting of page images, metadata, and arrangement. It contains 16,712 objects and 655,237 items (273 GB of data), with the following distributions of file sizes and file counts:

    > quantile(file_sizes$size, c(0, .5, .7, .9, 1))
           0%       50%       70%       90%      100%
            0     43447    195719   1835010 288032768

    > quantile(file_counts$X1, c(0, .5, .7, .9, 1))
      0%  50%  70%  90% 100%
       1   22   28   62 1478

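The page does not say how `file_sizes` and `file_counts` were collected. As a rough sketch only, comparable numbers could be gathered from a directory tree as shown below. It assumes a hypothetical layout in which each immediate subdirectory of a root directory holds the files of one object; the function names `survey`, `quantile`, and `summarize` are illustrative and not part of any Saltworks tooling. The `quantile` helper interpolates linearly between order statistics, which is the rule R's `quantile` uses by default (type 7).

```python
import os
from collections import Counter

# Same probabilities as in the R session above.
PROBS = (0, .5, .7, .9, 1)


def quantile(values, p):
    """Quantile by linear interpolation between order statistics."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("quantile() needs at least one value")
    pos = (len(ordered) - 1) * p          # fractional 0-based index
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def survey(root):
    """Return (file sizes in bytes, file count per object) for a tree.

    Assumption: each immediate subdirectory of `root` is one object.
    """
    sizes = []
    counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        if rel == os.curdir:
            continue  # files directly under root belong to no object
        obj = rel.split(os.sep)[0]
        for name in filenames:
            sizes.append(os.path.getsize(os.path.join(dirpath, name)))
            counts[obj] += 1
    return sizes, list(counts.values())


def summarize(values):
    """Quantiles of `values` at the probabilities in PROBS."""
    return [quantile(values, p) for p in PROBS]
```

With a real asset directory, `sizes, counts = survey(root)` followed by `summarize(sizes)` and `summarize(counts)` would yield two five-number rows in the same shape as the R output, provided the on-disk layout matches the assumption above.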
In production, the object metadata is stored in Fedora, while the page images and other assets are stored on the file system. How those files are associated back to their objects is still TBD.
...