
Stanford has a collection of publications (Saltworks) consisting of page images, metadata, and arrangement. It contains 16,712 objects, 655,237 items, and 273 GB of data, with the following distribution:

 

> quantile(file_sizes$size, c(0, .5, .7, .9, 1))
       0%       50%       70%       90%      100%
        0     43447    195719   1835010 288032768

> quantile(file_counts$X1, c(0, .5, .7, .9, 1))
  0%  50%  70%  90% 100%
   1   22   28   62 1478
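
The totals quoted above can be cross-checked against these quantiles with a little arithmetic. The sketch below is illustrative only: it assumes sizes are in bytes, that "273GB" means 273 × 10^9 bytes, and that the second quantile call counts files per object, none of which this page states outright. The variable names are hypothetical.

```r
# Totals quoted in the text above.
n_objects <- 16712
n_items   <- 655237
total_gb  <- 273   # assumption: decimal gigabytes (10^9 bytes)

# Means implied by the totals.
mean_items_per_object <- n_items / n_objects
mean_bytes_per_item   <- total_gb * 1e9 / n_items

# Medians (the 50% column) reported by quantile() above.
median_items_per_object <- 22
median_bytes_per_item   <- 43447

# Print mean next to median for each measure.
cat(sprintf("items per object: mean %.1f, median %.0f\n",
            mean_items_per_object, median_items_per_object))
cat(sprintf("bytes per item:   mean %.0f, median %.0f\n",
            mean_bytes_per_item, median_bytes_per_item))
```

Under those assumptions the means come out at roughly 39 items per object and a little over 400 KB per item, both well above the corresponding medians. That gap suggests a right-skewed distribution in which a small number of large objects and files account for much of the total.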


In production, the object metadata is stored in Fedora, while the page images and other assets are stored on the file system. How those files are associated back to their objects is still to be determined (TBD).
