Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

 

Test Data

 

Set 1: Digital Corpora govdocs1

 

Set 2: OpenPlanets

 

Set 3: Random binary data created from a stable set of filesizes

 

 

 

The govdocs dataset includes (…), (…some characteristics, e.g. N PDF documents, varying in size from X to Y)

 

 

 

The OpenPlanets dataset …

 

 

 

(Description of fixture processing, generation of bagits)

 

The generated binary data set

...