University of Wisconsin - Madison

Observations

  1. Storage environment: for this test (and for our real migration), we are migrating from one CIFS-mounted remote filesystem to another CIFS-mounted remote filesystem.
  2. Speed: TBD
  3. Datastream index: takes about XX minutes to build.
  4. CPU time: consumes about 15%.
  5. Source layout: Akubra hash storage, using the pattern "#/##/##" for both datastreams and objects.
  6. OCFL storage: Pairtree. Once the OCFL storage profile specification is finalized and incorporated into migration-utils, we should be able to define the OCFL layout in the same way that we can already specify the Akubra filesystem layout.
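To make the "#/##/##" source layout in item 5 concrete, the following is a minimal sketch of how a hash directory could be derived for a single object. It rests on an assumption that should be checked against your own repository: that the hash is the MD5 hex digest of "info:fedora/<pid>", as we understand Fedora 3's hash path mapping to work. The function name akubra_dir and the PID demo:1 are hypothetical, and the encoded file name that sits inside the directory is not covered.

```bash
#!/usr/bin/env bash
# Sketch only: expand an Akubra-style hash pattern into a directory path.
# ASSUMPTION: the hash is the MD5 hex digest of "info:fedora/<pid>".
# akubra_dir is a hypothetical helper, not part of Fedora or migration-utils.

akubra_dir() {
  local pid="$1"
  local pattern="${2:-}"
  local hash out="" pos=0 i ch

  # Fall back to the pattern used in this test when none is given.
  [[ -n "$pattern" ]] || pattern='#/##/##'

  # md5sum prints "<digest>  -"; keep only the 32 hex characters.
  hash=$(printf '%s' "info:fedora/${pid}" | md5sum | cut -c1-32)

  # Each '#' consumes the next hex character of the digest;
  # any other character (here '/') is copied through unchanged.
  for (( i = 0; i < ${#pattern}; i++ )); do
    ch="${pattern:i:1}"
    if [[ "$ch" == "#" ]]; then
      out+="${hash:pos:1}"
      pos=$((pos + 1))
    else
      out+="$ch"
    fi
  done

  printf '%s\n' "$out"
}

# Prints a three-level directory such as h/hh/hh (hex digits) for this PID.
akubra_dir "demo:1"
```

With this pattern, five hex characters of the digest are spread over three directory levels, which is why the same "#/##/##" value has to be known to any tool that reads the store.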

Issues

Migration Tests

UW Digital Collections Center Production Repository

Approximately 390,000 objects (382GB): mostly books, pages, and still images, with some audio, video, and PDF resources.  Approximately 2.33M datastreams (6.3TB). Content objects have one binary datastream.  Container objects have XML metadata datastreams.  All datastreams are either inline or managed (no external or redirect datastreams).

Fedora 3.8.1.  Migration was run on a desktop workstation with 8 cores and 16 GB RAM:  CentOS Linux release 7.7.1908 (Core), Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz.

Command run:

UW Madison migration-util command line (bash)

$ java -jar target/migration-utils-4.4.1-SNAPSHOT-driver.jar \
    --source-type=akubra \
    --datastreams-dir=/fedora3-prod/fedora/datastreams \
    --objects-dir=/fedora3-prod/fedora/objects \
    --target-dir=/fedora-migration-test \
    --layout=pairtree \
    --index-dir=/var/tmp/datastream-index
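To record execution times for runs like the one above, a small wrapper can log the wall-clock duration of whatever command it is given. This is a sketch under stated assumptions: timed_run and the log file name are hypothetical and not part of migration-utils, and the example uses a harmless sleep in place of the real java -jar invocation.

```bash
#!/usr/bin/env bash
# Sketch only: run a command and append its wall-clock duration to a log.
# ASSUMPTION: timed_run and the log path are hypothetical helpers.

timed_run() {
  local log="$1"; shift
  local start end status=0

  start=$(date +%s)
  # Run the command as given; capture a non-zero exit status without aborting.
  "$@" || status=$?
  end=$(date +%s)

  # One tab-separated line per run: UTC timestamp, seconds, exit status, command.
  printf '%s\t%ds\texit=%d\t%s\n' \
    "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$((end - start))" "$status" "$*" >> "$log"

  return "$status"
}

# Example with a placeholder command; substitute the migration command line.
timed_run "${TMPDIR:-/tmp}/migration-timings.tsv" sleep 1
```

Second-level resolution is coarse, but it should be adequate for runs that are expected to take minutes or hours.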


| Number of objects | Execution Time | Source Layout | Dest. Layout | Migration tool version | Notes |
|---|---|---|---|---|---|
| Datastream index | X hours | Akubra | pairtree | 02/02/20 (cd7ece7) | XXMB |
| 1000 | X min | Akubra | pairtree | 02/02/20 (cd7ece7) | 1K fedora items produced X+ files |
| 100,000 | X hours | Akubra | pairtree | 02/02/20 (cd7ece7) | |
| All 390,000 | X hours | Akubra | pairtree | 02/02/20 (cd7ece7) | |