
  1. Storage environment:  for the purposes of this test (and for our real migration), we are migrating from one CIFS-mounted remote filesystem to another CIFS-mounted remote filesystem.
  2. Speed: The tool migrates approximately 1700 objects/hr.  At this rate, it will take approximately 10 days to migrate the entire repository.
  3. Datastream index:  takes about 1h10m to build, and occupies 327MB of disk space.
  4. CPU time. Consumes about 15%.
  5. Source layout:  Akubra hash storage, using the pattern "#/##/##" for both datastreams and objects.
  6. OCFL storage:  Pairtree.  Once the OCFL storage profile specification is settled and incorporated into migration-utils, it would be good to be able to define the OCFL layout, much as we can already specify the Akubra filesystem layout.

Issues

Migration Tests

UW Digital Collections Center Production Repository

Fedora 3: Approx. 561,000 objects (559GB): mostly books, pages and still images, with some audio, video, and PDF resources.  Approximately 3.36 million datastreams (10.3TB). Content objects have one binary datastream and 5 XML metadata datastreams.  Container objects have ~5 XML metadata datastreams.  All datastreams are either inline or managed (no external or redirect datastreams).

Fedora 3.8.1.  Migration run on VM with 4 cores, 8 GB RAM.  CentOS Linux release 8.2.2004 (Core), Intel(R) Xeon(R) Gold 5220 CPU @ 2.20GHz

Command run:

UW Madison migration-util command line (bash):
$ java -jar target/migration-utils-4.4.1-SNAPSHOT-driver.jar --migration-type=FEDORA_OCFL --source-type=akubra --datastreams-dir=/fedora3-prod/fedora/datastreams --objects-dir=/fedora3-prod/fedora/objects --target-dir=/fedora-migration-test --layout=pairtree --index-dir=/var/tmp/datastream-index


| Number of objects | Execution time | Average seconds per object | OCFL repository size | Source layout | Migration tool version | Notes |
|---|---|---|---|---|---|---|
| 1000 | Datastream index: 1h17m; OCFL repo: 4h36m | 16.3 sec | 184GB | Akubra | 81586bf | with param --pid-file=1000pids.txt; datastream index cleared after run |
| 10,000 | Datastream index: 1h5m; OCFL repo: 11h48m | 4.3 sec | 688GB | Akubra | 81586bf | with param --pid-file=10000pids.txt; datastream index cleared after run |
| 100,000 | Datastream index: 1h9m; OCFL repo: 3d20h16m | 3.3 sec | 6.8TB | Akubra | 4a9f19c | with param --pid-file=100000pids.txt; datastream index cleared after run |
| All 561,000 | Datastream index: 1h10m; OCFL repo: 20d21h12m | 3.2 sec | 39TB | Akubra | 43b7bae | all pids |
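
The per-object averages for the larger runs can be checked from the OCFL repo execution times. The following is a minimal sketch, not part of migration-utils: the function names `duration_to_seconds` and `avg_seconds_per_object` are hypothetical helpers, and it assumes durations are written in the `NdNhNm` form used in these test results and that the average covers only the OCFL repo time, not the datastream index build.

```bash
# Hypothetical helper: convert a duration such as "3d20h16m" into seconds.
# Any of the d/h/m components may be absent; a missing one counts as zero.
duration_to_seconds() {
  printf '%s\n' "$1" | awk '{
    total = 0
    if (match($0, /[0-9]+d/)) total += substr($0, RSTART, RLENGTH - 1) * 86400
    if (match($0, /[0-9]+h/)) total += substr($0, RSTART, RLENGTH - 1) * 3600
    if (match($0, /[0-9]+m/)) total += substr($0, RSTART, RLENGTH - 1) * 60
    print total
  }'
}

# Hypothetical helper: average seconds per object, rounded to one decimal.
# $1 = OCFL repo execution time, $2 = number of objects migrated.
avg_seconds_per_object() {
  awk -v s="$(duration_to_seconds "$1")" -v n="$2" \
    'BEGIN { printf "%.1f\n", s / n }'
}

avg_seconds_per_object 3d20h16m 100000    # prints 3.3
avg_seconds_per_object 20d21h12m 561000   # prints 3.2
```

At about 3.2 seconds per object, the full run works out to roughly 1,100 objects per hour.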