After the entire site is exported and imported via AIP, internal identifiers of DSpace objects (bitstreams, items, collections and communities) will have changed. This is because the AIP format doesn't persist these internal identifiers and AIP import will generate new internal identifiers. Only handles will be persisted.
This poses a problem if you also want to migrate usage events (Solr statistics) to the new site, because usage events are tied to DSpace objects via the aforementioned internal identifiers (this is true up to and including DSpace 5). If the statistics are moved over without converting the identifiers, the mismatch shows up as numbers (item_ids) appearing in the "statistics-home" pages instead of item titles.
The procedure described on this page will allow you to export old usage events, convert the old internal identifiers to new identifiers and import the usage events to the new site. The usage events will then match objects in the new site.
This procedure is useful, for example, when migrating from Oracle to Postgres, because AIP export/import is the only easy way to perform such a migration.
- database of the old site is available (handle and bitstream tables)
- database of the new site is available (handle and bitstream tables)
- Solr statistics from the old site are available (can be exported to CSV starting from DSpace 5.3)
- Python 2.7
- vim (for regex edits to .csv files, but other tools like sed or perl can be used as well)
If you're upgrading from a DSpace version older than 5.3, which didn't have the solr-export-statistics command, you'll need to migrate the Solr statistics core to DSpace 5.3 first and then export it to CSV.
You will need to prepare the following files: handle-old.csv, handle-new.csv, bitstream-old.csv, bitstream-new.csv, solr-in.csv
handle-old.csv, handle-new.csv, bitstream-old.csv, bitstream-new.csv - these need to be exported from the old and new database, respectively.
Postgres makes this easy:
Oracle sqlplus has no native way of exporting CSVs, so here's a workaround:
solr-in.csv can be exported from DSpace 5.3 and newer using this procedure:
To convert the identifiers in solr-in.csv and write the statistics to solr-out.csv, run:
Please note that this will leave out any usage events pertaining to items, collections or communities that were deleted, so there will likely be fewer lines in solr-out.csv than in solr-in.csv.
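The core of the conversion can be sketched as a lookup through the handle, which is the only identifier that survives the AIP round trip. The following is a minimal illustration, not the script used in this procedure. It is written for Python 3 and rests on several assumptions: every CSV file has a header row, the handle files have the columns handle, resource_type_id and resource_id, and the statistics file has the columns type and id. All function names are hypothetical. It only rewrites the id column for objects that have a handle. Bitstreams have no handle and would need a similar lookup built from the bitstream tables, and any other columns that hold internal identifiers would need the same treatment.

```python
import csv


def build_id_map(old_handle_rows, new_handle_rows):
    """Map (resource_type_id, old resource_id) to the new resource_id via the shared handle."""
    new_by_handle = {}
    for row in new_handle_rows:
        new_by_handle[row["handle"]] = (row["resource_type_id"], row["resource_id"])

    id_map = {}
    for row in old_handle_rows:
        target = new_by_handle.get(row["handle"])
        # Skip handles that are missing in the new site or point at another object type.
        if target is not None and target[0] == row["resource_type_id"]:
            id_map[(row["resource_type_id"], row["resource_id"])] = target[1]
    return id_map


def convert_events(event_rows, id_map):
    """Yield usage events with the id column rewritten; drop events that cannot be mapped."""
    for row in event_rows:
        new_id = id_map.get((row["type"], row["id"]))
        if new_id is None:
            continue  # the object was deleted or has no counterpart in the new site
        converted = dict(row)
        converted["id"] = new_id
        yield converted


def convert_files(handle_old, handle_new, solr_in, solr_out):
    """File-based wrapper (assumed layout: CSV files with a header row)."""
    with open(handle_old, newline="") as f_old, open(handle_new, newline="") as f_new:
        id_map = build_id_map(csv.DictReader(f_old), csv.DictReader(f_new))
    with open(solr_in, newline="") as f_in, open(solr_out, "w", newline="") as f_out:
        reader = csv.DictReader(f_in)
        writer = csv.DictWriter(f_out, fieldnames=reader.fieldnames)
        writer.writeheader()
        writer.writerows(convert_events(reader, id_map))
```

Dropping unmapped events rather than keeping them with stale identifiers is what makes the output shorter than the input, as noted above.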
Then you can import the statistics to your new site, replacing all data that is already present there:
This procedure is a workaround for a problem that currently doesn't have a solution in DSpace. DSpace 6 will replace internal identifiers with UUIDs, and Solr statistics and AIP export will have to be changed to accommodate that, so there's hope that this procedure will soon be obsolete.