Introduction

Since the release of Fedora 4.0 in 2015, several community-supported migration tools have been developed. Each of these tools takes a different approach and serves a limited set of use cases. The tools include migration-utils, FedoraMigrate, and migrate_7x_claw.

What follows is an assessment of each tool with a particular focus on current migration use cases.

migration-utils

First developed in early 2015, migration-utils is a Java-based command-line utility that iterates over the FOXML resources in a Fedora 3.x repository (either within the native filesystem or as exported content) and transforms them into Fedora 4-compatible resources before ingesting them into a Fedora 4.x repository. A Spring XML configuration file is used to define the source and destination repositories, as well as the nature of the Fedora 3.x content.

This utility uses a default set of property mappings that can be found in the README. These mappings are based on community best practices, and they can be changed by editing configuration files. XML content that can be easily mapped to RDF (e.g. Dublin Core metadata and the contents of the RELS-EXT datastream) is transformed into RDF properties before being imported into Fedora 4.x. Any managed datastreams are stored as binary resources in Fedora 4.x, and external content can optionally be fetched and stored in the same manner.
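The kind of XML-to-RDF property mapping described above can be illustrated with a minimal Python sketch. This is not the migration-utils implementation (which is written in Java), and the mapping table below is a hypothetical example rather than the defaults listed in the README; the function name `dc_to_triples` and the sample record are likewise assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical mapping table (NOT the migration-utils defaults):
# Dublin Core element name -> RDF predicate URI.
PROPERTY_MAP = {
    "title": "http://purl.org/dc/terms/title",
    "creator": "http://purl.org/dc/terms/creator",
}

# Illustrative Dublin Core record of the kind stored in a Fedora 3.x DC datastream.
SAMPLE_DC = """<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                          xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Sample object</dc:title>
  <dc:creator>Doe, Jane</dc:creator>
  <dc:format>image/tiff</dc:format>
</oai_dc:dc>"""


def dc_to_triples(subject_uri, dc_xml):
    """Return (subject, predicate, literal) triples for every mapped DC element."""
    root = ET.fromstring(dc_xml)
    prefix = "{" + DC_NS + "}"  # ElementTree spells qualified names as {namespace}local
    triples = []
    for child in root:
        if not child.tag.startswith(prefix):
            continue  # ignore elements outside the DC namespace
        local_name = child.tag[len(prefix):]
        predicate = PROPERTY_MAP.get(local_name)
        value = (child.text or "").strip()
        if predicate and value:
            triples.append((subject_uri, predicate, value))
    return triples


if __name__ == "__main__":
    for triple in dc_to_triples("info:fedora/demo:1", SAMPLE_DC):
        print(triple)
```

In this sketch, elements without an entry in the mapping table (here, `dc:format`) are skipped, which is one reason an editable mapping configuration matters for repositories with non-standard metadata.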

The primary advantage of migration-utils is its agnosticism toward front-end applications. It is designed to maintain the basic structure of Fedora 3.x data in Fedora 4.x with some XML to RDF transformations where appropriate. The application also has a number of configuration options and supports customization via plugins that could be written for specific use cases. The tool could potentially save institutions the effort of writing custom data migration scripts, particularly if they are using a custom front-end environment that would not be able to take advantage of either of the other two tools.

FedoraMigrate

FedoraMigrate was developed within the Samvera community to facilitate migrations between Fedora 3 and Fedora 4 repositories within the context of Sufia, a popular Samvera institutional repository application. FedoraMigrate “iterates over your existing Fedora3 application using the Rubydora gem. For each object it finds, it creates a new object with the same id in Fedora4 and proceeds to migrate each datastream, including versions if they are defined, and verifies the checksum of each. Permissions and relationships are migrated as well but using different procedures due to the changes in Fedora4.” The migration process takes place in two steps: first, the resources are migrated, and then the relationships are added.
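The two-step process (resources first, relationships second) and the checksum verification can be sketched in a few lines of Python. This is only an illustration of the general pattern, not FedoraMigrate's code (which is Ruby and talks to real repositories); the in-memory dictionaries, the `migrate` function, and the sample identifiers are all hypothetical.

```python
import hashlib


def sha1(data: bytes) -> str:
    return hashlib.sha1(data).hexdigest()


def _ds(content: bytes) -> dict:
    """Build a hypothetical datastream record with its stored checksum."""
    return {"content": content, "sha1": sha1(content)}


# Hypothetical stand-in for a Fedora 3 repository. demo:1 points at demo:2,
# which is listed later, so the relationship cannot be added in the first pass.
SAMPLE_SOURCE = {
    "demo:1": {
        "datastreams": {"OBJ": _ds(b"image bytes")},
        "relationships": [("isMemberOf", "demo:2")],
    },
    "demo:2": {
        "datastreams": {"DC": _ds(b"<dc/>")},
        "relationships": [],
    },
}


def migrate(source: dict) -> dict:
    target = {}
    # Pass 1: create each object under the same id and copy its datastreams,
    # verifying every checksum before accepting the content.
    for obj_id, obj in source.items():
        copied = {}
        for ds_id, ds in obj["datastreams"].items():
            if sha1(ds["content"]) != ds["sha1"]:
                raise ValueError(f"checksum mismatch for {obj_id}/{ds_id}")
            copied[ds_id] = ds["content"]
        target[obj_id] = {"datastreams": copied, "relationships": []}
    # Pass 2: add relationships only after every target object exists.
    for obj_id, obj in source.items():
        for predicate, other_id in obj["relationships"]:
            if other_id not in target:
                raise KeyError(f"{obj_id} refers to missing object {other_id}")
            target[obj_id]["relationships"].append((predicate, other_id))
    return target


if __name__ == "__main__":
    print(migrate(SAMPLE_SOURCE)["demo:1"]["relationships"])
```

Deferring relationships to a second pass means an object can refer to another object regardless of the order in which the two are visited.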

FedoraMigrate is capable of transforming XML-based metadata in Fedora 3 to RDF properties in Fedora 4; however, the mappings for each metadata element must be defined in the tool’s configuration, which could be time-consuming. In general, the tool is configurable, but this configuration must be done in Ruby code, so a developer with Ruby on Rails experience will need to configure and run the migration. FedoraMigrate was written with Sufia in mind, so it would need to be customized to support other Samvera applications.

migrate_7x_claw

Islandora 8 (CLAW) makes use of the Drupal Migrate API to provide tooling for migrations from Fedora 3.x Islandora installations to Islandora 8.x. The Drupal Migrate API provides services for migrating data from different sources to Drupal 8; plug-ins can be written to support different migration use cases. The Islandora community developed the migrate_7x_claw module based on this API, which includes plug-ins for different types of data stored by Fedora 3.x-based Islandora installations.

While this module can be run from the command line using Drush, it can also be accessed via the user interface, making it easier for repository administrators to use. Configuration is relatively straightforward: the user enters the base URLs for their Fedora and Solr instances, the username and password for Fedora, and a Solr query to find and retrieve the objects to be migrated. These queries can be based on content model or anything else that might be indexed in Solr.
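To make the idea of a content-model query concrete, the following Python sketch builds a Solr select URL that would return the PIDs of objects with a given content model. It is not part of migrate_7x_claw; the function name, the default base URL in the usage example, and in particular the Solr field name `RELS_EXT_hasModel_uri_ms` are assumptions that depend on how a given installation indexes its content.

```python
from urllib.parse import urlencode


def build_solr_query_url(solr_base_url, content_model, rows=100,
                         model_field="RELS_EXT_hasModel_uri_ms"):
    """Build a Solr select URL listing PIDs for one content model.

    model_field is an assumed field name; check the local Solr schema.
    """
    query = f'{model_field}:"info:fedora/{content_model}"'
    params = urlencode({"q": query, "fl": "PID", "rows": rows, "wt": "json"})
    return f"{solr_base_url.rstrip('/')}/select?{params}"


if __name__ == "__main__":
    # Hypothetical local Solr instance and a basic image content model.
    print(build_solr_query_url("http://localhost:8080/solr", "islandora:sp_basic_image"))
```

Any other indexed field could be substituted for the content model field, for example to restrict a migration to a single collection.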

The relative uniformity of Islandora installations will make this tool quite useful to the community. Users who have customized the defaults will need to make some configuration changes, but in general this tool should help most Islandora 7.x users easily move their data into an Islandora 8.x repository.

Comparison with Sample Data from Institutional Profiles

Islandora

Three of the profiled repositories (Florida State, Michigan State, and Williams College) use Islandora 7.x as their repository framework. Like most Islandora institutions, none of them have made heavy customizations to their repositories, though in some cases they have added custom datastreams to the standard Islandora content models. However, these custom datastreams will simply be exported and imported by the migrate_7x_claw tool as media in Islandora 8.

The main issue with Islandora is not the migration tool, but the lack of full support for Islandora 7.x content types in Islandora 8. Specifically, paged content (e.g. books and newspapers) is not yet supported (though a module has been developed by the community and should be incorporated soon), and correspondingly there is no support for paged content in the migration tool. However, once paged content is supported in Islandora 8, it should be relatively straightforward to support this content in the migration tool.

Samvera

The FedoraMigrate tool is specifically designed to work with the Sufia Samvera application, and therefore would only be useful to institutions making use of this application (which has since been superseded by the Hyrax application). While the migration tool could certainly be updated, it has not received any substantive code commits for over two years. Even if the tool were to be updated to work with Hyrax, which is similar to Sufia, it would not be useful to institutions like Stanford that have heavily customized both their Samvera applications and their data models. A migration to any new system would likely need to be done in a customized, in-house way at Stanford.

Custom

Of the three available tools, migration-utils would be the most useful to the custom Fedora 3.x repositories (National Library of Medicine, University of Wisconsin-Madison, UNC Chapel Hill, Amherst College). While it won’t address any of their front-end applications, migration-utils could be helpful in simply getting the data from Fedora 3.x to Fedora 4.x. In each case this would require some configuration and likely customization via plug-ins, but it would save the effort required to write custom migration scripts. However, the tool has not had any releases since Fedora 4.6.x, so it would need to be updated to support Fedora 5.x and higher.

Gaps and Analysis

Of the currently available migration tools, migrate_7x_claw is the most robust and well-supported, with the greatest opportunity to impact a large number of institutions in the Fedora community. As more content types are supported, a greater number of Islandora repositories will be able to be migrated to Islandora 8. With over 260 installations around the world running on Fedora 3.x, this represents an enormous opportunity for the Fedora and Islandora communities.

Migration-utils is a useful tool in principle, but it is hampered by a lack of updates and by the fact that it supports only generic migration use cases. However, this represents a potential opportunity for the Fedora community to improve the tool based on the migration needs of those with custom front-end implementations. While it wouldn’t be possible to develop a tool that will work out-of-the-box in every scenario, a focus on configurable property mappings and data transformations could make the tool much more useful to the community.

Conclusion

While the Islandora community has taken longer to release a version of Islandora that supports Fedora 4.x and higher, their use of Drupal and a common application framework has given them a huge advantage in terms of developing migration tools that will support a majority of use cases in the Islandora community. The greatest gaps in support are therefore with custom Fedora 3.x repositories and those that are using Samvera tools but not a common application like Sufia or Hyrax. By taking migration-utils as a starting point and gathering requirements for improvements, it would be possible to support a greater number of migration projects throughout the community.
