Current Release
This documentation covers the latest release of Islandora 7.x. For the very latest in Islandora, we recommend Islandora 8.

Overview

The Islandora Batch module provides the ability to ingest multiple objects at a time, either from a .zip file or a directory on the server. The ingest is a three-step process:

  • Preprocessing: The data is scanned, and a number of entries are created in the Drupal database. There is minimal processing done at this point, so it can complete outside of a batch process.
  • Ingest: The data is actually processed and ingested. This happens inside of a Drupal batch.
  • Cleanup: The batch entries in the Drupal database need to be deleted, so the associated temp files can be purged. This can be configured to happen automatically, or can be done manually.

Dependencies

This module requires the following modules/libraries:

Additionally, installing and enabling Views will allow additional reporting and management displays to be rendered.

Downloads

Release Notes and Downloads

Installation

Unzip this module into your site's modules directory as you would any other contrib module. See this for further information.

Configuration

After you have installed and enabled the Islandora Batch module, go to Administration » Islandora » Islandora Utility Modules » Islandora Batch Settings (admin/islandora/tools/batch) to configure the module.


Make sure that the path to your Java executable is correct. Optionally, if you have the Drupal Views module enabled, you can also have the module link back to the Batch Queue in its results messages.

Usage

Detailed usage instructions for Islandora Batch are available here.

Drush Integration

The Islandora Batch module can also be run using Drush. This is a two-stage process. First you populate the Batch Queue by running the islandora_batch_scan_preprocess command.


Drush made the target parameter reserved as of Drush 7. To allow for backwards compatibility, the command uses target in Drush 6 and below, but scan_target in Drush 7 and up.


Drush 7 and above:

drush -v -u 1 --uri=http://localhost islandora_batch_scan_preprocess --type=zip --scan_target=/path/to/archive.zip

Drush 6 and below:

drush -v -u 1 --uri=http://localhost islandora_batch_scan_preprocess --type=zip --target=/path/to/archive.zip


Available options include (Drush 7 and above):

 --content_models                          A comma-separated list of content models to assign to the objects.                          
 --namespace                               Namespace of objects to create. Defaults to namespace specified in Fedora configuration.    
 --parent                                  The collection to which the generated items should be added. Defaults to the root Islandora 
                                           repository PID.                                                                             
 --parent_relationship_pred                The predicate of the relationship to the parent. Defaults to "isMemberOfCollection".        
 --parent_relationship_uri                 The namespace URI of the relationship to the parent. Defaults to                            
                                           "info:fedora/fedora-system:def/relations-external#".                                        
 --scan_target (or --target)               The target directory or zip file to scan. Required. 
                                           Requires the full path to your archive from root directory,
                                           e.g. /var/www/drupal/sites/archive.zip                                     
 --type                                    Either "directory" or "zip". Required.
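
As a sketch of how the options above combine, the following preprocesses a directory rather than a zip file and assigns a parent collection, namespace, and content model. The function name preprocess_directory, the DRUSH override variable, and every path and PID in the usage line are placeholders for illustration, not part of the module. It uses --scan_target, so it assumes Drush 7 or above.

```shell
# Sketch only: wraps islandora_batch_scan_preprocess for a directory scan.
# DRUSH is a hypothetical override; set DRUSH=echo to preview the arguments
# without running anything.
DRUSH="${DRUSH:-drush}"

preprocess_directory() {
  # $1 = full path to the directory to scan
  # $2 = PID of the parent collection
  # $3 = namespace for the new objects
  # $4 = comma-separated list of content models
  "$DRUSH" -v -u 1 --uri=http://localhost islandora_batch_scan_preprocess \
    --type=directory \
    --scan_target="$1" \
    --parent="$2" \
    --namespace="$3" \
    --content_models="$4"
}
```

A call might look like this (all values are made up): preprocess_directory /path/to/files my:collection mynamespace my:contentModel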


After the Batch Queue has been populated, you start the actual ingest process with the islandora_batch_ingest command.  For example:

drush -v -u 1 --uri=http://localhost islandora_batch_ingest

This will process the batch, ingesting the objects and creating their derivatives.
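
The two stages can be chained in a small wrapper so that the ingest only starts if preprocessing succeeded. This is a sketch only: the function name batch_ingest_zip and the DRUSH override variable are assumptions for illustration, and it assumes Drush 7 or above (--scan_target).

```shell
# Sketch only: run preprocessing, then ingest, stopping on the first failure.
# DRUSH is a hypothetical override; set DRUSH=echo to preview both commands.
DRUSH="${DRUSH:-drush}"

batch_ingest_zip() {
  # $1 = full path to the zip archive
  # Stage 1: populate the Batch Queue; give up if this fails.
  "$DRUSH" -v -u 1 --uri=http://localhost islandora_batch_scan_preprocess \
    --type=zip --scan_target="$1" || return 1
  # Stage 2: ingest whatever is in the queue.
  "$DRUSH" -v -u 1 --uri=http://localhost islandora_batch_ingest
}
```

Note that islandora_batch_ingest is invoked exactly as in the example above, without further options, so it is not tied to the archive that was just preprocessed.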

Batch Ingest Cleanup

It is important to clean up completed batches periodically, as the record of the batch prevents the associated Drupal temp files - i.e. any uploaded payload - from being automatically deleted. There is a configuration option that will do this immediately after a successful ingest (see config screenshot, above). It is possible (but not in scope here) to create a cron task to do this after a period of time. It is also possible to delete batches manually through the interface. An illustrated explanation of this process is found at How to Batch Ingest Files#BatchIngestCleanup.

Customization

Custom ingests can be written by extending any of the existing preprocessors and batch object implementations. Check out the example implementation for more details.

Clearing the semaphore table

If a user kills a Drush batch ingest, or a batch ingest initiated via the web GUI dies for some reason, it is impossible to start another batch ingest until the Islandora Batch entry in Drupal's semaphore table expires. You may clear this entry manually within your database, but doing so may impact other batch ingest jobs that are running. If you are sure no other batch ingest jobs are running, delete the row from Drupal's semaphore table where the name is 'islandora_batch_ingest'.
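
One way to delete that row without opening a database client is Drush's sql-query command. The following is a minimal sketch: the function name clear_batch_semaphore and the DRUSH override variable are assumptions for illustration, and the query assumes the Drupal tables have no prefix (adjust the table name if your site uses one).

```shell
# Sketch only: remove the Islandora Batch lock from Drupal's semaphore table.
# Run this only when you are sure no other batch ingest is in progress.
# DRUSH is a hypothetical override; set DRUSH=echo to preview the command.
DRUSH="${DRUSH:-drush}"

clear_batch_semaphore() {
  "$DRUSH" --uri=http://localhost sql-query \
    "DELETE FROM semaphore WHERE name = 'islandora_batch_ingest'"
}
```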


5 Comments

  1. FYI - I needed the Rules and Entity API modules as well in order for the reporting functionality to work. 

    1. As Dependencies? You are welcome to add them here. 

  2. FYI - I could not see all the options in the "Islandora Batch Settings" menu, nor could I see the "Islandora Batch Queue" or "Islandora Batch Sets" links in /admin/reports, until I enabled "Views" on the site.



  3. I have thoroughly checked out these problems using a vagrant site generated by the OVA file specifically designated as the test site for Release Candidate 7.x-1.12.

    I did not change (enable/disable) anything on the modules page.

    --  The link for Islandora Batch Queue appears under Reports as expected.

    --  The link for Islandora Batch Sets appears under Reports as expected.

     -- The Islandora Batch Settings appears as explained above in this issue under 

    Admin Dashboard >> Islandora >> Islandora Utility Modules >> Islandora Batch Settings


    The best way to test this Release Candidate is to use the OVA file.

    The Code Freeze Announcement contains a link to the OVA file ( Islandora7.x-12-RC1.0.ova )

    Go to:  https://groups.google.com/forum/#!topic/islandora/iG6cUIcoZnI


    You may have to destroy all the other vagrant machines you are running.

    If you are on Windows (as I am), vagrant destroy will not get rid of backup machines saved in the file path

    C:\Users\username\.vagrant.d\boxes\

    Use your favorite delete method to remove these.

  4. The explanation for Islandora Batch Settings (a single image) could be extended.

    There are 3 possible states for the settings form:

    Status
    1.  Nothing Checked.


    2.  Link to batch queue after batch successfully processes?

    3.  Auto-remove batch set after batch successfully processes?


    Click on the links to view the larger versions of the 3 screenshots.  #1 and #2 look exactly the same except for the presence of a check mark in the box for  Link to batch queue after batch successfully processes?

    But #3 looks different, because checking the Auto-remove box collapses the form and hides the Link to batch queue after batch successfully processes? option.

    I will see about adding a second image to the wiki page to clarify this.