...
- Go to Manage > Newspaper Batch
- Zip file - Upload the ZIP file for batch ingest.
- Create PDFs? - Checking this box creates a PDF derivative that contains all the pages associated with a newspaper issue.
- Namespace for created objects - Set the namespace for the issue and page objects created for this batch ingest.
- Generate OCR? - Checking this box causes OCR to be generated for each Page object. OCR will be attached as a datastream to each page. If checked, another option appears below it, "Aggregate OCR?".
- Generate HOCR? - Checking this box causes HOCR to be generated for each Page object. HOCR enables text highlighting after a full-text search. HOCR will be attached as a datastream to each page.
- Aggregate OCR? - Check this box to create an OCR datastream in the issue object that aggregates the OCR datastreams from all of the page-level objects in that issue.
- Notify admin after ingest? - Check this box to send an email to the site admin (user 1) that a newspaper batch ingest has completed. This requires the Drupal Rules module and a rule for newspaper batch notifications.
- Ingest immediately? - Checking this box will cause the batch to go through both steps of the ingest (pre-processing and actual ingest) immediately.
- If you do not check "Ingest immediately?", the files will only be pre-processed and added to the Islandora batch queue for an administrator to approve.
- To approve the batch, go to Administration > Reports > Islandora Batch Sets and select "View Items in Set" next to an unprocessed set. To process the set, click "Process Set" and process all items.
![Pre-processed newspaper batch items in the batch queue](/download/attachments/69834628/newspaper_batch_716_02.png?version=1&modificationDate=1445960208978&api=v2&effects=drop-shadow)
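The form choices above correspond to flags of the drush pre-processing command (`islandora_newspaper_batch_preprocess`, alias `inbp`). The following is a minimal sketch, not a definitive invocation: the ZIP path, namespace, and parent PID are hypothetical placeholders, and `-u 1` (run as user 1) is an assumption about how drush is invoked on your site. It only prints the command so it can be reviewed before being run.

```shell
#!/bin/sh
# Hypothetical values for illustration; replace with your own.
TARGET="/tmp/newspaper_issues.zip"   # ZIP file to pre-process (assumed path)
NAMESPACE="newspaper"                # namespace for created objects (assumed)
PARENT="newspaper:1"                 # PID of the parent object (assumed)

# Build the pre-process command line from the flags in the drush help text:
#   --create_pdfs    ~ "Create PDFs?"
#   --aggregate_ocr  ~ "Aggregate OCR?"
#   --email_admin    ~ "Notify admin after ingest?"
# OCR is generated unless --do_not_generate_ocr is passed.
preprocess_cmd() {
  printf 'drush -u 1 islandora_newspaper_batch_preprocess --type=zip --target=%s --namespace=%s --parent=%s --create_pdfs --aggregate_ocr --email_admin\n' \
    "$TARGET" "$NAMESPACE" "$PARENT"
}

# Dry run: print the command for review instead of executing it.
preprocess_cmd
```

Like leaving "Ingest immediately?" unchecked, this command only pre-processes the batch; the items still have to be processed from the batch queue afterwards.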
...
Here are the options in the drush command:
```
drush help islandora_newspaper_batch_preprocess
...
Preprocessed newspaper issues into database entries.
Options:
...
--aggregate_ocr A flag to cause OCR to be aggregated to issues, if OCR is also being generated per-page.
--content_models A comma-separated list of content models to assign to the objects. Only applies to the "newspaper issue"
level object.
--create_pdfs A flag to cause PDFs to be created in newspaper issues. Page PDF creation is dependant on the configuration
within Drupal proper.
--directory_dedup A flag to indicate that we should avoid repreprocessing newspaper issues which are located in directories.
--do_not_generate_ocr A flag to allow for conditional OCR generation.
--email_admin A flag to notify the site admin when the newspaper issue is fully ingested (depends on Rules being enabled).
--namespace The namespace for objects created by this command. Defaults to namespace set in Fedora config.
--parent The collection to which the generated items should be added. Only applies to the "newspaper issue" level
object. If "directory" and the directory containing the newspaper issue description is a valid PID, it will
be set as the parent. If this is specified and itself is a PID, all newspapers issue will be related to the
given PID. Required.
--parent_relationship_pred The predicate of the relationship to the parent. Defaults to "isMemberOf".
--parent_relationship_uri The namespace URI of the relationship to the parent. Defaults to
"info:fedora/fedora-system:def/relations-external#".
--target The target to directory or zip file to scan. Required.
--type Either "directory" or "zip". Required.
--wait_for_metadata A flag to indicate that we should hold off on trying to ingest newspaper issues until we have metadata
available for them at the newspaper issue level.
Aliases: inbp
```
Third, process all items in the batch queue:
...