The Web Archive Solution Pack adds all required Fedora objects to allow users to ingest and retrieve web archives through the Islandora interface.
The Web Archive Solution Pack configuration options can be accessed at http://path.to.your.site/admin/islandora/solution_pack_config/web_archive. Set the paths for warcindex and warcfilter on that page.
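Before entering the warcindex and warcfilter paths in the configuration form, it can help to confirm that both executables are actually present and runnable on the server. The following is a minimal sketch of such a check, not part of the solution pack itself; the `/usr/local/bin/...` locations and the `check_tools` helper are assumptions made for illustration, so substitute the paths used on your own system.

```python
import shutil
from typing import Dict, Optional


def check_tools(paths: Dict[str, str]) -> Dict[str, Optional[str]]:
    """Map each tool name to its resolved executable path, or None if not found.

    shutil.which accepts either a bare command name (searched on PATH)
    or a full path (checked directly for existence and execute permission).
    """
    return {name: shutil.which(path) for name, path in paths.items()}


if __name__ == "__main__":
    # Hypothetical install locations; replace with the paths on your server.
    configured = {
        "warcindex": "/usr/local/bin/warcindex",
        "warcfilter": "/usr/local/bin/warcfilter",
    }
    for name, resolved in check_tools(configured).items():
        status = resolved if resolved else "NOT FOUND or not executable"
        print(f"{name}: {status}")
```

Any tool reported as not found should be installed, or its path corrected, before the value is saved in the configuration form.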
If you are using Solr 4+, the |
The Web Archive Solution Pack comes with the following objects in http://path.to.your.site/admin/islandora/solution_pack_config/solution_packs:
A file ingested using the Web Archive Solution Pack's content model will have the following datastreams:
Datastream | Description |
RELS-EXT | Default Fedora relationship metadata |
MODS | MODS record filled out during ingest |
DC | Dublin Core record |
OBJ | Original WARC file uploaded |
TN | Thumbnail derivative of SCREENSHOT |
SCREENSHOT | Optional screenshot to represent the WARC |
PDF | Optional PDF (screenshot) to store with the WARC |
JPG | Medium-sized JPEG of SCREENSHOT |
WARC_CSV | Comma-separated index of the .warc file |
WARC_FILTERED | Full-text filtered WARC for Solr index |
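Individual datastreams can usually be fetched by URL once an object has been ingested. The sketch below builds such a URL from a datastream ID in the table above. It assumes the common Islandora 7.x path pattern `/islandora/object/<PID>/datastream/<DSID>/view`; the `datastream_url` helper and the PID `islandora:123` are hypothetical, so verify the pattern against your own site before relying on it.

```python
import re
from urllib.parse import quote

# Datastream IDs such as OBJ, RELS-EXT, or WARC_CSV: letters, digits, "-", "_", ".".
_DSID_PATTERN = re.compile(r"^[A-Za-z][A-Za-z0-9_.\-]*$")


def datastream_url(base_url: str, pid: str, dsid: str) -> str:
    """Build a view URL for one datastream of an object.

    Assumes the Islandora 7.x URL layout; this is not confirmed by the
    solution pack documentation itself.
    """
    if not _DSID_PATTERN.match(dsid):
        raise ValueError(f"Unexpected datastream ID: {dsid!r}")
    # Keep the colon in the PID readable; escape anything else unsafe.
    safe_pid = quote(pid, safe=":")
    return f"{base_url.rstrip('/')}/islandora/object/{safe_pid}/datastream/{dsid}/view"


if __name__ == "__main__":
    # Hypothetical PID used purely for illustration.
    print(datastream_url("http://path.to.your.site", "islandora:123", "WARC_CSV"))
```

Swapping `WARC_CSV` for `OBJ` in the call would point at the original WARC file instead of its index.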
The Web Archive Solution Pack comes with the Web Archive MODS form.