...

Islandora Workbench uses YAML files to configure its operations. These files are documented in detail. Here is an example config file, which includes a link to a sample CSV:

...

task: create
host: "https://islandora.traefik.me/"
username: xxxx
password: xxxx
media_type: file
input_csv: 'https://wiki.lyrasis.org/download/attachments/273351517/AG%20Photos.xlsx?version=1&modificationDate=1674544809225&api=v2'
id_field: PID
csv_field_templates:
 - field_rights: "http://rightsstatements.org/vocab/CNE/1.0/"
 - field_member_of: 103
 - field_model: 13
 - field_resource_type: 25
 - field_display_hints: 21
default_file_mimetype: 'image/tiff'
default_file_extension: ".tif"
use_node_title_for_media: 1
allow_adding_terms: true


NOTE: The CSV associated with this config file uses file URLs. To load files from a local directory instead, use the input_dir configuration option. More information is available in the Workbench documentation.

...

Resource Type

Default fields are:


New terms can be created on the fly (during ingest). If your metadata uses terms that already exist by default, you can reference the term name, its ID, or its URI (if it has one) in the CSV.

...

System Model

Available terms are:


NOTE: The Born-Digital i8 theme requires specific combinations of Resource Type and Model terms in order for compound objects, collections, and paged objects to display correctly. Please refer to Appendix B: Born-Digital i8 Theme Object View Configurations.

...

Use each field's machine name as its column header in the CSV. To find the machine name for a field, go to Structure > Content Type > Repository Item > Manage Fields.

| Label (displayed on the front-end) | Machine Name or Workbench-required name (use for column header in CSV) | Field Type and Notes |
| Title | title | text field (see more information above, under CSV Required Fields) |
| Alternative Title | field_alternative_title | text field |
| Identifier | field_identifier | text field; multi-value |
| Resource Type | field_resource_type | taxonomy reference (see more information above, under CSV Required Fields) |
| Genre | field_genre | taxonomy reference; multi-value |
| Linked Agent | field_linked_agent | typed relation field; multi-value - see more information about how to set up this field below, under Linked Agent. |
| Date Created | field_edtf_date_created | EDTF field; multi-value; must be in EDTF format - see more information about how to set up this field below, under EDTF Formats. |
| Date Issued | field_edtf_date_issued | EDTF field; multi-value; must be in EDTF format - see more information about how to set up this field below, under EDTF Formats. |
| Date | field_edtf_date | EDTF field; multi-value; must be in EDTF format - see more information about how to set up this field below, under EDTF Formats. |
| Edition | field_edition | text field; multi-value |
| Place Published | field_place_published | text field; multi-value |
| Language | field_language | taxonomy reference field; multi-value |
| Description | field_description_long | text-formatted-long, can support line breaks. If your metadata has line breaks, as long as they are included in the cell in the spreadsheet, saving as a CSV should wrap the content of the cell in quotes and the line break will be preserved. |
| Table of Contents | field_table_of_contents | text-formatted-long, can support line breaks |
| Physical Form | field_physical_form | taxonomy reference; multi-value |
| Extent | field_extent | text field; multi-value |
| Rights | field_rights | text field; multi-value; if you want this to be an external link, field needs to be changed to a Link type field. |
| Subject | field_subject | taxonomy reference (from corporate body, family, geographic location, person, or subject vocabularies); multi-value; data in CSV must include a namespace before the term (i.e., person: , corporate_body: , geo_location: , family: , or subject: ). |
| Geographic Subject | field_geographic_subject | taxonomy reference (from Geographic Location vocabulary); multi-value |
| Coordinates | field_coordinates | geolocation fields (latitude and longitude); multi-value - see more information about how to set up this field below, under Coordinates |
| Coordinates (Text) | field_coordinates_text | text field; multi-value |
| Temporal Subject | field_temporal_subject | taxonomy reference (from Temporal vocabulary); multi-value |
| Subjects (name) | field_subjects_name | taxonomy reference (from person, family, or corporate body); multi-value; data in CSV must include a namespace before the term (i.e., person: , family: , or corporate_body) |
| Dewey Classification | field_dewey_classification | text field; multi-value |
| Library of Congress Classification | field_llc_classification | text field; multi-value |
| Classification (Text) | field_classification | text field; multi-value |
| Local Identifier | field_local_identifier | text field; multi-value |
| ISBN | field_isbn | text field; multi-value |
| OCLC Number | field_oclc_number | text field; multi-value |
| Note | field_note | text-formatted-long, can support line breaks; multi-value |
| System Model | field_model | taxonomy reference (from Islandora Models vocabulary); (see more information above, under CSV Required Fields) |
| Member of | field_member_of | node reference; multi-value; points to the containing collection or parent node - see more information below under Member Of |
| Language | n/a | (dropdown; defaults to English - ignore this for the purposes of the CSV) |
| Access Control | field_access_terms | taxonomy reference (from Islandora Access vocabulary) |
| Display hints | field_display_hints | taxonomy reference (from Islandora Display vocabulary) - see more information below, under Display Hints |
| PID | field_pid | text field |
| Weight | field_weight | number (integer); indicates the order of a resource in a collection of resources (used for compound objects and paged content) |
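
Because column headers must match machine names exactly, it can help to scan a CSV's header row before handing it to Workbench. The following is a minimal sketch, not part of Workbench itself: the function name unknown_headers and the KNOWN_COLUMNS set are hypothetical, and the set simply copies the machine names from the table above plus the id and parent_id columns used later on this page. Add any fields your site defines.

```python
import csv
import io

# Machine names copied from the field table, plus the id and parent_id
# columns used in the CSV examples. This set is illustrative only.
KNOWN_COLUMNS = {
    "title", "id", "parent_id",
    "field_alternative_title", "field_identifier", "field_resource_type",
    "field_genre", "field_linked_agent", "field_edtf_date_created",
    "field_edtf_date_issued", "field_edtf_date", "field_edition",
    "field_place_published", "field_language", "field_description_long",
    "field_table_of_contents", "field_physical_form", "field_extent",
    "field_rights", "field_subject", "field_geographic_subject",
    "field_coordinates", "field_coordinates_text", "field_temporal_subject",
    "field_subjects_name", "field_dewey_classification",
    "field_llc_classification", "field_classification",
    "field_local_identifier", "field_isbn", "field_oclc_number",
    "field_note", "field_model", "field_member_of", "field_access_terms",
    "field_display_hints", "field_pid", "field_weight",
}


def unknown_headers(csv_text, known=KNOWN_COLUMNS):
    """Return header cells that are not in the set of known column names."""
    reader = csv.reader(io.StringIO(csv_text))
    headers = next(reader, [])  # an empty file yields no headers
    return [h for h in headers if h.strip() not in known]


sample = "title,id,Date Created\nEasthampton Town Hall,1,1933?\n"
print(unknown_headers(sample))
```

In this sample the front-end label "Date Created" is reported because the CSV should use field_edtf_date_created instead.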


Adding New Fields

Follow these steps to add new fields as required:

...

All dates must follow EDTF formatting rules. Here are some examples:

| EDTF Input | Front-End Output |
| 1933? | 1933 (year uncertain) |
| 1945~ | 1945 (year approximate) |
| 2016-04-12 | 2016-04-12 |
| 1860/1880? | 1860 to 1880 (year uncertain) |
| 1870/1880 | 1870 to 1880 |
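
To make the mapping in the examples above concrete, here is a rough sketch that reproduces those five input/output pairs. It is not how Islandora renders dates and it is not a full EDTF parser: the function name describe_edtf is hypothetical, and only plain dates and simple intervals with an optional trailing ? or ~ are handled.

```python
import re

# Only the forms shown in the examples: YYYY, YYYY-MM, YYYY-MM-DD,
# optionally as an interval, optionally followed by "?" or "~".
_DATE = r"\d{4}(?:-\d{2}(?:-\d{2})?)?"
_SINGLE = re.compile(rf"^({_DATE})([?~]?)$")
_INTERVAL = re.compile(rf"^({_DATE})/({_DATE})([?~]?)$")
_QUALIFIERS = {"?": " (year uncertain)", "~": " (year approximate)", "": ""}


def describe_edtf(value):
    """Return display text for the limited EDTF forms this sketch covers."""
    value = value.strip()
    match = _SINGLE.match(value)
    if match:
        return match.group(1) + _QUALIFIERS[match.group(2)]
    match = _INTERVAL.match(value)
    if match:
        return f"{match.group(1)} to {match.group(2)}" + _QUALIFIERS[match.group(3)]
    raise ValueError(f"not an EDTF form handled by this sketch: {value!r}")


for example in ["1933?", "1945~", "2016-04-12", "1860/1880?", "1870/1880"]:
    print(example, "->", describe_edtf(example))
```

A check like this can flag values such as "circa 1933" in a spreadsheet before ingest, since they raise ValueError here.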


A Link field type stores URLs and link text in separate data elements.

...

These terms, from the Islandora Display vocabulary, will define which viewer is used on an object page. 

| Term Name | Used For |
| Open Seadragon | Large Image, Page |
| PDFjs | PDF |


Configure the CSV

Include only the required columns plus any other columns that contain data. Do not include columns that have no data; an empty column will cause an error during Workbench’s configuration check process.
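
If your spreadsheet export contains columns with no data, they can be stripped before the configuration check. This is a minimal sketch using only the Python standard library; the function name drop_empty_columns is hypothetical, and the code does not know which columns Workbench requires, so review the result before using it.

```python
import csv
import io


def drop_empty_columns(csv_text):
    """Return the CSV text without columns that are blank in every row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    headers = reader.fieldnames or []
    # Keep a column only if at least one row has a non-blank value in it.
    keep = [h for h in headers if any((row.get(h) or "").strip() for row in rows)]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=keep, extrasaction="ignore",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


sample = "title,id,field_extent\nEasthampton Town Hall,1,\nNehemiah Strong House,2,\n"
print(drop_empty_columns(sample))
```

Here field_extent is removed because no row has a value for it.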

If the collection(s) that will contain the objects have NOT yet been created in Drupal, you can include rows for the collection(s) in your CSV. Each object will also have a row, and will reference the id of the collection it belongs to. For example:

| title | id | parent_id | field_member_of |
| Test Collection | 55 | | |
| Easthampton Town Hall | 1 | 55 | |
| Nehemiah Strong House | 2 | 55 | |
| Amherst College, Lawrence Observatory | 3 | 55 | |



If the collection(s) that will contain the objects already exist in Drupal, you can use the field_member_of column to reference the node ID of the collection(s). (In this case, you will not use the parent_id column except for compound objects and paged content; see Configuring Complex Objects in the CSV for more information.)

...

Assuming, for example, your top-level collection had a node ID of 100, your CSV would look like this (this is only showing a portion of the columns):

| title | id | parent_id | field_member_of |
| Easthampton Town Hall | 1 | | 100 |
| Nehemiah Strong House | 2 | | 100 |
| Amherst College, Lawrence Observatory | 3 | | 100 |
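
The two layouts differ only in how each object points at its collection: parent_id refers to another row's id in the same CSV, while field_member_of refers to a node ID that already exists in Drupal. As an illustration, the sketch below converts the first layout into the second once the collection has been created. The function name attach_to_existing_collection is hypothetical, and rows are assumed to be dictionaries such as those produced by csv.DictReader.

```python
def attach_to_existing_collection(rows, collection_csv_id, node_id):
    """Drop the collection row and point its direct children at an existing node."""
    updated = []
    for row in rows:
        if row["id"] == collection_csv_id:
            continue  # the collection already exists in Drupal, so skip its row
        new_row = dict(row)
        if new_row.get("parent_id") == collection_csv_id:
            new_row["parent_id"] = ""
            new_row["field_member_of"] = str(node_id)
        updated.append(new_row)
    return updated


rows = [
    {"title": "Test Collection", "id": "55", "parent_id": "", "field_member_of": ""},
    {"title": "Easthampton Town Hall", "id": "1", "parent_id": "55", "field_member_of": ""},
    {"title": "Nehemiah Strong House", "id": "2", "parent_id": "55", "field_member_of": ""},
]
for row in attach_to_existing_collection(rows, "55", 100):
    print(row)
```

Rows that point at some other parent are left untouched, so compound objects and paged content keep their parent_id values.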


...

Assuming your Newspaper Parent node has not yet been created, and you’d like it to be contained within another top-level collection whose node has also not yet been created, the table below shows how the three newspaper parts (and the top-level collection) would be laid out in your CSV:

| title | id | parent_id | field_weight |
| Sample Collection | 1 | | |
| Connecticut Western News (Newspaper) | 2 | 1 | |
| Connecticut Western News Vol. 1 No. 7 | 3 | 2 | |
| Connecticut Western News Vol. 1 No. 7 - page 1 | 4 | 3 | 1 |
| Connecticut Western News Vol. 1 No. 7 - page 2 | 5 | 3 | 2 |
| Connecticut Western News Vol. 1 No. 7 - page 3 | 6 | 3 | 3 |
| Connecticut Western News Vol. 1 No. 7 - page 4 | 7 | 3 | 4 |


The Newspaper Parent is a child of the containing collection (or it may have no parent). The Newspaper Issue is a child of the Newspaper Parent, and each Newspaper Page is a child of the Newspaper Issue.
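
One way to confirm that a CSV expresses this parent/child chain correctly is to follow each row's parent_id upward and count the steps. The sketch below is an illustration only: the function name hierarchy_depths is hypothetical, and rows are assumed to be dictionaries keyed by the CSV column headers. With the layout described above, the collection would be at depth 0, the Newspaper Parent at 1, the Issue at 2, and each Page at 3.

```python
def hierarchy_depths(rows):
    """Map each row id to the number of parent_id links above it."""
    parent_of = {row["id"]: (row.get("parent_id") or "").strip() for row in rows}
    depths = {}
    for row_id in parent_of:
        depth, current, seen = 0, row_id, set()
        while parent_of[current]:
            if current in seen:
                raise ValueError(f"row {row_id}: parent_id values form a loop")
            seen.add(current)
            current = parent_of[current]
            if current not in parent_of:
                raise ValueError(f"row {row_id}: parent_id {current} matches no id")
            depth += 1
        depths[row_id] = depth
    return depths


rows = [
    {"title": "Sample Collection", "id": "1", "parent_id": ""},
    {"title": "Connecticut Western News (Newspaper)", "id": "2", "parent_id": "1"},
    {"title": "Connecticut Western News Vol. 1 No. 7", "id": "3", "parent_id": "2"},
    {"title": "Connecticut Western News Vol. 1 No. 7 - page 1", "id": "4", "parent_id": "3"},
]
print(hierarchy_depths(rows))
```

A parent_id that matches no id in the file, or a loop between rows, raises ValueError instead of producing a depth.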

...

If, however, your Newspaper Parent is already present on your site, you can use field_member_of to identify the parent using its node ID. In that case, the Newspaper Issue row would have nothing in the parent_id column, but would include the Newspaper Parent’s node ID in the field_member_of column. The Newspaper Page rows would have the Newspaper Issue’s id in the parent_id column. These rows of your CSV would look like this (assuming the Newspaper Parent’s node ID were “100”):

| title | id | parent_id | field_weight | field_member_of |
| Connecticut Western News Vol. 1 No. 7 | 1 | | | 100 |
| Connecticut Western News Vol. 1 No. 7 - page 1 | 2 | 1 | 1 | |
| Connecticut Western News Vol. 1 No. 7 - page 2 | 3 | 1 | 2 | |
| Connecticut Western News Vol. 1 No. 7 - page 3 | 4 | 1 | 3 | |
| Connecticut Western News Vol. 1 No. 7 - page 4 | 5 | 1 | 4 | |



Books

Configuring Books in your CSV is very similar to configuring Newspapers. The parent Book object will have child Page objects. 

...

Assuming your Book Parent node has not yet been created, and you’d like it to be contained within a collection whose node has also not yet been created, the table below shows how the two Book parts (and the collection) would be laid out in your CSV:


| title | id | parent_id | field_weight |
| Sample Collection | 1 | | |
| On the Tides at Malta (Book) | 2 | 1 | |
| On the Tides at Malta - page 1 | 3 | 2 | 1 |
| On the Tides at Malta - page 2 | 4 | 2 | 2 |
| On the Tides at Malta - page 3 | 5 | 2 | 3 |
| On the Tides at Malta - page 4 | 6 | 2 | 4 |
| On the Tides at Malta - page 5 | 7 | 2 | 5 |


The Book Pages must each have a field_weight assigned.
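
For books with many pages, the page rows follow a regular pattern and can be generated rather than typed. The following is a small sketch under that assumption: the function name book_page_rows is hypothetical, and the title pattern and consecutive ids simply mirror the example rows above.

```python
def book_page_rows(book_title, book_id, page_count, first_page_id):
    """Build one CSV row per page, each with a parent_id and a field_weight."""
    return [
        {
            "title": f"{book_title} - page {n}",
            "id": str(first_page_id + n - 1),
            "parent_id": str(book_id),
            "field_weight": str(n),  # page order within the book
        }
        for n in range(1, page_count + 1)
    ]


for row in book_page_rows("On the Tides at Malta", 2, 5, 3):
    print(row)
```

The resulting dictionaries can be written out with csv.DictWriter alongside the collection and Book Parent rows.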

...

If, however, the collection into which you are ingesting the Book is already present on your site, you can use field_member_of to identify the parent using its node ID. In that case, the Book Parent would not have anything in the parent_id column, but would include the collection node ID in the field_member_of column. The Book Page rows would have the Book Parent’s id in the parent_id column. These rows of your CSV would look like this (assuming the collection node ID were “100”):

| title | id | parent_id | field_weight | field_member_of |
| On the Tides at Malta (Book) | 1 | | | 100 |
| On the Tides at Malta - page 1 | 2 | 1 | 1 | |
| On the Tides at Malta - page 2 | 3 | 1 | 2 | |
| On the Tides at Malta - page 3 | 4 | 1 | 3 | |
| On the Tides at Malta - page 4 | 5 | 1 | 4 | |
| On the Tides at Malta - page 5 | 6 | 1 | 5 | |


Compound Objects

Configuring Compound Objects in your CSV is very similar to configuring Newspapers and Books. The Compound Object will have a containing “parent object” as well as one or more children. Only the metadata of the containing parent object is visible on the front-end. It cannot have any media of its own, because that media would never be displayed.

...

Assuming you’d like the Compound Object to be contained within a collection whose node has also not yet been created, the table below shows how the two Compound Object parts (and the collection) would be laid out in your CSV:

| title | id | parent_id | field_weight |
| Sample Collection | 1 | | |
| Historic Western Mass (Compound Object) | 2 | 1 | |
| Amherst 1886 (Child Object 1) | 3 | 2 | 1 |
| Adams 1882 (Child Object 2) | 4 | 2 | 2 |


The child objects must each have a field_weight assigned.
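
A quick way to catch a missing weight before ingest is to list the children of a given parent that have an empty field_weight. This is a minimal sketch, not a Workbench feature: the function name children_missing_weight is hypothetical, and rows are assumed to be dictionaries keyed by the CSV column headers.

```python
def children_missing_weight(rows, parent_id):
    """Return the ids of rows under parent_id that have no field_weight."""
    return [
        row["id"]
        for row in rows
        if row.get("parent_id") == parent_id
        and not (row.get("field_weight") or "").strip()
    ]


rows = [
    {"title": "Historic Western Mass (Compound Object)", "id": "2", "parent_id": "1", "field_weight": ""},
    {"title": "Amherst 1886 (Child Object 1)", "id": "3", "parent_id": "2", "field_weight": "1"},
    {"title": "Adams 1882 (Child Object 2)", "id": "4", "parent_id": "2", "field_weight": ""},
]
print(children_missing_weight(rows, "2"))
```

In this sample the second child is reported because its field_weight cell was left blank on purpose.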

...

If, however, the collection into which you are ingesting the Compound Object is already present on your site, you can use field_member_of to identify the parent using its node ID. In that case, the Compound Object parent would not have anything in the parent_id column, but would include the collection node ID in the field_member_of column. The child object rows would have the Compound Object parent’s id in the parent_id column. These rows of your CSV would look like this (assuming the collection node ID were “100”):

| title | id | parent_id | field_weight | field_member_of |
| Historic Western Mass (Compound Object) | 1 | | | 100 |
| Amherst 1886 (Child Object 1) | 2 | 1 | 1 | |
| Adams 1882 (Child Object 2) | 3 | 1 | 2 | |



Check, Then Run

Always check that your configuration and spreadsheet are valid before running the ingest. Fortunately, Islandora Workbench makes this easy with the --check option:

...

Appendix B: Born-Digital i8 Theme Object View Configurations

| Object Type | Resource Type term | System Model term |
| Collection | Collection | Collection |
| Audio | Sound | Audio |
| Basic Image (jpg) | Still Image | Image |
| Binary | [any] | Binary |
| Book (parent of pages) | Collection | Paged Content |
| Compound Object | Collection | Compound Object |
| Large Image (tiff) | Still Image | Image |
| Newspaper (parent of issues) | Collection | Newspaper |
| Newspaper Issue (parent of pages) | Collection | Publication Issue |
| Newspaper Page | Text | Page |
| Page | Text | Page |
| PDF | [any] | Digital Document |
| Video | Moving Image | Video |