Comment: A little more progress on documentation for 7.x-1.11


Introduction

Islandora Scholar is a suite of modules designed to help Islandora function as an Institutional Repository (although some features are helpful in other use cases as well). It differs from other Islandora modules in the number of features it provides and in the large number of submodules it contains. It is helpful to think of Scholar as a kind of scholarly content solution pack because of the new content models it provides (citation & thesis), but it also differs from other solution packs in that it provides new functionality that may be used with other cmodels as well.

...

  • New citation & thesis content models
  • Sherpa/RoMEO integration
  • Creation of new objects from EndNote XML or RIS files exported from other systems
  • Creation of new objects from Digital Object Identifiers (DOIs) and PubMed IDs (PMIDs)
  • Suppression of objects from display, or disabling of viewing/downloading of particular datastreams
  • Exporting of collections of bookmarks as RIS, RTF or PDF files
  • User-selectable dynamic citation styling using Citation Style Language (CSL) files
  • Extra HTML metadata to assist with Google Scholar indexing

Requirements

This module requires the following external modules/libraries:

...

Please note that additional features are available through optional Scholar submodules. They must be enabled separately, and they have their own separate requirements as well.

Features of Scholar "Core"

While "Scholar Core" is not an official community term, it can be a helpful way to keep track of the numerous features Islandora Scholar offers. In this context, "core" features are those that come from the Islandora Scholar module itself as opposed to coming from one of the numerous submodules which can be enabled or disabled independently.

Citation & Thesis Content Models

The core Islandora Scholar module provides two new content models: the Citation Content Model (ir:citationCModel) and the Thesis Content Model (ir:thesisCModel). The Citation model is intended for general scholarly works, while the Thesis model is for handling electronic theses and dissertations (ETDs). While the datastream structure and default display of these two Content Models are nearly identical, they can be tweaked separately to accommodate the different use cases one may have for theses vs. other scholarly works, such as altered displays, collection configuration, or separate metadata forms.

Info
Namespaces

Please note that the citationCModel and thesisCModel are in the "ir:" namespace, as opposed to the standard "islandora" namespace that other cmodels use. Due to this nonstandard namespace, Islandora instances that make use of namespace restrictions will need to enable the "ir" namespace, both in Islandora and in Islandora Solr, in order for the citation & thesis content models to work.

Object Display

Citation and thesis objects have special components on their "landing page", including

  • a formatted citation (using a configurable Citation Style Language (CSL) style; defaults to APA)
  • a metadata "Details" tab that uses COinS metadata (may be disabled to use the Islandora Default metadata display, e.g. Islandora Solr Metadata)
  • optionally, a link to search for this object in Google Scholar
  • optionally, a link to search for this object in a library catalogue or discovery service
  • optionally, a tab containing the copyright and self-archiving policies of the journal according to Sherpa/RoMEO.

These features can be configured on the Scholar admin page, at Administration » Islandora » Solution Pack Configuration » Scholar (admin/islandora/solution_pack_config/scholar).

[Image: Islandora Scholar Configuration Screen]

Sherpa/RoMEO

Scholar Core includes Sherpa/RoMEO integration. Sherpa/RoMEO is a service which provides information about the copyright & self-archiving policies of various academic journals. To turn it on, check "Enable RoMEO attempts" in the Scholar admin menu at admin/islandora/solution_pack_config/scholar. When this feature is enabled, a "RoMEO" tab showing the journal policies pulled from Sherpa/RoMEO will appear on objects with the Citation Content Model which have a MODS identifier of type "issn". This can be used in staff workflows to help encourage self-archiving. Results of RoMEO queries are cached in the Drupal database, but are not saved into object metadata.

Without an API key, Sherpa/RoMEO provides up to 500 requests per day. With an API key, requests are unlimited. An API key can be entered in the Scholar admin menu at admin/islandora/solution_pack_config/scholar. Sherpa/RoMEO API registration is free of charge as of spring 2018.

Rights information from Sherpa/RoMEO is not copied into object metadata, nor into any datastream.  Instead, this is a quick link to the Sherpa/RoMEO information which can be used in staff workflows.

PDF.js Viewer

At the bottom of Scholar's admin menu there is an option to choose what viewer you would like to use for citation and thesis objects ("Select a viewer"). By default, no viewer is selected, meaning that citation and thesis object displays will show an image of the cover page of the PDF datastream for that object. Islandora Scholar can be configured to use the Islandora PDF.js module, which leverages Mozilla's PDF.js library to enable a JavaScript-powered window in the display of citation & thesis objects that allows a user to read the object's PDF datastream in-page without needing to download it.

To use the Islandora PDF.js module, you must first install it according to the installation instructions on the module's README page. Once installed, go to the Scholar admin menu and select it as your viewer. If you see "No viewers detected" under the "VIEWERS" section, this means that there is a problem with your installation of Islandora PDF.js or that you have not enabled it. 

Once enabled, you will see the PDF.js viewer window on citation & thesis object displays in place of the default static cover page image.

Importers

Scholar provides options for importing objects from various sources.

These are configured by enabling the following modules (each module enables import from a different source):

  • PMID Populator (imports metadata through PubMed's API)
  • RIS Populator (when creating a Citation object, allows upload of a Research Information Systems (RIS) formatted citation to prepopulate metadata)
  • DOI Populator (imports metadata through Crossref OpenURL)
  • EndNote XML Populator (when creating a Citation object, allows upload of an EndNote XML exported citation to prepopulate metadata)

When one of these modules is enabled and a user starts creating an object in Islandora, Islandora will display the option to "Prepopulate metadata from source". The user can choose a source and either upload the file or provide the identifier, and will then be shown a metadata input form prepopulated with metadata from the source.


Here is more detailed info on each source for metadata:


Standard Metadata Display (vs. COinS)

Islandora Scholar provides a special metadata display block for Citation and Thesis objects, using the COinS metadata standard. COinS is the standard that is used in OpenURL implementations, and is a standard set of terms for citation objects including "Title", "Authors", "Abstract", "Journal", "ISSN", etc.  The COinS metadata ("Details") block pulls this information from the MODS datastream according to a hard-coded transform. To customize the metadata display, select "Use Standard Metadata Display", set Islandora to use "Islandora Solr" as a metadata display, and configure a Solr metadata profile for Citation and/or Thesis objects. 
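To illustrate what COinS metadata looks like on a page, here is a minimal Python sketch that builds a COinS span for a journal article. It is not Scholar's hard-coded MODS transform: the function name coins_span and the input field names (title, journal, issn, date, authors) are assumptions made for this example. Only the span class Z3988 and the rft.* keys come from the OpenURL/COinS convention.

```python
from html import escape
from urllib.parse import urlencode


def coins_span(fields):
    """Build a COinS <span> for a journal article from a dict of metadata.

    `fields` uses illustrative keys (title, journal, issn, date, authors);
    these names are assumptions for this sketch, not Scholar's own.
    """
    pairs = [
        ("ctx_ver", "Z39.88-2004"),
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:journal"),
        ("rft.genre", "article"),
    ]
    # Map the illustrative field names onto OpenURL key/value keys.
    key_map = {
        "title": "rft.atitle",
        "journal": "rft.jtitle",
        "issn": "rft.issn",
        "date": "rft.date",
    }
    for name, key in key_map.items():
        if fields.get(name):
            pairs.append((key, fields[name]))
    # The rft.au key is repeated once per author.
    for author in fields.get("authors", []):
        pairs.append(("rft.au", author))
    # URL-encode the pairs, then HTML-escape the result for the title attribute.
    title_attr = escape(urlencode(pairs), quote=True)
    return f'<span class="Z3988" title="{title_attr}"></span>'


example = coins_span({
    "title": "An Example Article",
    "journal": "Journal of Examples",
    "issn": "1234-5678",
    "authors": ["Doe, Jane", "Roe, Richard"],
})
print(example)
```

Tools that understand COinS read the title attribute of such spans, which is why the display is limited to a fixed set of citation terms.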

The COinS display is "expanded by default", unlike the Default DC or Islandora Solr metadata display blocks.

Scholar can include "search links" on the display of Citation and Thesis objects (formatted URLs using metadata from the object, to direct the user to a Google Scholar search or a library catalogue/discovery layer search). The metadata used to create these links can be configured by providing an XPath, which will be applied to the MODS datastream. By default, it will use the Primary Search XPath (defaults to //mods:identifier[@type="doi"], which will find any DOI identifier present in the MODS record). If the Primary Search XPath does not find anything, it will fall back to use the Default Search XPath. If this does not provide a search string either, the object label will be used. 
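The fallback order described above (Primary Search XPath, then Default Search XPath, then the object label) can be sketched in Python as follows. This is an illustration of the logic only, not Scholar's implementation. The function name search_string and the example Default Search XPath are assumptions. Python's standard-library ElementTree needs a relative path, so the XPaths are written with a leading ".//" instead of "//".

```python
import xml.etree.ElementTree as ET

MODS_NS = {"mods": "http://www.loc.gov/mods/v3"}

# Relative form of the documented default Primary Search XPath.
PRIMARY_XPATH = ".//mods:identifier[@type='doi']"
# Hypothetical Default Search XPath, chosen only for this example.
DEFAULT_XPATH = ".//mods:titleInfo/mods:title"


def search_string(mods_xml, label, primary_xpath=PRIMARY_XPATH,
                  default_xpath=DEFAULT_XPATH):
    """Return the first non-empty match, trying primary, then default, then label."""
    root = ET.fromstring(mods_xml)
    for xpath in (primary_xpath, default_xpath):
        for element in root.findall(xpath, MODS_NS):
            text = (element.text or "").strip()
            if text:
                return text
    # Neither XPath produced a search string, so fall back to the object label.
    return label


mods = (
    '<mods xmlns="http://www.loc.gov/mods/v3">'
    "<titleInfo><title>Example Title</title></titleInfo>"
    '<identifier type="doi">10.1000/example</identifier>'
    "</mods>"
)
print(search_string(mods, "Object label"))
```

With the sample record the DOI is returned. If the identifier element is removed, the title is used. If neither is present, the object label is used.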

PDF Viewer

Citation and Thesis objects can be associated with PDF datastreams (by default, the datastream ID is 'PDF', which differs from the standard Islandora use of 'OBJ' for the payload, or 'preservation master'). The PDF datastream can be displayed in an in-browser JavaScript viewer. The Islandora PDF.js module, which leverages Mozilla's PDF.js library, is one such viewer (and the only officially supported PDF viewing module as of 2018). Whether or not to use a viewer can be configured at the bottom of the Scholar admin page, in the "VIEWERS" section. If you see "No viewers detected", this means that there is a problem with your installation of Islandora PDF.js or that you have not enabled it. If no viewer is selected, citation and thesis object displays will show an image of the cover page of the PDF datastream for that object.


CSL (Citation Style Language)

(To come)

Importers and Populators

Scholar provides options for importing objects from various sources. These are provided by the following Scholar submodules:

  • PMID (Populator/Importer)  - Given a PubMed ID, imports metadata through the PubMed API
  • DOI (Populator/Importer) - Given a Crossref DOI, imports metadata through the Crossref API
  • RIS (Populator/Importer) - Parses citations from a Research Information Systems (RIS) formatted file
  • EndNote XML (Populator/Importer) - Parses citations from an EndNote XML formatted file
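
For readers unfamiliar with RIS, the format is a plain-text list of tagged lines: each record starts with a TY line and ends with an ER line. The Python sketch below shows one way such a file could be split into records. It is a simplified illustration, not the parser used by the RIS submodule. The function name parse_ris is an assumption, and continuation lines are ignored.

```python
import re

# An RIS line is a two-character tag, two spaces, a hyphen, then the value.
RIS_LINE = re.compile(r"^([A-Z][A-Z0-9])  - ?(.*)$")


def parse_ris(text):
    """Parse RIS text into a list of records (dicts mapping tag -> list of values)."""
    records = []
    current = None
    for line in text.splitlines():
        match = RIS_LINE.match(line.strip("\ufeff\r\n"))
        if not match:
            continue  # skip blank lines and anything that is not a tag line
        tag, value = match.group(1), match.group(2).strip()
        if tag == "TY":
            # TY opens a new record and carries the reference type.
            current = {"TY": [value]}
        elif tag == "ER":
            # ER closes the current record.
            if current is not None:
                records.append(current)
            current = None
        elif current is not None:
            # Tags such as AU may repeat, so values are collected in lists.
            current.setdefault(tag, []).append(value)
    return records


sample = (
    "TY  - JOUR\n"
    "AU  - Doe, Jane\n"
    "AU  - Roe, Richard\n"
    "TI  - An Example Article\n"
    "SN  - 1234-5678\n"
    "ER  - \n"
)
for record in parse_ris(sample):
    print(record["TI"][0], record["AU"])
```

A file with several TY/ER blocks yields several records, which corresponds to an Importer creating one object per citation.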


Info

A Populator is used when creating a single object, and will pre-fill a metadata form, allowing you to edit the data before ingesting. An Importer will automatically create one or more objects in a batch.

To use a Populator, follow the normal workflow to add an object to a collection. After selecting a content model, "Prepopulate metadata from source" will display an option for each installed populator module. If one is selected, the user will be able to upload the file or provide the identifier, and then will be shown a metadata input form prepopulated with metadata from the source.

Be aware that some metadata from the source might not display in the form, but might be saved in the object's MODS datastream regardless. The transforms from source to MODS are defined by each populator/importer module, but the form that displays can be configured by an Islandora administrator. If the form is missing fields for certain MODS elements, such as those within <mods:relatedItem type="host">, that information may not show up in the form that the user sees, but will still be saved in MODS. Because Populator modules can be used to create objects of any content model, this is most likely to happen when an object is created with a form that was not designed for citations.

[Image: Populator options during the ingest workflow]

To use an Importer, log in and browse to a Collection. Click the "Manage" tab, then the "Collection" sub-tab, then click "+ Batch Import Objects". You will get a drop-down menu with an option to use the "DOI Importer" and any other importers that are enabled.

[Image: Importer options during Batch Ingest]

The RIS Importer and EndNote XML Importer submodules allow users to take citation data files exported from other sources (such as RefWorks, EndNote or Zotero) and turn them into Islandora objects using the standard Islandora importer interface (similar to using the zip importer). The DOI Importer and PMID Importer submodules work in much the same way, but instead of using files exported by other citation managers they use Digital Object Identifier (DOI) or PubMed ID (PMID) strings.


DOI Populator:  Registration with Crossref required.

...

For more information about Crossref Open URL or to register, click here.


Batch Import

Batch Import of DOIs:

When the DOI Importer is enabled, a new option will be available in each Collection's "Manage" tab. When you log in, browse to a Collection, click the "Manage" tab, and then click "Batch Import Objects", you will get a drop-down menu with an option to use the "DOI Importer". If you select the DOI Importer, you can either upload a .txt file of DOIs or enter the DOIs into a form field.

...

In order to place embargoes on objects, you must grant the "Manage embargo on any objects" permission to the appropriate user roles under the Islandora Scholar Embargo section of admin/people/permissions. This will allow a user to place object and datastream embargoes on objects, as well as remove or update datastream embargoes. Due to issues with the connection between Drupal permissions and the way Scholar Embargo implements its embargo policies through XACML, currently only the object's owner will be able to remove or update object embargoes. This is because the XACML policy works independently of Drupal permissions, and the embargo XACML policy will block everyone except the object's owner from managing the object at all once embargoed regardless of what Drupal permissions they have. A fix for this is in the works and should be part of the Islandora 7.x-1.8 release.

PDF Configuration

Like the PDF solution pack, Scholar supports extracting searchable text from uploaded PDFs. Text extraction relies on the PDF having an embedded text stream (e.g. a born-digital PDF, or one with OCR already performed and embedded). Scholar will not generate OCR from raw scans. Consider converting text-filled images with no text streams to TIFFs, and using the Book Solution Pack with OCR enabled.

...