DSpaceDirect KnowledgeBase



  • DSpace provides an "Advanced Policy Manager" (also known as the "item wildcard policy admin tool"), which allows an Administrative user to perform bulk permissions changes to all of the Items or Files (bitstreams) within a specified Collection.
    • For more information on permissions / policy settings in general, please also refer to the section "Individual Item Permissions Changes" above.
  • WARNING: The Advanced Policy Manager is a "beta-level" tool. It works, but it is not the most user-friendly page in DSpace. It is also not the smartest tool, so you sometimes need to take several steps to make the changes you want.
  • How-To: Batch Permissions Changes basics
    • Sample Use Case: The easiest way to explain how this tool works is via a common use case. Suppose that you have a Collection of open access Items (viewable/readable by anyone in the world) which you now want to restrict so that they are viewable only by a group of users called "On Campus Users". Here are the steps you would take to make that change:
      • Log in to your site as an Administrator

      • Under the "Administrative" side menu, click on "Authorizations" (under Access Control submenu)

      • Just under the box at the top of the page, click the link that says "Click here to go to the item wildcard policy admin tool"

      • Step 1: Remove existing metadata access rights for all Items in the specified Collection.   To do so, fill out the form as follows:

        • Description: (optional, usually is left blank as it's only really useful for bulk changes to embargo)

        • Group: (leave blank in this case as you will remove any existing permissions)

        • Action: READ (you want to remove "READ" access)

        • Content Type: item (you want to remove READ access on an Item level – this controls metadata access)

        • Collection: [select the collection]

        • Start Date: (leave blank - this is only useful for bulk-changes to embargo dates)

        • End Date: (leave blank - this is only useful for bulk-changes to embargo dates)

        • CLICK the "Clear Policies" button

        • (NOTE: Even though you get no confirmation screen, the changes will be immediately applied)
      • In Step #1, all we've done is remove READ access to the Item metadata, so the metadata is now visible (readable) only to Administrators. However, the content files within those Items are still accessible, for example to anyone who had bookmarked a file's URL.
      • Step 2: Remove existing content file access rights for all Items in the specified Collection.  To do so, fill out the form as follows:

        • Description: (optional, usually is left blank as it's only really useful for bulk changes to embargo)

        • Group: (leave blank in this case as you will remove any existing permissions)

        • Action: READ (you want to remove "READ" access)

        • Content Type: bitstream (you want to remove READ access on the files, or bitstreams)

        • Collection: [select the collection]

        • Start Date: (leave blank - this is only useful for bulk-changes to embargo dates)

        • End Date: (leave blank - this is only useful for bulk-changes to embargo dates)

        • CLICK the "Clear Policies" button

        • (NOTE: Even though you get no confirmation screen, the changes will be immediately applied)
      • In Step #2, we've also removed access to the Item files. This means that only Administrators can now access/download any files associated with the Items. Now we need to assign NEW permissions for our "On Campus Users" group in the following two steps.
      • Step 3: Give the "On Campus Users" group access to all metadata for all Items in the specified Collection. To do so, fill out the form as follows:

        • Description: (optional, usually is left blank as it's only really useful for bulk changes to embargo)

        • Group: Select the "On Campus Users" group

        • Action: READ (you want to add "READ" access to the selected group)

        • Content Type: item (you want to add READ access on Items)

        • Collection: [select the collection]

        • Start Date: (leave blank - this is only useful for bulk-changes to embargo dates)

        • End Date: (leave blank - this is only useful for bulk-changes to embargo dates)

        • CLICK the "Add Policies" button

        • (NOTE: Even though you get no confirmation screen, the changes will be immediately applied)
      • In Step #3, we've now given the "On Campus Users" group the ability to read the metadata for all Items in this collection. So, the final step is to also give them the ability to read/download files associated with these Items.
      • Step 4: Finally, give the "On Campus Users" group access to all files for all Items in the specified Collection. To do so, fill out the form as follows:

        • Description: (optional, usually is left blank as it's only really useful for bulk changes to embargo)

        • Group: Select the "On Campus Users" group

        • Action: READ (you want to add "READ" access to the selected group)

        • Content Type: bitstream (you want to add READ access on all files, or bitstreams)

        • Collection: [select the collection]

        • Start Date: (leave blank - this is only useful for bulk-changes to embargo dates)

        • End Date: (leave blank - this is only useful for bulk-changes to embargo dates)

        • CLICK the "Add Policies" button

        • (NOTE: Even though you get no confirmation screen, the changes will be immediately applied)
      • At the end of this process, all the Items (and their Files) in the selected Collection will now only be accessible to users who belong to your "On Campus Users" group.  Other non-Administrative users will be presented with an Access Restricted message.
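The effect of the four steps above can be illustrated with a small in-memory model. This is only a sketch for reasoning about the order of operations: it is not DSpace code, and the function names (clearPolicies, addPolicies, canRead), the object shapes, and the sample IDs are all hypothetical. It assumes that the open access Items start with a READ policy for the "Anonymous" group and that "Clear Policies" with a blank Group removes every policy for the chosen Action, as the steps describe.

```javascript
// Hypothetical model of resource policies; NOT the DSpace API.
// Each object (item or bitstream) carries a list of {group, action} policies.
const collection = [
  { type: 'item', id: 'item-1', policies: [{ group: 'Anonymous', action: 'READ' }] },
  { type: 'bitstream', id: 'file-1', policies: [{ group: 'Anonymous', action: 'READ' }] },
];

// Models the "Clear Policies" button with a blank Group: drop every policy
// for the given action on all objects of the given content type.
function clearPolicies(objects, contentType, action) {
  for (const obj of objects) {
    if (obj.type === contentType) {
      obj.policies = obj.policies.filter((p) => p.action !== action);
    }
  }
}

// Models the "Add Policies" button: grant the action to one group
// on all objects of the given content type.
function addPolicies(objects, contentType, action, group) {
  for (const obj of objects) {
    if (obj.type === contentType) {
      obj.policies.push({ group: group, action: action });
    }
  }
}

// True if any of the user's groups has a READ policy on the object.
function canRead(obj, userGroups) {
  return obj.policies.some(
    (p) => p.action === 'READ' && userGroups.includes(p.group)
  );
}

clearPolicies(collection, 'item', 'READ'); // Step 1
// After Step 1 alone, the file is still readable by anonymous users.
const fileStillOpenAfterStep1 = canRead(collection[1], ['Anonymous']);

clearPolicies(collection, 'bitstream', 'READ'); // Step 2
addPolicies(collection, 'item', 'READ', 'On Campus Users'); // Step 3
addPolicies(collection, 'bitstream', 'READ', 'On Campus Users'); // Step 4
```

In this model, stopping after Step 1 leaves the files open, which is why Step 2 is needed before the new group is granted access in Steps 3 and 4.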

Basic Link Checker

DSpace provides a Basic Link Checker as part of the system Curation Tasks. This can be used for small collections as follows:

  • Log in as a DSpace Administrator
  • Under the Administrative menu, select Curation Tasks
  • Enter the Handle of the Community, Collection, or Item to check
  • Select "Check Links in Metadata" in the Task menu and choose Perform.

As noted above, this will work fine for individual items or small collections. Larger collections will likely take too long to run and will result in an error.

Checking with Exported Metadata

Links can also be checked by exporting the site metadata and using an external process to verify them. An example of an external process using Google Sheets follows:

  • First, export the metadata to be checked
    • Log in as a DSpace Administrator
    • Select the Community or Collection to be checked
    • In the Context menu select "Export Metadata" and save the resulting CSV file
  • Links can be checked using Google Sheets and a simple script
    • Open Google Drive and drag the CSV file into an appropriate folder (this uploads the file)
    • Right-click on the file and select "Open with > Google Sheets"
    • Find the metadata column with links to be checked (often this is dc.identifier.uri[]). You may choose to hide other columns to simplify the view.
    • Select Tools → Script Editor (in current versions of Google Sheets, this is Extensions → Apps Script)
    • Replace the default script with this code: 

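      Code Block
      /**
       * Returns the HTTP status code for each URL in a cell.
       * Multiple URLs in one cell must be separated by "||",
       * which is how the DSpace metadata export joins repeated values.
       */
      function getStatusCodes(urlset) {
        // Empty cell: nothing to check.
        if (urlset === '' || urlset == null) {
          return '';
        }

        var urls = String(urlset).split('||');
        var responseCodes = [];

        for (var i = 0; i < urls.length; i++) {
          responseCodes.push(getStatusCode(urls[i]));
        }

        // Comma-separated list of codes, in the same order as the URLs.
        return responseCodes.join();
      }

      /** Returns the HTTP status code for a single URL. */
      function getStatusCode(url) {
        var options = {
          // Return 4xx/5xx responses instead of throwing an exception.
          'muteHttpExceptions': true
        };

        var response = UrlFetchApp.fetch(url.trim(), options);
        return response.getResponseCode();
      }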


      • This code includes two functions. The getStatusCodes function expects the value of a single cell containing one or more URLs separated by "||". The getStatusCode function expects only a single URL. Which of these you use depends on whether the metadata column you need to check has one URL or multiple URLs in each row. If in doubt, use getStatusCodes, as it will work for one or more URLs.

    • Save the script with File → Save
      • You may be asked by Google to provide permissions to access your spreadsheet at this point. You will need to grant these permissions.
    • Back in your Google Sheets file, select an empty column that will be used for the script results. In the first row with data (usually row 2), add the following to the cell, replacing "Y2" with the ID of the cell that contains the URL to be checked, then press Enter. (This is also where you can choose to use the getStatusCode function rather than getStatusCodes.)

      Code Block
      =getStatusCodes(Y2)


    • The result posted in the cell should be an HTTP response code. The success code is 200. A code of 404 means the page cannot be found. Other response codes are listed here: https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html. If something else goes wrong with the request, you may see an error in the cell instead of a response code.
    • Assuming this works properly for the first row, you can apply the function to all rows by either:
      • Selecting the cell where you placed the function, selecting the small box in the bottom right corner of the cell and dragging it down to all other cells
      • Or, selecting the cell where you placed the function, copying it (Edit → Copy), then selecting all rows in the column and pasting (Edit → Paste). This method works better if the number of rows is large.
    • Once you have a response code (or multiple response codes if there are multiple URLs) in each row you will be able to review the results looking for non-200 codes that may need further investigation.
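To make that final review easier, a small helper along the following lines could be added to the same script. This is a minimal sketch, not part of the original script: the function name flagNon200 and the "OK" / "CHECK:" labels are hypothetical, and it assumes the results column holds the comma-separated codes returned by getStatusCodes.

```javascript
/**
 * Hypothetical helper: summarises the comma-separated status codes
 * produced by getStatusCodes (for example "200,404").
 * Returns '' for an empty cell, 'OK' when every code is 200,
 * otherwise 'CHECK: ' followed by the codes that are not 200.
 */
function flagNon200(codes) {
  // Empty cell: nothing to report.
  if (codes === '' || codes === null || codes === undefined) {
    return '';
  }

  // String() also handles a cell value that arrives as a number.
  var problems = String(codes)
    .split(',')
    .map(function (code) { return code.trim(); })
    .filter(function (code) { return code !== '' && code !== '200'; });

  return problems.length === 0 ? 'OK' : 'CHECK: ' + problems.join(',');
}
```

If the response codes are in column Z, for example, entering =flagNon200(Z2) in another empty column and filling it down would let you sort or filter on the "CHECK" rows.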