Introduction

The following lists were defined shortly after the DfR Architecture Meeting, which was held Jan 3-5, 2012. They capture both the research needs and the software needs of the DfR project, relative to existing DuraCloud and Fedora software capabilities.

DfR Research Needs

  • (DFR-2) Shibboleth integration with:
    • DuraCloud REST APIs
    • DuraCloud and Management Console UI
    • DuraCloud clients
  • (DFR-1) File processing workflow: how to coordinate each step in the processing pipeline needed to create Fedora objects
    • Should take a look at Aaron Birkland's conversion pipeline, existing ETL tools, the FAST search and access tool, the Harvard work, DROID, etc.
  • (DFR-3) Verification that files synced to DuraCloud via local connections to Box.net/Dropbox/etc work as expected
  • (DFR-4) Determine which pieces of the Islandora content model need to be invoked for Islandora to provide full use of, and access to, Fedora objects that were not created via Islandora
  • (DFR-5) Determine Islandora Drupal to Fedora XACML security model

DfR Needs which overlap with DuraCloud needs

  • (DFR-74 -> DURACLOUD-525) Need an auto-user to be able to handle running multiple services on a schedule
  • (DURACLOUD-530, DURACLOUD-531) Services need to be maintained across reboots/upgrades
  • (DURACLOUD-582) Sync tool to create a transfer log of all files moved to DuraCloud (with checksums, in a format that can be run through the bit integrity checker)
  • (DFR-75 -> DURACLOUD-522, DURACLOUD-627) Sync and Retrieval tools to handle file encryption (assumes Encryption Model 2; location/origination of key TBD)
  • (DURACLOUDPRIV-135) Update the Management Console to be able to spin up auxiliary instances (for Fedora, etc.) along with the DuraCloud instance
  • (DURACLOUDPRIV-137) Create REST API for Management Console
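
As an illustration of the transfer log item (DURACLOUD-582) above, the following is a minimal sketch of a log writer. It assumes MD5 checksums and a tab-separated "content ID, checksum" line format; the format actually expected by the bit integrity checker is not specified here, and the function names `md5_of` and `write_transfer_log` are hypothetical.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_transfer_log(root, log_path):
    """Write one 'content-id<TAB>md5' line per file found under root.

    The line format is an assumption made for this sketch, not the
    format required by any existing DuraCloud tool.
    """
    root = Path(root)
    entries = []
    # Collect all entries before opening the log, in sorted order.
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        content_id = path.relative_to(root).as_posix()
        entries.append((content_id, md5_of(path)))
    with open(log_path, "w", encoding="utf-8") as log:
        for content_id, checksum in entries:
            log.write(f"{content_id}\t{checksum}\n")
    return entries
```

A real sync tool would presumably append to the log as each transfer completes rather than walking the directory afterwards; the walk here only keeps the sketch self-contained.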

DfR Needs which are not otherwise directly applicable to DuraCloud

  • (DFR-6) Management Console, be able to create a DfR account
  • (DFR-7) Split apart sync tool to produce:
    1. A UI which collects all of the necessary parameters (currently command line and Swing versions)
    2. A piece to discover changes
    3. A piece that pushes content to DuraCloud - this would be considered the sharable SDK
  • (DFR-8) Sync tool should be provided with a unique ID, and should include that ID as a property on all items synced
  • (DFR-9) Sync tool to allow for collection of additional metadata and store as a manifest in DuraCloud
  • (DFR-10) Integrate external auth (e.g. Shibboleth), based on research findings, into DuraCloud at: REST API, DurAdmin, and client utilities (e.g. sync tool)
    • Consider whether it makes sense for us to host an IdP server backing a Shibboleth ID manager, for users who do not want to hook up to an existing provider (i.e. all DuraCloud users end up in Shib)
  • (DFR-11) Create an object creation service as a DuraCloud web app service (as a strawman); it should include a basic pipeline and a default object creation stage
  • (DFR-33) Create UI for researcher to set up Sync tool
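
The sync tool split (DFR-7) and the sync tool ID property (DFR-8) above could fit together roughly as follows. This is a sketch only: the parameter object stands in for whatever the UI collects, change discovery is reduced to comparing modification times, and the `upload` callback, the `sync-tool-id` property name, and all other names are hypothetical rather than part of the existing sync tool.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class SyncParams:
    """Piece 1: the parameters a UI (command line or Swing) would collect."""
    source_dir: str
    space_id: str
    sync_tool_id: str  # unique ID assigned to this sync tool (DFR-8)


def discover_changes(current: Dict[str, float],
                     previous: Dict[str, float]) -> List[str]:
    """Piece 2: return paths that are new or whose mtime has changed."""
    return sorted(p for p, mtime in current.items()
                  if previous.get(p) != mtime)


def push_changes(params: SyncParams,
                 changed: List[str],
                 upload: Callable[[str, str, Dict[str, str]], None]) -> int:
    """Piece 3: push each changed item, tagging it with the sync tool ID.

    `upload(space_id, path, properties)` is a placeholder for whatever
    client call actually stores content in DuraCloud.
    """
    for path in changed:
        upload(params.space_id, path, {"sync-tool-id": params.sync_tool_id})
    return len(changed)
```

Keeping the push piece free of UI and discovery concerns is what would let it be shared as an SDK, with other tools supplying their own list of changed items.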

DfR CloudSync needs

  • (CLOUDSYNC-25) Need to add a real one-way sync action. Would be able to determine changes in the source and move them to the destination on an ongoing basis
  • (CLOUDSYNC-10) Need to be able to initialize CloudSync via REST calls
  • (Done) Need to be able to get status/reports to ensure that the service is running continually (this should be possible through existing GET calls)
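
To make the one-way sync item (CLOUDSYNC-25) above more concrete, here is a minimal sketch of a single sync pass. It assumes each side can be summarized as a mapping from object ID to a version marker (such as a last-modified date), and that objects missing from the source should be removed from the destination; neither assumption comes from CloudSync itself, and the function names are hypothetical.

```python
def plan_one_way_sync(source, destination):
    """Return (to_copy, to_delete) so that destination will mirror source.

    Both arguments map object ID -> version marker. Whether deletions
    should propagate at all is an assumption of this sketch.
    """
    to_copy = sorted(k for k, v in source.items() if destination.get(k) != v)
    to_delete = sorted(k for k in destination if k not in source)
    return to_copy, to_delete


def apply_plan(source, destination, plan):
    """Return a new destination mapping with the plan applied."""
    to_copy, to_delete = plan
    result = {k: v for k, v in destination.items() if k not in to_delete}
    for key in to_copy:
        result[key] = source[key]
    return result
```

Running the pass repeatedly gives the "ongoing basis" behavior: once the destination matches the source, the next plan is empty and nothing is transferred.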