Introduction
The following lists were defined shortly after the DfR Architecture Meeting, which was held Jan 3-5, 2012. These lists provide both research needs and software needs from the perspective of the DfR project, in relation to existing DuraCloud and Fedora software capabilities.
DfR Research Needs
- (DFR-2) Shibboleth integration with:
- DuraCloud REST APIs
- DuraCloud and Management Console UI
- DuraCloud clients
- (DFR-1) File processing workflow: how to coordinate each step in the processing pipeline needed to create Fedora objects
- Should take a look at Aaron Birkland's conversion pipeline, existing ETL tools, the FAST search and access tool, the Harvard work, DROID, etc.
- (DFR-3) Verification that files synced to DuraCloud via local connections to Box.net/Dropbox/etc. work as expected
- (DFR-4) Determine which pieces of the Islandora content model need to be invoked for Islandora to provide full use of and access to Fedora objects that were not created via Islandora
- (DFR-5) Determine the Islandora Drupal-to-Fedora XACML security model
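As a starting point for the DFR-1 research, the coordination problem can be pictured as an ordered list of stages, each taking an item and returning an enriched copy. The sketch below is only an illustration under that assumption; the stage names (identify_format, build_object_stub) and the dictionary layout are hypothetical and are not taken from DuraCloud, Fedora, or any of the tools listed above.

```python
from pathlib import PurePosixPath
from typing import Callable, Dict, List

# A stage takes an item (a plain dict) and returns a new, enriched dict.
Item = Dict[str, object]
Stage = Callable[[Item], Item]


def identify_format(item: Item) -> Item:
    """Hypothetical stage: guess a format from the file extension only."""
    suffix = PurePosixPath(str(item["content_id"])).suffix
    return {**item, "format": suffix.lstrip(".").lower() or "unknown"}


def build_object_stub(item: Item) -> Item:
    """Hypothetical stage: assemble a placeholder for a repository object."""
    stub = {"label": item["content_id"], "format": item["format"]}
    return {**item, "object": stub}


def run_pipeline(item: Item, stages: List[Stage]) -> Item:
    """Apply each stage in order, passing the result to the next stage."""
    for stage in stages:
        item = stage(item)
    return item


if __name__ == "__main__":
    result = run_pipeline(
        {"content_id": "data/readings.CSV"},
        [identify_format, build_object_stub],
    )
    print(result["object"])
```

Keeping each stage a plain function makes it easy to swap in a real format identifier (DROID, for example) or a different object creation stage without touching the coordinator.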
DfR Needs which overlap with DuraCloud needs
- (DFR-74 -> DURACLOUD-525) Need an auto-user to be able to handle running multiple services on a schedule
- (DURACLOUD-530, DURACLOUD-531) Services need to be maintained across reboots/upgrades
- (DURACLOUD-582) Sync tool to create a transfer log of all files moved to DuraCloud (with checksums, in a format that can be run through the bit integrity checker)
- (DFR-75 -> DURACLOUD-522, DURACLOUD-627) Sync and Retrieval tools to handle file encryption (assumes Encryption Model 2; location/origination of key TBD)
- (DURACLOUDPRIV-135) Update management console to be able to spin up auxiliary instances (for Fedora, etc.) along with the DuraCloud instance
- (DURACLOUDPRIV-137) Create REST API for Management Console
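To make the transfer log item (DURACLOUD-582) more concrete, here is a minimal sketch of what such a log could contain. It assumes one tab-separated line per file holding a content ID and an MD5 checksum, with the content ID taken from the path relative to the synced directory. Both the log layout and the function names are assumptions for illustration, not the sync tool's actual format or the input format of the bit integrity checker.

```python
import hashlib
import tempfile
from pathlib import Path


def md5_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_transfer_log(root: Path, log_path: Path) -> int:
    """Write one 'content-id<TAB>md5' line per file found under root.

    Assumptions: the content ID is the path relative to root, and
    log_path lies outside root so the log does not list itself.
    Returns the number of entries written.
    """
    count = 0
    with log_path.open("w", encoding="utf-8") as log:
        for path in sorted(p for p in root.rglob("*") if p.is_file()):
            content_id = path.relative_to(root).as_posix()
            log.write(f"{content_id}\t{md5_of(path)}\n")
            count += 1
    return count


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "data"
        root.mkdir()
        (root / "a.txt").write_bytes(b"hello")
        log_file = Path(tmp) / "transfer.log"
        write_transfer_log(root, log_file)
        print(log_file.read_text(), end="")
```

A real implementation would record the checksum at the moment of transfer rather than in a separate pass, so that the log reflects exactly the bytes that were sent.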
DfR Needs which are not otherwise directly applicable to DuraCloud
- (DFR-6) Management Console: be able to create a DfR account
- (DFR-7) Split apart sync tool to produce:
- A UI which collects all of the necessary parameters (currently command line and Swing versions)
- A piece to discover changes
- A piece that pushes content to DuraCloud - this would be considered the sharable SDK
- (DFR-8) Sync tool should be provided with a unique ID, and should include that ID as a property on all items synced
- (DFR-9) Sync tool to allow for collection of additional metadata and store as a manifest in DuraCloud
- (DFR-10) Integrate external auth (e.g. Shibboleth) into DuraCloud, based on research findings, at: REST API, DurAdmin, and client utilities (e.g. sync tool)
- Consider whether it makes sense for us to provide an IdP server to support a Shibboleth ID manager that we host for users who do not want to hook up to an existing provider (i.e. all DuraCloud users end up in Shib)
- (DFR-11) Create an object creation service as a DuraCloud web app service (as a strawman); it should include a basic pipeline and a default object creation stage
- (DFR-33) Create UI for researcher to set up Sync tool
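The sync tool split (DFR-7) and the unique ID requirement (DFR-8) can be sketched together as three loosely coupled pieces: a configuration object that any UI would fill in, a change discovery function, and a pusher that attaches the sync tool ID to every item. All names below (SyncConfig, discover_changes, ContentPusher, the "sync-tool-id" property key) are hypothetical, and the upload function is injected rather than calling any real DuraCloud client API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Signature of the injected uploader: (space_id, content_id, properties).
Uploader = Callable[[str, str, Dict[str, str]], None]


@dataclass
class SyncConfig:
    """Parameters that a UI (command line, Swing, or other) would collect."""
    space_id: str
    sync_tool_id: str  # unique ID assigned to this sync tool (DFR-8)


def discover_changes(previous: Dict[str, str], current: Dict[str, str]) -> List[str]:
    """Return content IDs that are new or whose checksum has changed.

    Both arguments map content ID to checksum. Deletions are ignored here.
    """
    return sorted(
        cid for cid, checksum in current.items() if previous.get(cid) != checksum
    )


@dataclass
class ContentPusher:
    """The piece that pushes content; a candidate for the sharable SDK."""
    config: SyncConfig
    upload: Uploader

    def push(self, content_ids: List[str]) -> None:
        """Upload each item, tagging it with the sync tool's unique ID."""
        for cid in content_ids:
            properties = {"sync-tool-id": self.config.sync_tool_id}
            self.upload(self.config.space_id, cid, properties)


if __name__ == "__main__":
    sent = []
    pusher = ContentPusher(
        SyncConfig(space_id="space-1", sync_tool_id="tool-42"),
        upload=lambda space, cid, props: sent.append((space, cid, props)),
    )
    pusher.push(discover_changes({"a": "1"}, {"a": "1", "b": "2"}))
    print(sent)
```

Because the pusher depends only on the configuration object and an uploader function, it can be reused by other clients without pulling in either UI or the change discovery logic.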
DfR CloudSync needs
- (CLOUDSYNC-25) Need to add a real one-way sync action. It would determine changes in the source and move them to the destination on an ongoing basis
- (CLOUDSYNC-10) Need to be able to initialize CloudSync via REST calls
- Need to be able to get status/reports to ensure that the service is running continually (this should be possible through existing GET calls)