
...

Research
Researcher
Research Materials
Dataset Data
Subset
Selection
Search
Analysis
Correlation
Service Administrator
Account
Access
Deposit
Copy
Research Materials
Research Information Life Cycle

...

DfR has the opportunity to reduce the impact of the data deluge on researchers — enabling them to concentrate on their real work — by automating many of the rote activities for handling digital research material. But DfR also has the opportunity to do much more since "value added" services can easily operate on data when copies reside in a Cloud. It is the work of this project to analyze researcher and research institution needs — then build software and offer services to meet those needs.

...

Enterprise-class Cloud Backup

The first goal of DfR is to enable backup to a "safe" Cloud storage infrastructure that automatically copies research materials to a remote location while keeping them under the control of the researcher and/or institution. Remote copies serve as the "backup" for research materials, which matters most when only one copy exists on a computer "under the desk". Keeping the only copies in the same location does not protect you from losing your work to damage or destruction of the computer's storage or its secondary backup (disk or tape). The copy must be remote.

...

But the CONOPs for DfR imagines something more than backup. Having backups is very good but limiting. First, when the scale of the digital materials gets too large, current backup technology has problems: the backup media become massive and finding things is hard. In addition, digital materials need to be returned to operational storage if you want to work on older versions. The biggest problem with backups is that you can lose the organization of the materials. Usually, the only organization you can save is the directory structure, and it changes over time. DfR needs to capture and maintain the relationships between research materials. Also, with backups you may not be able to capture the metadata that helps you characterize the research materials.

One need identified by this project is that researchers want help "finding their stuff". This is particularly hard when the "stuff" is on a collaborator's or assistant's laptop.
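As a rough illustration of what capturing relationships and metadata, and then "finding stuff", could look like, here is a minimal Python sketch. It is not a DfR design: the `MaterialRecord` and `Catalog` names, the field names, the `derived_from` relation, and the sample project data are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class MaterialRecord:
    """One item of research material, described independently of its path."""
    material_id: str   # hypothetical stable identifier, not a file path
    location: str      # where a copy currently lives (machine and path)
    metadata: dict[str, str] = field(default_factory=dict)
    # (relation, target material_id) pairs, e.g. ("derived_from", "raw-001")
    relations: list[tuple[str, str]] = field(default_factory=list)


class Catalog:
    """In-memory index of records; a real service would persist this."""

    def __init__(self) -> None:
        self._records: dict[str, MaterialRecord] = {}

    def add(self, record: MaterialRecord) -> None:
        self._records[record.material_id] = record

    def find(self, **criteria: str) -> list[MaterialRecord]:
        """Return records whose metadata matches every given key/value."""
        return [
            r for r in self._records.values()
            if all(r.metadata.get(k) == v for k, v in criteria.items())
        ]

    def related(self, material_id: str, relation: str) -> list[MaterialRecord]:
        """Follow one kind of relationship from a record to its targets."""
        source = self._records[material_id]
        return [
            self._records[target]
            for rel, target in source.relations
            if rel == relation and target in self._records
        ]


# Made-up sample data: a dataset on an assistant's laptop and a figure
# derived from it on an office machine.
catalog = Catalog()
catalog.add(MaterialRecord("raw-001", "assistant-laptop:/data/run1.csv",
                           {"project": "soil-study", "kind": "dataset"}))
catalog.add(MaterialRecord("fig-007", "office-pc:/figs/yield.png",
                           {"project": "soil-study", "kind": "figure"},
                           [("derived_from", "raw-001")]))

datasets = catalog.find(project="soil-study", kind="dataset")
sources = catalog.related("fig-007", "derived_from")
```

Because each record is keyed by an identifier rather than a directory path, the relationship between the figure and its source dataset survives even if either file is moved or the directory structure changes.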

...

Please refer to the picture below. We cannot describe all of the capabilities imagined for DfR in one picture, so the picture below describes only a part of the automated safe copy backup capability. It is similar to other remote copy capabilities, but it is conceived as an open but trusted approach to creating a network of storage providers.

Research materials (notably data) usually originate in the researcher's "work space." This may be a laboratory, an office, an instrument, or papers plucked from a library. There are so many potential "work spaces" that we cannot enumerate them here. Research materials are also shared between researchers, research assistants and staff (often by thumb drive). Team members come and go, often taking the only copy of research materials with them. Commonly, the organization of the research material exists only in someone's head (at worst) or is managed by a piece of software (sometimes hard to figure out).

...

Thereafter, the storage provider network synchronizes your operational storage with their copy storage. The synchronization can be one way (to the provider or to you) or both ways, with or without versions, and automatic or on-demand. So when a laptop walks out the door, you can fetch back its monitored contents (but not the personal ones outside the plan).

To Do:

Additional Topics

  • Web of Trust, Policy, Policy Enforcement

...

  • Backup does not solve many requirements for the "data life cycle" (move that to a data life cycle section).

...

  • Institutional Repositories don't work well for research data and research workflow.

...

  • Storage Infrastructure

Summary User Stories

This section contains selected high-level user stories. These stories are not intended to cover the entire DfR service in detail; indeed, too much detail may obscure the core functions.

...

Services, Components and Aspects

...

Security

...

Object Creation Service

Service Execution Environment

Service and Component Interactions

...

Service and Component Interfaces

Technologies

  • Cloud

    • DuraCloud

    • Eucalyptus

    • Commercial Cloud

  • Box.net/Dropbox-like

  • Replication Services

    • SAM/QFS

  • Bus/Orchestration Services

    • Messaging/Notification

    • iRODS

    • Grid Workflow (Taverna, Kepler)

    • Apache Camel

  • Policy Framework

    • Access Security

    • Identity Services

  • Other Clients

    • Online Laboratory Notebooks

    • Virtual Research Environments