DuraCloud Storage

The primary DuraCloud application is called DuraStore; this is where storage is managed. DuraStore has a variety of storage adapters, each built to communicate with a specific storage system. All calls that work with content first pass through a thin mediation layer, which ensures consistency across storage providers, and are then handed to the providers themselves to complete the work.
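The adapter-plus-mediation arrangement described above can be sketched as follows. This is a minimal illustration only: the class names (StorageAdapter, InMemoryAdapter, MediationLayer), the method signatures, and the validation rule are all hypothetical and are not taken from the DuraStore code base.

```python
from abc import ABC, abstractmethod


class StorageAdapter(ABC):
    """Hypothetical adapter interface: one subclass per storage system."""

    @abstractmethod
    def put(self, space, content_id, data):
        ...

    @abstractmethod
    def get(self, space, content_id):
        ...


class InMemoryAdapter(StorageAdapter):
    """Stand-in for a real provider adapter, backed by a dict."""

    def __init__(self):
        self._items = {}

    def put(self, space, content_id, data):
        self._items[(space, content_id)] = data

    def get(self, space, content_id):
        return self._items[(space, content_id)]


class MediationLayer:
    """Applies the same checks to every call, then delegates to an adapter."""

    def __init__(self, adapters):
        self._adapters = adapters  # maps a store id to its adapter

    def put(self, store_id, space, content_id, data):
        # Assumed consistency rule: identifiers must be non-empty for any provider.
        if not space or not content_id:
            raise ValueError("space and content_id are required")
        self._adapters[store_id].put(space, content_id, data)

    def get(self, store_id, space, content_id):
        return self._adapters[store_id].get(space, content_id)
```

Because callers only ever talk to the mediation layer, supporting another storage system in this sketch means writing one more adapter subclass and registering it under a new store id.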

DuraCloud storage begins with spaces. Spaces are the containers into which content is placed, and they are also where access control is defined. Once a space has been created, content can be stored in it. All content stored through DuraCloud lands first in a primary storage provider and is then copied as needed to other providers. Every action on content takes place through the storage REST API, even when this is hidden behind client tools or UI controls.

Storage REST Interface

Space Actions

  • Add Space
  • Get Space Properties
  • Get Spaces List
  • Get Space Content List
  • Get/Set Space Access
  • Delete Space

Content Actions

  • Add Content
  • Get/Set Content Properties
  • Get Content
  • Copy Content
  • Delete Content

Other Actions

  • Get Stores
  • Get Tasks List
  • Perform Task
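As a rough illustration of how a client might turn a few of the actions listed above into HTTP requests, the sketch below maps action names to a method and a path. The HTTP methods, the path templates, the base URL, and the build_request helper are assumptions made for this example and are not specified on this page; the REST API documentation is the authority for the real calls.

```python
from urllib.parse import quote

# Action name -> (HTTP method, path template).
# These methods and templates are illustrative assumptions only.
ROUTES = {
    "add_space": ("PUT", "/{space}"),
    "get_space_content_list": ("GET", "/{space}"),
    "delete_space": ("DELETE", "/{space}"),
    "add_content": ("PUT", "/{space}/{content}"),
    "get_content": ("GET", "/{space}/{content}"),
    "delete_content": ("DELETE", "/{space}/{content}"),
}


def build_request(base_url, action, space, content=""):
    """Return (method, url) for an action, percent-encoding the identifiers."""
    method, template = ROUTES[action]
    path = template.format(
        space=quote(space, safe=""),
        # Content ids are allowed to keep "/" so they can look like paths.
        content=quote(content, safe="/"),
    )
    return method, base_url.rstrip("/") + path
```

For example, build_request("https://example.org/durastore", "get_content", "my space", "dir/file.txt") yields a GET request for https://example.org/durastore/my%20space/dir/file.txt.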

DuraCloud Instance

Wrapping around the storage system is a DuraCloud instance. This is the runtime system, which includes two other web applications in addition to DuraStore. One of these, DurAdmin, provides the web-based DuraCloud user interface. The other, DuraBoss, manages the generation of storage reports. Both operate by making calls through DuraStore. All three applications are hosted on a virtual server, which provides the DuraCloud application endpoints.

DuraCloud Mill

The DuraCloud Mill is the back-end system that handles much of the data processing for DuraCloud. The Mill uses a set of queues that hold every work task waiting to be performed. Each DuraCloud instance puts a task in one of the Mill queues every time a storage action occurs. This allows the Mill to keep track of all content in the DuraCloud system.
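The "one task per storage action" behavior can be sketched with an in-process queue. The Task fields, the Instance class, and the use of Python's queue.Queue are assumptions for illustration; the page does not say which queueing technology or task format the Mill uses.

```python
import queue
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """Hypothetical task record describing one storage action."""
    action: str       # "add" or "delete" in this sketch
    store_id: str
    space: str
    content_id: str


class Instance:
    """Toy DuraCloud instance that reports every storage action to a queue."""

    def __init__(self, mill_queue):
        self._queue = mill_queue
        self._content = {}

    def add_content(self, store_id, space, content_id, data):
        self._content[(store_id, space, content_id)] = data
        self._queue.put(Task("add", store_id, space, content_id))

    def delete_content(self, store_id, space, content_id):
        del self._content[(store_id, space, content_id)]
        self._queue.put(Task("delete", store_id, space, content_id))
```

Since every change produces a task, a consumer that replays the queue in order can rebuild a picture of what content exists without scanning the storage providers.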

A series of workers perform the jobs outlined in the work tasks. These workers can scale automatically based on the number of items in each queue, allowing the number of workers to increase and decrease as the amount of work to be done changes. The workers communicate with the storage providers, as well as with a database that is used to maintain system state.
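One simple way to express queue-driven scaling is a function that maps queue depth to a worker count. The thresholds and the desired_workers name below are invented for this sketch; the actual scaling rules used by the Mill are not described here.

```python
import math


def desired_workers(queue_depth, tasks_per_worker=100, min_workers=1, max_workers=20):
    """Return how many workers to run for a given number of queued tasks.

    All parameter defaults are illustrative assumptions, not Mill settings.
    """
    needed = math.ceil(queue_depth / tasks_per_worker)
    # Clamp so the pool never drops below the floor or exceeds the ceiling.
    return max(min_workers, min(max_workers, needed))
```

With these defaults, an empty queue keeps one worker running, 250 queued tasks call for three workers, and a very deep queue is capped at twenty.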

DuraCloud maintains a set of duplication policies that specify which spaces should be duplicated and to which providers. The workers use this information to perform the duplication actions needed to keep all content in sync between providers. As a secondary measure to ensure consistency, another part of the Mill, called the Duplication Producer, reads the duplication policies and adds a task to a task queue for each content item that should be duplicated. The workers use these tasks to verify that all content has in fact been duplicated as it should be.
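The producer and worker roles in duplication can be sketched as two small functions. The data shapes used here (a policy dict mapping a space to its target stores, and nested dicts standing in for storage providers) and both function names are hypothetical simplifications.

```python
def produce_duplication_tasks(policies, primary):
    """Yield (space, content_id, target) for every item a policy covers.

    policies: {space: [target_store_id, ...]}
    primary:  {space: {content_id: data}}
    """
    for space, targets in policies.items():
        for content_id in sorted(primary.get(space, {})):
            for target in targets:
                yield space, content_id, target


def run_duplication_task(task, primary, stores):
    """Verify one item in the target store, copying it if missing or different.

    Returns True if a copy was made, False if the target was already in sync.
    """
    space, content_id, target = task
    source = primary[space][content_id]
    dest_space = stores[target].setdefault(space, {})
    if dest_space.get(content_id) == source:
        return False
    dest_space[content_id] = source
    return True
```

In this sketch a space with no policy is never visited, and re-running the producer after everything is in sync results in tasks that all return False.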

Another type of producer, the Bit Integrity Producer, drives bit-level integrity checks on content. Much like the Duplication Producer, the Bit Integrity Producer adds a task to the task queue for each content item in the DuraCloud system. The workers execute those tasks by retrieving the content and verifying that the checksum computed for it matches both the checksum reported by the storage provider and the checksum recorded in the DuraCloud manifest.
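The check a worker performs for a bit integrity task can be sketched as below. The choice of MD5 as the default algorithm, the chunk size, and the function names are assumptions for illustration; this page says only that a checksum is computed and compared.

```python
import hashlib


def compute_checksum(stream, algorithm="md5", chunk_size=1 << 20):
    """Hash a binary stream in chunks so large content need not fit in memory."""
    digest = hashlib.new(algorithm)  # the algorithm choice is an assumption
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def verify_bit_integrity(stream, provider_checksum, manifest_checksum):
    """Return True only if all three checksums agree."""
    computed = compute_checksum(stream)
    return computed == provider_checksum and computed == manifest_checksum
```

Comparing against both the provider value and the manifest value means a mismatch in either place causes the check to fail, whether the content itself changed or one of the recorded checksums did.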
