...

  • Add Content
  • Get/Set Content Properties
  • Get Content
  • Copy Content
  • Delete Content

Other Actions

  • Get Stores
  • Get Tasks List
  • Perform Task

DuraCloud

...

Deployment

The DuraCloud applications are deployed via a cluster of instances, each running both the DuraCloud Storage (DuraStore) and DuraCloud UI (DurAdmin) applications. The cluster uses auto-scaling to set the number of servers based on load; a load balancer in front of the cluster distributes requests among the available servers. The DuraCloud UI application makes calls through the REST API, which in turn communicates with storage providers to manage data. Storage actions are sent to queues that are watched by the DuraCloud Mill for processing. The processed Mill data is stored in the Mill database, which the DuraCloud applications consult to display audit logs, space manifests, bit integrity reports, and other results of Mill processing.

Users and accounts in DuraCloud are maintained via the Management Console application. This application also runs in an auto-scaled cluster behind a load balancer. The Management Console interacts with a database to update and store account and user information. The DuraCloud storage and UI applications also consult the administrative database to retrieve account and user information. When updates occur in the Management Console, notifications are sent to ensure changes are picked up properly by the DuraCloud applications. Similar notifications are sent by the DuraCloud applications themselves on changes to ACLs and other security settings.

Wrapping around the storage system is a DuraCloud instance. This is the runtime system, which includes two other web applications. One of these applications, DurAdmin, provides the web-based DuraCloud user interface. The other, DuraBoss, manages the generation of storage reports. Both of these applications operate by making calls through DuraStore. All three of these applications live on a virtual server where they are hosted to provide the DuraCloud application endpoints.

 


DuraCloud Mill

The DuraCloud Mill is the back-end system that handles much of the data processing for DuraCloud. The Mill uses a set of queues in which all work tasks that need to be performed are placed. Each DuraCloud instance puts a task in one of the Mill queues every time a storage action occurs. These tasks are then processed by workers, resulting in updates to the audit and manifest details for each space. This allows the Mill to keep track of all content in the DuraCloud system.
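
The flow from storage action to manifest update can be illustrated with a minimal sketch. This is not the Mill's actual code: the `AuditTask` fields, the action names "add" and "delete", and the in-memory queue, manifest, and audit log are all assumptions made for illustration.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class AuditTask:
    """Hypothetical task shape; the real Mill task format may differ."""
    action: str        # assumed values: "add" or "delete"
    space_id: str
    content_id: str
    checksum: str = ""


def process_tasks(queue, manifest, audit_log):
    """Drain the queue, recording every task and updating the manifest."""
    while queue:
        task = queue.popleft()
        audit_log.append(task)  # every storage action is audited
        key = (task.space_id, task.content_id)
        if task.action == "add":
            manifest[key] = task.checksum
        elif task.action == "delete":
            manifest.pop(key, None)


# Usage: three storage actions on one space.
queue = deque([
    AuditTask("add", "space1", "a.txt", "c1"),
    AuditTask("add", "space1", "b.txt", "c2"),
    AuditTask("delete", "space1", "a.txt"),
])
manifest, audit_log = {}, []
process_tasks(queue, manifest, audit_log)
```

After the queue is drained, the manifest holds only the content that still exists, while the audit log retains all three actions.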

A series of workers perform the jobs outlined in the work tasks. These workers can scale automatically based on the number of items in each queue, allowing the number of workers to increase and decrease as the amount of work to be done changes. The workers communicate with the storage providers, as well as with a database that is used to maintain system state.
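
As a rough sketch of queue-based scaling, the worker count can be derived from the queue depth and clamped to a fixed range. The function name, the tasks-per-worker ratio, and the minimum and maximum bounds are hypothetical; the source does not state the actual scaling policy.

```python
import math


def desired_workers(queue_depth, tasks_per_worker=100,
                    min_workers=1, max_workers=10):
    """Return a worker count proportional to queue depth, within bounds.

    All parameter defaults are illustrative assumptions.
    """
    needed = math.ceil(queue_depth / tasks_per_worker)
    return max(min_workers, min(max_workers, needed))


# Usage: an empty queue keeps the minimum; a deep queue hits the cap.
idle = desired_workers(0)
moderate = desired_workers(250)
saturated = desired_workers(5000)
```

Clamping to a minimum keeps at least one worker available when the queue is empty, and the maximum bounds cost when a large backlog arrives.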

...

Another type of producer, the Bit Integrity Producer, is used to perform bit-level integrity checks on content. Much like the Duplication Producer, the Bit Integrity Producer adds a task to the task queue for each content item in the DuraCloud system. The workers then execute those tasks by retrieving the content and verifying that the checksum computed for the content matches both the checksum provided by the storage provider and the checksum maintained in the DuraCloud manifest.
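
The comparison a worker performs for a single content item might look like the following sketch. The function names are hypothetical, and the use of MD5 is an assumption; the source does not name the checksum algorithm.

```python
import hashlib
import io


def compute_checksum(stream, algorithm="md5", chunk_size=8192):
    """Hash a stream in chunks so large content need not fit in memory."""
    digest = hashlib.new(algorithm)  # algorithm choice is an assumption
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def check_bit_integrity(stream, provider_checksum, manifest_checksum):
    """True only if the computed, provider, and manifest checksums agree."""
    computed = compute_checksum(stream)
    return computed == provider_checksum == manifest_checksum


# Usage: intact content passes; a stale manifest entry fails.
content = b"example content"
expected = hashlib.md5(content).hexdigest()
ok = check_bit_integrity(io.BytesIO(content), expected, expected)
bad = check_bit_integrity(io.BytesIO(content), expected, "0" * 32)
```

Requiring all three values to agree distinguishes corruption of the stored bytes from a manifest that has drifted out of date.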

The DuraCloud manifest and audit cache are maintained in the Mill database. This database is kept up-to-date by the workers as they process audit tasks. The DuraCloud instance consults the Mill database when manifest or audit information is required. The audit cache maintained in the database is transitioned into long-term file-level audit storage after a short period of time.

The capabilities of the DuraCloud Mill are likely to expand over time, as it is designed to be a generic and scalable task processing engine.
