Following is a list of DPN requirements and assumptions
In the development of the system, legal and technical constraints have been imposed that change some assumptions, as originally conceived.
Those changes do not affect the fundamental assumptions of the DPN concept.
DPN is dark, geographically and technologically diverse replication, with individual and centralized audit and succession.
"The intent of DPN remains the same. Build a system that ensures that deposited material remains available to future generations by architecting around single points of failure broadly conceived (i.e., technical failure, political failure, organizational failure, geographic failure)."
DPN is a federated preservation network of independent preservation archives
.All archives act to preserve content, not just as remote storage.
DPN
will havehas agreements to allow for Succession of content in the event an archive can no longer perform its function as an archive.
This is currently envisioned as a “Quit Claim” framework
Based on input from the legal team, comprised from the Node institutions, we will update recommendations
The successor of content will take on
the responsibilities of and become the
Administrative Node for the failed archive
- Brighten content so as to meet the needs of the community of the content.
- Act as the arbiter of content for clients of the former archi
- In the event of a Succession occurrence, all replicating nodes will recognize the new successor and act in accordance with prior agreements held by the former archive.
DPN,
DPN willas much as possible
have, has independent preservation implementations at each node.
DPN communications
willfollow best practices and use mutual authentication over secure channels.
DPN
will provideprovides for replication of content from a
FirstIngest Node
(originator of content)to Replicating Nodes
.The replicationDPN provides recovery of lost content
- DPN First Nodes will be the authoritative node for content.
- The First Node will be the arbiter of digital objects until such time as succession takes place.
- DPN will provide recovery of lost content at First Node, or Replicating Nodes (healing).
at any Node
DPN core service keeps 3 copies of ingested objects
DPN provides a service
DPN will provide a servicesmodel so that any of the
FirstNodes
, or Replicating nodescan ascertain if their content is valid.
DPN
will enable organizational audit.enables centralized auditing
Process audit at the DPN federation
Process audit at the Node
Audit trails for events
DPN
supports Content Fixity audit
- (random, periodic, full) DPN will provide
Global (across federation)
At the Node level
Periodic Reporting
Significant event reporting, for example succession events, loss of content, etc.
DPN
will supportsupports
DPN will support a DPN UUID
(globally unique identifier for each bag deposited)
a common lightweight wrapper (bag) for content transfer
retention of ingested content indefinitely
Content may be de-accessioned upon extenuating circumstances (e.g. court order)
duplication of critical metadata
in the 'registry'
and also
in the content bags
durable and persistent communication methods to support unreliable networks and node failure
a distributed model, assuming eventual consistency (CAP Theorem) for replication of content and registries
DPN provides status reporting, activity, logging, traffic, volume, events, etc.
DPN content and services
will beare distributed and federated.
DPN
nodes will all be first class nodes with respect to each other.Successionsupports succession and brightening of content
will be difficult, expensive and time consuming. We must test brightening and not assume that the vast institutional knowledge of each repository is easily captured or represented in DPN..
Implementations are de-coupled implementations and architecturally distinct, as practicable, but the communication methodology is shared, resilient, and redundant.
DPN objects are preserved by all Nodes and support DPN preservation functions.
DPN has decoupled inter-node communication channels for
content transfer
process control.
Communication between nodes is not dependent upon other nodes.
Production Deliverables - Jan 2016
There is no expectation that we’ll have signed SLA’s before launch
DPN will have agreements to allow for Succession of content in the event an archive can no longer perform its function as an archive.
SLA’s with Ingest Nodes/Administrative Node
SLA documents between Ingest Nodes & Depositor shared and with legal staff
We will require in the SLA that one of the OTHER Administrative Nodes will become the Administrative node for the failed Node.
Act as the restorer of content for clients of the former administrative node
In the event of a Succession occurrence, all replicating nodes will recognize the new successor and act in accordance with prior agreements held by the former archive
- We will use a registry.
- We will use lightweight packaging, such as bagit bags, but only for wrapping content, specific to DPN. The First Node initiates replication and controls changes in registry/database holding information regarding content it is the First Node for.
Replication and update will take time - We are aware of this,
- Because this is a distributed model, we assume eventual consistency for replication of content and registries.
- Implementations will be de-coupled, but communication will not.
- We can create DPN first class objects that are useful for DPN purposes. These objects will be preserved by all Nodes and useful for DPN preservation functions.
- DPN will have separate inter-node communication channels for
- content transfer
- process control
- Communication can be point-to-point and point-to-multi-point.
- Communication will be durable and persistant to support unreliable networks and node failure.
- Support for both synchronous and asynchronous communication as required.
- The goal here is to reduce failures of same implementation.
might need to put something in the SLA recognizing this fact
SLA will state that: The communication layer is shared, but the repository layer is not
Depositor will be able to give us stuff and we can put it into storage
There will be no expectation of global (DPN Level) fixity checking at launch
Initial fixity checking will occur at ingest
Each Node will check fixity according to their local policy
Some cursory reporting available to Depositor
Notification of ingest and replication will be provided by the Ingest Node to the depositor
Replication of content from the Administrative Node to all Replicating Nodes
Make sure we know how many Replicating Nodes will be storing the content
Make sure we know when the Ingest Node will store the ingested content and when they won’t
Make sure we have a clear idea of what kind of storage is available at each
The Ability to recover content to the depositor will be supported via the Ingest/Administrative node
Maintain a registry for objects, create transfer records for replicating nodes, update status of an existing object
Track stored status of replicated transfer
Agreement on the bag size limits that can be replicated across the Nodes
250 Gig bags are the uppermost limit for this release
Bags will be validated upon receipt by the Ingesting Node and the Replicating Node
Validation means:
We will validate all of the files are present and the checksums match the manifest
The structure of the bag will be validated according to the DPN specs
In-Person Post-Mortem/Planning for Phase II - July 16th & 17th
6-Month Post-Launch Roadmap Deliverables (Jun 2016)
Clear idea of what kind of storage capacity is available at each Replicating Node with a framework for deciding to which nodes objects are deposited (Internal Documentation)
Auditing consistency of registry
Auditing the local storage inventory of registry
Fixity tracking - when replicating nodes are doing auditing - including node that performed the fixity check (as part of the provenance/history of the bag)
Support multiple fixity types across the federation
Ongoing fixity checks by each node with reporting out to DPN administration
12-Month Post-Launch Roadmap Deliverables (2017)
Depositor Dashboard
DPN will provide status reporting, activity, logging, traffic, volume, events, etc.
Global (across federation)
At Node level
Periodic Reporting
Significant event reporting, for example succession events, loss of content, etc.
Billing information
Bag Discovery & Retrieval Request Mechanism
a depositor wants all of their bags will satisfy some criteria
- Determination and tracking of depositor assets
...