Stakeholders
- Esmé Cowles
- Benjamin Armintor
- Michael Durbin
- Joshua Westgard
- Youn Noh
- Nick Ruest
- Michael J. Giarlo
- Jon Stroop
- Karen Estlund
- Jim Tuttle
Sprinters
Developers
...
Sprints
Sprint
...
1
...
...
Testing and Validation
- Michael Durbin [Phase 1]
- Joshua Westgard [Phase 1] [Phase 2]
- Justin Simpson [Phase 1]
- Youn Noh [Phase 1]
- Yinlin Chen [Phase 1]
- Nick Ruest [Phase 1] [Phase 2]
- Bethany Seeger [Phase 1]
Documentation
- Youn Noh [Phase 1] [Phase 2]
- Joshua Westgard [Phase 1] [Phase 2]
- Nick Ruest [Phase 1] [Phase 2]
Sprint 3
Sprint 4
Use cases
Transfer between Fedora and external preservation systems, such as APTrust, MetaArchive, LOCKSS, DPN, Archivematica, etc.
Package [Export] the content of a single Fedora container and all its descendant resources
Transfer between Fedora instances or (more generally) from Fedora to an LDP archive
Load [Import] the contents of a package into a specified container.
Round-tripping resources in Fedora in support of backup/restore
A start has been made on this in FCREPO-1990.
The implementation referenced in that ticket is not dead, though it is not actively being worked on at the moment; pull requests are welcome (though others may well wish to take it in a different direction).
A rebuilder that:
Is not solely dependent on an intact backup of the repository index
Works off shredded serializations that can be supported with file preservation techniques
Can recover as much as possible of a repository in the face of integrity issues (supports partial recovery)
Supports gathering copies of the shreds (serializations) from multiple sources to recover a repository
Round-tripping resources in Fedora in support of Fedora repository version upgrades
Batch loading arbitrary sets of resources from a metadata spreadsheet and binaries (it may well be difficult, or not worth it, to try to generalize such a feature).
Import or export containers or binaries using add, overwrite, or delete operations. Configure the data model and the source and the target for each resource that will be updated. Allow target containers to be non-empty before import and source containers to be non-empty after export. Maintain ordering, etc. Support versioning.
Examples: add issues to a publication; add fragments to a manuscript; add data sets to a longitudinal study; add time-series images from telescopes; remove resources determined to be under copyright; release resources after restrictions on access have expired.
Perform multiple metadata-only exports, and then restore an earlier version from an export.
Use cases yet to be rolled into requirements
Import objects from an external system (such as Figshare, where a research data object might be prepared) into a Fedora preservation repository with either Hydra or Islandora on top. (Implies compliance with Hydra and/or Islandora object models)
To migrate from internal content to external content, export metadata only and then import it into another repository. The links to the new external content locations would be added afterwards.
Requirements
External Systems
Support import from and export to a TBD list of external systems. [Phase 2]
APTrust - University of Maryland (Joshua Westgard)
Archivematica - Artefactual Systems (Justin Simpson)
MetaArchive - Penn State (Ben Goldman)
Perseids - Tufts (Bridget Almas)
General
Support transacting in RDF [Phase 1]
Support the option to include binaries [Phase 1]
Support references from exported resources to other exported resources [Phase 1]
Support transacting in BagIt bags [Phase 2]
Support import into a non-existing Fedora container [Phase 1]
Support import into an existing, empty Fedora container [Phase 2]
Support import into an existing, non-empty Fedora container with various policies: add, overwrite, delete, version, skip [Phase 3]
Support export of resource versions [Phase 3]
Support import of resource versions [Phase 3]
Support export of a resource and its "members" based on the ldp:contains predicate [Phase 1]
Support export of a resource and its "members" based on a user-provided membership predicate [Phase 2]
Support recursive RDF insert/updates with LDP Indirect Container specified POST (and PUT / PATCH?) (ref: FCREPO-2042)
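The membership-based export requirements above can be illustrated with a minimal sketch. It assumes the repository's triples are already available in memory as (subject, predicate, object) tuples; the function name `collect_members`, the sample URIs, and the use of pcdm:hasMember as the user-provided predicate are hypothetical choices for illustration, not part of any Fedora API.

```python
from collections import deque

LDP_CONTAINS = "http://www.w3.org/ns/ldp#contains"
PCDM_HAS_MEMBER = "http://pcdm.org/models#hasMember"  # example of a user-provided predicate


def collect_members(triples, root, predicate=LDP_CONTAINS):
    """Return root plus every resource reachable from it via `predicate` (breadth-first)."""
    # Index only the triples that use the chosen membership predicate.
    children = {}
    for s, p, o in triples:
        if p == predicate:
            children.setdefault(s, []).append(o)

    ordered = [root]
    visited = {root}
    queue = deque([root])
    while queue:
        current = queue.popleft()
        for child in children.get(current, []):
            if child not in visited:  # guards against cycles in membership
                visited.add(child)
                ordered.append(child)
                queue.append(child)
    return ordered


# Hypothetical sample data: a publication with two issues, one of which has a page.
TRIPLES = [
    ("/rest/pub", LDP_CONTAINS, "/rest/pub/issue1"),
    ("/rest/pub", LDP_CONTAINS, "/rest/pub/issue2"),
    ("/rest/pub/issue1", LDP_CONTAINS, "/rest/pub/issue1/page1"),
    ("/rest/pub", PCDM_HAS_MEMBER, "/rest/other"),
]

export_set = collect_members(TRIPLES, "/rest/pub")
```

Passing a different predicate, for example `collect_members(TRIPLES, "/rest/pub", PCDM_HAS_MEMBER)`, selects the export set by that membership relation instead of by containment.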
Round-tripping
Defined as: exporting all or a subset of a Fedora repository and importing the export artifacts into a Fedora repository.
Support preservation of dates during round-tripping [Phase 3]
Support preservation of version snapshots during round-tripping [Phase 3]
The URIs of the round-tripped resources must be the same as the original URIs [Phase 1]
Support lossless round-tripping (i.e., if you export a resource, delete that resource, and then import it, the result is no different than if you had never performed any of those operations). [Phase 3]
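The lossless round-tripping requirement can be stated as a check: export, delete, import, then compare against the original state. The sketch below assumes a toy in-memory repository modelled as a dict from URI to resource content; the function names and the sample data are hypothetical, and a real check would go through the REST interface.

```python
import copy


def _in_scope(uri, root):
    """True if uri is the root itself or one of its descendants by path."""
    return uri == root or uri.startswith(root + "/")


def export_resources(repo, root):
    """Copy root and its descendants out of the repository, keeping their URIs."""
    return {uri: copy.deepcopy(res) for uri, res in repo.items() if _in_scope(uri, root)}


def delete_resources(repo, root):
    for uri in [u for u in repo if _in_scope(u, root)]:
        del repo[uri]


def import_resources(repo, package):
    for uri, res in package.items():
        repo[uri] = copy.deepcopy(res)


def round_trip_is_lossless(repo, root):
    """Export, delete, and re-import `root`; report whether the repository is unchanged."""
    before = copy.deepcopy(repo)
    package = export_resources(repo, root)
    delete_resources(repo, root)
    import_resources(repo, package)
    return repo == before


# Hypothetical sample repository: containers hold triples, binaries hold bytes.
REPO = {
    "/rest/a": {("/rest/a", "dc:title", "Container A")},
    "/rest/a/file": b"binary content",
    "/rest/b": {("/rest/b", "dc:title", "Container B")},
}
```

Because the comparison is on both keys and values, the check also covers the requirement that round-tripped resources keep their original URIs.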
BagIt
Single resource bags [Phase 2]
The structure and scope of accepted and produced BagIt bags must be configurable (resource) [Phase 2]
Clarification: structure relates to required and optional tagfiles in the bag
Clarification: scope relates to contents of the bag, e.g. single object or object and all members based on specific membership predicate
Multi-resource bags [Phase 3]
Unambiguously support linking between resources within a bag, and from resources in the bag to resources outside the bag [Phase 3]
Example: for bagged resources A and B, if A contains the statement <A> myns:rel <B>, then it is unambiguous that B is a resource in the bag. Suppose some archive ingests the bag and exposes its contents as web resources with URIs P and Q. If the archive preserves intra-bag links, resource P will have the statement <P> myns:rel <Q>. Likewise, if A contains the external link <A> myns:rel2 <http://example.org/outside/the/bag>, then an archive that preserves links will have <P> myns:rel2 <http://example.org/outside/the/bag>.
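The A/B to P/Q example above can be sketched as a simple rewrite over triples. This assumes the archive knows the mapping from each bagged resource to the URI it exposes; the function name `rewrite_links` and the tuple representation are hypothetical and only illustrate the expected behaviour.

```python
def rewrite_links(triples, uri_map):
    """Map bag-internal URIs to archive URIs; leave URIs outside the bag untouched."""
    # Only subjects and objects are remapped; predicates are passed through as-is.
    return [(uri_map.get(s, s), p, uri_map.get(o, o)) for s, p, o in triples]


# Statements carried by bagged resource A, as in the example above.
BAG_TRIPLES = [
    ("A", "myns:rel", "B"),
    ("A", "myns:rel2", "http://example.org/outside/the/bag"),
]

# The archive exposes A as P and B as Q.
URI_MAP = {"A": "P", "B": "Q"}

archived = rewrite_links(BAG_TRIPLES, URI_MAP)
```

The intra-bag link is carried over as a link between P and Q, while the external link keeps its original target because that URI is not in the mapping.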
Verification Tool
Verify same number of resources on disk as in fcrepo [Phase 2]
Verify same number of resources in fcrepo as on disk [Phase 2]
Verify same checksum for binaries [Phase 2]
Verify same triples for containers [Phase 2]
Record which resources have been verified (include checksum for binary resources) [Phase 2]
Verify subset of repository resources [Phase 2]
Verify fcrepo to fcrepo [Phase 3]
Verify disk to disk [Phase 3]
Use generated config file as sole input [Phase 3]
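A minimal sketch of the comparison logic behind these verification requirements follows. It assumes both sides have already been loaded into dicts keyed by URI, with binaries as bytes and containers as sets of triples; the `verify` function, the report layout, and the choice of SHA-1 are assumptions for illustration, not the behaviour of an existing tool.

```python
import hashlib


def sha1_hex(data: bytes) -> str:
    return hashlib.sha1(data).hexdigest()


def verify(disk, repo):
    """Compare two {uri: resource} mappings and report differences in both directions."""
    report = {
        "missing_in_repo": sorted(set(disk) - set(repo)),  # on disk but not in fcrepo
        "missing_on_disk": sorted(set(repo) - set(disk)),  # in fcrepo but not on disk
        "mismatched": [],
        "verified": {},  # uri -> checksum for binaries, None for containers
    }
    for uri in sorted(set(disk) & set(repo)):
        d, r = disk[uri], repo[uri]
        if isinstance(d, bytes) and isinstance(r, bytes):
            # Binaries: compare checksums and record the checksum on success.
            checksum = sha1_hex(d)
            matches = checksum == sha1_hex(r)
        else:
            # Containers: compare triples as sets, so ordering does not matter.
            checksum = None
            matches = d == r
        if matches:
            report["verified"][uri] = checksum
        else:
            report["mismatched"].append(uri)
    return report


# Hypothetical sample data for both sides of the comparison.
DISK = {"/c": {("s", "p", "o")}, "/c/bin": b"hello", "/c/extra": b"x"}
FCREPO = {"/c": {("s", "p", "o")}, "/c/bin": b"hello", "/c/other": b"y"}

report = verify(DISK, FCREPO)
```

Since the function only sees two mappings, the same comparison would apply to fcrepo-to-fcrepo or disk-to-disk verification, and passing a filtered mapping verifies a subset of resources.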
Considerations
Import/export should be as performant as is possible, under the assumption that this work is done via the REST interface
Resources
https://www.ietf.org/archive/id/draft-wilper-semantic-content-pkgs-00.txt
http://dataconservancy.github.io/dc-packaging-spec/dc-packaging-spec-1.0.html (explanation below)
https://github.com/acdha/restful-bag-server (a resource-oriented RESTful HTTP API for exchanging bags)
Meetings
...