Comment: Updated use cases/requirements for query service

...

Use cases and requirements

Query service

Use cases

If a query service is integrated into Fedora, docuteam would like to use it for the following queries:
- total number of objects in a namespace (this was the PID namespace in Fedora 3; an equivalent for Fedora 6 has yet to be defined; possible solutions: a) use top-level objects to group objects by namespace, b) use an RDF triple to assign an object to a PID namespace, c) use the Name Assigning Authority Number (NAAN) of ARKs to group objects)
- get all available file formats in a Fedora instance (we use PRONOM identifiers (PUIDs) to store this information) and the number of objects with a specific file format
- get the total space used
- get the available free space for the persistent storage (optional, as this could be solved by monitoring tools)
- get the total space used per namespace (see above on what a namespace could be)
- get all objects with a specific triple (e.g. PRONOM identifier, some alternative identifier)
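
To make the expected results concrete, the following sketch evaluates these use cases over a handful of in-memory triples. It assumes option b) above (one triple assigning each object to a namespace). The predicate names `ex:namespace`, `ex:puid` and `ex:size`, the object names and all sample values are hypothetical and not taken from any Fedora vocabulary.

```python
from collections import Counter, defaultdict

# Hypothetical predicates; a real deployment would use terms from an agreed ontology.
NAMESPACE = "ex:namespace"
PUID = "ex:puid"
SIZE = "ex:size"

# Sample (subject, predicate, object) triples, invented for illustration only.
TRIPLES = [
    ("obj1", NAMESPACE, "clientA"), ("obj1", PUID, "fmt/353"), ("obj1", SIZE, 1000),
    ("obj2", NAMESPACE, "clientA"), ("obj2", PUID, "fmt/95"), ("obj2", SIZE, 2500),
    ("obj3", NAMESPACE, "clientB"), ("obj3", PUID, "fmt/353"), ("obj3", SIZE, 400),
]


def count_by(triples, predicate):
    """Count distinct subjects per value of one predicate (e.g. per namespace or PUID)."""
    pairs = {(s, o) for s, p, o in triples if p == predicate}
    return Counter(o for _, o in pairs)


def subjects_with(triples, predicate, value):
    """Return all subjects that carry the given predicate/value pair."""
    return sorted({s for s, p, o in triples if p == predicate and o == value})


def total_size(triples):
    """Sum the size values of all objects."""
    return sum(o for _, p, o in triples if p == SIZE)


def size_by_namespace(triples):
    """Sum the size values grouped by the namespace assigned to each object."""
    namespace_of = {s: o for s, p, o in triples if p == NAMESPACE}
    totals = defaultdict(int)
    for s, p, o in triples:
        if p == SIZE:
            totals[namespace_of.get(s, "(unassigned)")] += o
    return dict(totals)


print(count_by(TRIPLES, NAMESPACE))              # objects per namespace
print(count_by(TRIPLES, PUID))                   # objects per file format
print(total_size(TRIPLES))                       # total space used
print(size_by_namespace(TRIPLES))                # space used per namespace
print(subjects_with(TRIPLES, PUID, "fmt/353"))   # objects with a specific triple
```

A query service would compute the same answers from its index rather than from a Python list; the sketch only pins down what each query is expected to return.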

Requirements

...

use existing standards for the query language (SPARQL comes to mind)
should support aggregation functions (e.g. sum(), count())
should reuse existing indexes such as Solr or a triple store, if present?
→ do not build another index, if these tools are present anyway
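
As an illustration of the aggregation requirement, the sketch below builds two SPARQL 1.1 queries that use `COUNT` and `SUM` with `GROUP BY`. The helper names and the `http://example.org/...` predicate IRIs are assumptions made for this example; which predicates actually hold the format, namespace and size information is still to be defined.

```python
def _iri(iri: str) -> str:
    """Wrap an IRI in angle brackets, rejecting characters not allowed in a SPARQL IRI reference."""
    if any(c in iri for c in '<>"{}|^`\\') or any(c.isspace() for c in iri):
        raise ValueError(f"not a valid IRI: {iri!r}")
    return f"<{iri}>"


def count_by_predicate_query(predicate_iri: str) -> str:
    """Query counting distinct objects per value of one predicate."""
    return (
        "SELECT ?value (COUNT(DISTINCT ?object) AS ?objects)\n"
        "WHERE {\n"
        f"  ?object {_iri(predicate_iri)} ?value .\n"
        "}\n"
        "GROUP BY ?value\n"
        "ORDER BY DESC(?objects)"
    )


def sum_by_group_query(group_iri: str, size_iri: str) -> str:
    """Query summing a numeric size predicate, grouped by another predicate."""
    return (
        "SELECT ?group (SUM(?size) AS ?bytes)\n"
        "WHERE {\n"
        f"  ?object {_iri(group_iri)} ?group ;\n"
        f"          {_iri(size_iri)} ?size .\n"
        "}\n"
        "GROUP BY ?group"
    )


# Hypothetical predicate IRIs, for illustration only.
print(count_by_predicate_query("http://example.org/puid"))
print(sum_by_group_query("http://example.org/namespace", "http://example.org/size"))
```

The generated text could be sent to any SPARQL 1.1 endpoint, for example a triple store that is already deployed alongside the repository.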

Questions/Comments

if integrated into Fedora: make it optional
possibility to query the persistence layer for used storage and free storage
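
For a persistence layer on a local or mounted filesystem, used and free storage can be read with the Python standard library, as in the minimal sketch below. The function name and the shape of the returned dictionary are assumptions for this example, and the approach does not carry over to S3-compatible object storage, which has no equivalent notion of free space on a mount point.

```python
import shutil
from pathlib import Path


def storage_report(root):
    """Report filesystem capacity and the space taken by files under root, in bytes."""
    # disk_usage describes the whole filesystem that contains root.
    usage = shutil.disk_usage(root)
    # Sum of the sizes of all regular files below root (the repository's own footprint).
    repository_used = sum(p.stat().st_size for p in Path(root).rglob("*") if p.is_file())
    return {
        "filesystem_total": usage.total,
        "filesystem_free": usage.free,
        "repository_used": repository_used,
    }


if __name__ == "__main__":
    # "." stands in for the repository's storage root.
    print(storage_report("."))
```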

Questions

will the query service be part of the Fedora API spec?
does it make sense to hardcode the supported queries?
use a real SPARQL endpoint in the background and only deliver a UI/API?

Persistence

Use cases

For our cloud infrastructure, docuteam would like to use S3-compatible object storage (provided by Ceph, https://ceph.io/)
Some docuteam clients might want to stick with a simpler storage model than OCFL to reduce storage requirements

Requirements

OCFL backend: native support for S3-compatible object storage
alternative “simple” persistence implementation (optional?)

Migration

Use cases

With the switch to Fedora 6, docuteam wants to move to an ontology-based data model.
Docuteam would like to leverage the functionality of a generic migration utility.
Docuteam would like to migrate Fedora 3 XML datastreams into triples.

Requirements

possibility to select/configure which Fedora 3 datastreams should be converted to Fedora 6 binaries
possibility to extend migration utilities with custom parsers/functionality to create triples
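
One possible shape for these two requirements is sketched below: a configurable set of datastream IDs that are kept as binaries, and a registry of custom parsers that turn XML datastreams into triples. This is not the API of any existing migration utility; `BINARY_DATASTREAMS`, `PARSERS`, `parser` and `migrate_datastream` are hypothetical names, and the sample Dublin Core record is invented.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"

# Hypothetical configuration: datastream IDs to carry over unchanged as binaries.
BINARY_DATASTREAMS = {"ORIGINAL", "PREMIS"}

# Hypothetical registry mapping a datastream ID to a custom parser function.
PARSERS = {}


def parser(ds_id):
    """Decorator that registers a parser for one datastream ID."""
    def register(func):
        PARSERS[ds_id] = func
        return func
    return register


@parser("DC")
def parse_dublin_core(subject, xml_bytes):
    """Turn each non-empty Dublin Core element into a (subject, predicate, literal) triple."""
    triples = []
    for element in ET.fromstring(xml_bytes).iter():
        text = (element.text or "").strip()
        if element.tag.startswith("{" + DC + "}") and text:
            # ElementTree tags look like "{namespace}local"; join them into one IRI.
            predicate = element.tag[1:].replace("}", "")
            triples.append((subject, predicate, text))
    return triples


def migrate_datastream(subject, ds_id, content):
    """Decide per datastream: parse into triples, keep as binary, or skip."""
    if ds_id in PARSERS:
        return "triples", PARSERS[ds_id](subject, content)
    if ds_id in BINARY_DATASTREAMS:
        return "binary", content
    return "skipped", None


SAMPLE_DC = b"""<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Annual report</dc:title>
  <dc:identifier>example:1</dc:identifier>
</oai_dc:dc>"""

print(migrate_datastream("info:fedora/example:1", "DC", SAMPLE_DC))
print(migrate_datastream("info:fedora/example:1", "ORIGINAL", b"...")[0])
```

Registering parsers by datastream ID keeps client-specific mapping logic outside the generic utility, which only needs to know the registry and the list of binaries.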

Documentation

Requirements

Step-by-step installation manual for production use
Recommendations for storage systems (e.g. WORM, Object Storage, generic NAS)

...