Versions Compared

Old Version 10

Changes made by Thomas Bernhart

Saved on Sep 10, 2019

compared with

New Version Current

Changes made by Thomas Bernhart

Saved on Sep 20, 2019

Comment: Updated requirements for Query Service

...

Use cases and requirements

General requirements

We must be able to run Fedora 6 on Windows servers.

Query service

Use cases

If a query service is integrated into Fedora, docuteam would like to use it for the following queries:
- total number of objects in a namespace (this was the PID namespace in Fedora 3; the Fedora 6 equivalent still has to be defined; possible solutions: a) use top-level objects to group objects by namespace, b) use an RDF triple to assign an object to a PID namespace, c) use the Name Assigning Authority Number (NAAN) of ARKs to group objects)
- get all file formats present in a Fedora instance (we use PRONOM identifiers, PUIDs, to store this information) and the number of objects with a specific file format
- get the total space used
- get the available free space of the persistent storage (optional, as this could be solved by monitoring tools)
- get the total space used by namespace (see above on what a namespace could be)
- get all objects with a specific triple (e.g. PRONOM identifier, some alternative identifier)
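
The queries listed above can be illustrated with a minimal Python sketch over an in-memory list of triples. The predicate names (`ex:namespace`, `ex:puid`, `ex:sizeInBytes`) and the sample objects are hypothetical, since no vocabulary has been defined yet, and the sketch assumes each object carries exactly one triple per predicate. It says nothing about how Fedora would implement the service.

```python
from collections import Counter, defaultdict

# Hypothetical predicate names; no vocabulary has been agreed on.
NAMESPACE = "ex:namespace"
PUID = "ex:puid"
SIZE = "ex:sizeInBytes"

# (subject, predicate, object) triples for three made-up objects.
TRIPLES = [
    ("obj:1", NAMESPACE, "archiveA"),
    ("obj:1", PUID, "fmt/95"),
    ("obj:1", SIZE, 1000),
    ("obj:2", NAMESPACE, "archiveA"),
    ("obj:2", PUID, "fmt/353"),
    ("obj:2", SIZE, 2500),
    ("obj:3", NAMESPACE, "archiveB"),
    ("obj:3", PUID, "fmt/95"),
    ("obj:3", SIZE, 400),
]


def objects_with(predicate, value, triples=TRIPLES):
    """All objects that carry a specific triple, sorted by identifier."""
    return sorted({s for s, p, o in triples if p == predicate and o == value})


def count_by(predicate, triples=TRIPLES):
    """Number of objects per value of a predicate (one triple per object assumed)."""
    return Counter(o for s, p, o in triples if p == predicate)


def total_size(triples=TRIPLES):
    """Total space used, summed over all size triples."""
    return sum(o for s, p, o in triples if p == SIZE)


def size_by_namespace(triples=TRIPLES):
    """Total space used per namespace."""
    namespace_of = {s: o for s, p, o in triples if p == NAMESPACE}
    totals = defaultdict(int)
    for s, p, o in triples:
        if p == SIZE:
            totals[namespace_of[s]] += o
    return dict(totals)
```

Here `count_by(NAMESPACE)` covers the objects-per-namespace query, `count_by(PUID)` the file format overview, and `objects_with(PUID, "fmt/95")` the lookup by a specific triple.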

...

use existing standards for the query language (SPARQL and LDPath (https://marmotta.apache.org/ldpath/) come to mind)
should support aggregation functions (e.g. sum(), count())
should reuse existing indexes such as Solr or a triple store, if present?
→ do not build another index, if these tools are present anyway
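
As a rough illustration of what a standards-based query with aggregation could look like, the following Python sketch builds a SPARQL 1.1 query that counts objects per value of a predicate. The function name and the example predicate IRI are hypothetical, and whether Fedora would accept SPARQL at all is one of the open points on this page.

```python
def count_by_predicate_query(predicate_iri: str) -> str:
    """Build a SPARQL 1.1 query that counts distinct objects per value."""
    # Reject characters that are not allowed inside an IRI reference,
    # so the IRI cannot break out of the angle brackets.
    if any(c in predicate_iri for c in '<>" {}|\\^`'):
        raise ValueError("predicate_iri contains characters not allowed in an IRI")
    return (
        "SELECT ?value (COUNT(DISTINCT ?object) AS ?total)\n"
        "WHERE {\n"
        f"  ?object <{predicate_iri}> ?value .\n"
        "}\n"
        "GROUP BY ?value\n"
        "ORDER BY DESC(?total)"
    )


# Example: number of objects per PUID, using a made-up predicate IRI.
print(count_by_predicate_query("http://example.org/ns#puid"))
```

The resulting query string could be sent to an existing triple store, which fits the requirement not to build another index when such a tool is already present.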

...

make optional?
will the query service be part of the Fedora API spec?
does it make sense to hardcode the supported queries?
only provide a search UI but actually use a triple store?

Answers/Remarks by Andrew Woods

Regarding "Query service", we have gotten consistent feedback that an integrated, synchronous search index should come with Fedora. The use case is for clients to create/update a Fedora resource, then immediately be able to query Fedora with the expectation that the resource be in the index. The externalized, asynchronous indices do not satisfy this use case.
We will certainly use the queries you have enumerated in the testing of the new query service.
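
The read-after-write expectation described in these remarks can be sketched with a toy in-memory repository whose index is updated in the same call that stores the resource. The class and method names are hypothetical and do not reflect the Fedora API or its planned implementation.

```python
from collections import defaultdict


class SyncIndexedRepository:
    """Toy repository that updates its index synchronously on every write."""

    def __init__(self):
        self._resources = {}
        # Maps (predicate, value) pairs to the ids of matching resources.
        self._index = defaultdict(set)

    def put(self, resource_id, properties):
        """Create or update a resource and index it before returning."""
        # Drop index entries of the previous version, if any.
        for key in self._resources.get(resource_id, {}).items():
            self._index[key].discard(resource_id)
        self._resources[resource_id] = dict(properties)
        for key in properties.items():
            self._index[key].add(resource_id)

    def find(self, predicate, value):
        """Return the ids of all resources with the given property value."""
        return sorted(self._index.get((predicate, value), set()))


repo = SyncIndexedRepository()
repo.put("obj:1", {"puid": "fmt/95"})
# The resource is queryable immediately after the write returns.
print(repo.find("puid", "fmt/95"))
```

With an externalized, asynchronous index, the `find` call directly after `put` could still return an empty result until the indexer catches up, which is the gap the integrated index is meant to close.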

Persistence

Use cases

For our cloud infrastructure, docuteam would like to use S3-compatible object storage (provided by Ceph, https://ceph.io/)
Some docuteam clients might want to stick with a simpler storage model than OCFL to reduce storage requirements

...