Date: Thursday March 23, 2023 @ 10am Eastern
Where: https://lyrasis.zoom.us/j/85992335527
Attendance:
Ugnius Lukosius
Antanas Streimikis
Basil Marti
Oliver Schoner
Gytis Vievesis
Dan Field
Arran Griffith
Joshua Westgard
Resources:
https://wiki.lyrasis.org/display/FF/OAI-PMH+Use+Cases
Prior Work
- Fedora Ontology exists https://fedora.info/definitions/v4/2015/03/26/oai-pmh
- Python/Django Fedora 6 implementation https://github.com/jnphilipp/django_oai_pmh
- Islandora implementation https://github.com/Islandora/islandora_oai
- PROAI https://proai.sourceforge.net/ and https://github.com/fcrepo3/proai
- Similar E-Theses project which used OAI-PMH to harvest NLW theses from Fedora 3 w/ ProAI: https://ethos.bl.uk/
- Wiki on proai implementation https://biowikifarm.net/meta/OAI-PMH_service_for_K2N_Fedora_Commons_based_repository
Mapping Proposal
Mapping to Fedora proposal:
Repository = Single FCREPO instance
Resource = Fedora Object?
Unique ID = PID. MUST be URI
Deletions - tombstones ?
Sets = isMemberOfCollection ?
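The proposed mapping can be pictured as a small data structure that serializes to an OAI-PMH record header. The following is only a rough sketch of the idea discussed, not an agreed design: the class name `HarvestableResource`, its field names, and the example URI are assumptions made for illustration. Only the `header`, `identifier`, `datestamp`, and `setSpec` elements and the `status="deleted"` attribute come from the OAI-PMH protocol, and XML namespaces are omitted for brevity.

```python
from dataclasses import dataclass, field
from typing import List
import xml.etree.ElementTree as ET


@dataclass
class HarvestableResource:
    """Hypothetical view of a Fedora resource as an OAI-PMH item."""
    identifier: str   # the PID, which must be a URI
    datestamp: str    # last-modified time in UTC, e.g. 2023-03-23T14:00:00Z
    set_specs: List[str] = field(default_factory=list)  # e.g. derived from collection membership
    deleted: bool = False  # True when only a tombstone remains


def to_oai_header(resource: HarvestableResource) -> ET.Element:
    """Build the OAI-PMH <header> element for one resource."""
    header = ET.Element("header")
    if resource.deleted:
        # OAI-PMH marks deleted records with status="deleted" on the header.
        header.set("status", "deleted")
    ET.SubElement(header, "identifier").text = resource.identifier
    ET.SubElement(header, "datestamp").text = resource.datestamp
    for spec in resource.set_specs:
        ET.SubElement(header, "setSpec").text = spec
    return header


if __name__ == "__main__":
    example = HarvestableResource(
        identifier="http://example.org/fcrepo/rest/obj1",  # made-up URI
        datestamp="2023-03-23T14:00:00Z",
        set_specs=["collection1"],
        deleted=True,
    )
    print(ET.tostring(to_oai_header(example), encoding="unicode"))
```

Note that OAI-PMH restricts `setSpec` values to short colon-separated tokens, so a collection URI would need to be mapped to such a token rather than used directly.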
Sets
- Everyone has a different concept of what “set” is
- Could use Fedora ontology, which is “isMemberOfCollection”
- https://fedora.info/definitions/v4/2015/03/26/oai-pmh
- Doesn’t have isMemberOfCollection but has OAI set outlined
PIDs - all PIDs would be unique IDs
- UMD wants to encourage use of handles to harvest
- Could we add on to the existing ontology?
- Default to the PID, but provide option to override the PID in a configurable way
- Resource being a Fedora object may cause issues for UMD
- PCDM object can be a page in a book or it can be any part of a book
- Maybe define an additional resource in OAI?
- They use PCDM, and there is no distinction between resources/objects that are top level vs. parts of other objects
- But they want to be able to selectively expose parts of an object without exposing the entire object, or vice versa
- DocuTeam
- Have a similar issue in their F6
- Modelling is specialized and they wouldn’t want to run into issues
Main take-away was that both resources and identifiers should probably be defined by new properties, to allow users to opt in to harvesting on a per-resource level
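One way to read this take-away is as a per-resource check: a resource is harvested only if it carries an opt-in property, and its OAI identifier defaults to the PID unless an override property (for example a handle) is present. The sketch below assumes this reading. The property URIs `HARVESTABLE` and `OAI_IDENTIFIER`, the function name, and all example values are hypothetical, since no property names were decided in the meeting.

```python
from typing import Mapping, Optional

# Hypothetical property names; the notes only say that new properties are needed.
HARVESTABLE = "http://example.org/ns/oai#harvestable"
OAI_IDENTIFIER = "http://example.org/ns/oai#identifier"


def oai_identifier(resource_uri: str, properties: Mapping[str, str]) -> Optional[str]:
    """Return the identifier to expose, or None if the resource has not opted in."""
    if properties.get(HARVESTABLE) != "true":
        return None  # not opted in, so never exposed to harvesters
    # Default to the PID (the resource URI) unless an override is configured.
    return properties.get(OAI_IDENTIFIER, resource_uri)


if __name__ == "__main__":
    # A book that opts in and overrides its identifier with a (made-up) handle.
    book = {HARVESTABLE: "true", OAI_IDENTIFIER: "http://hdl.handle.net/0000/example"}
    # A page of that book that has not opted in.
    page: dict = {}
    print(oai_identifier("http://example.org/fcrepo/rest/book1", book))
    print(oai_identifier("http://example.org/fcrepo/rest/book1/page1", page))
```

With this shape, a top-level PCDM object and its parts are treated identically, and exposure is decided entirely by the properties set on each resource.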
Use case for Fedora as a client?
- Will proceed with implementing from the repository side unless other use cases come forward
Will be built as an extension
- How will it communicate with Fedora?
- Persistent storage required for recording deletions?
- XML exports - how to handle these? Do we hold a cache of generated OAI-PMH XML?
- Scheduled vs. manual update process?
- UMD using a lot of asynchronous work and ties back to people’s love of Fedora
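To make the persistent-storage question concrete: if the extension has to report deleted records to harvesters, it needs to remember identifiers and deletion times outside Fedora itself. The following is a minimal sketch of one possible approach using SQLite from the Python standard library. The class name `DeletionStore`, the table layout, and the choice of SQLite are all assumptions for illustration, not something decided in the meeting.

```python
import sqlite3
from typing import List, Optional, Tuple


class DeletionStore:
    """Hypothetical store that remembers which identifiers were deleted, and when."""

    def __init__(self, path: str = ":memory:") -> None:
        # Pass a file path instead of ":memory:" to persist across restarts.
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS deletions ("
            "identifier TEXT PRIMARY KEY, deleted_at TEXT NOT NULL)"
        )

    def record(self, identifier: str, deleted_at: str) -> None:
        """Record a deletion; deleted_at is a UTC timestamp like 2023-03-23T14:00:00Z."""
        with self.conn:  # commits on success
            self.conn.execute(
                "INSERT OR REPLACE INTO deletions VALUES (?, ?)",
                (identifier, deleted_at),
            )

    def since(self, from_: Optional[str] = None) -> List[Tuple[str, str]]:
        """Return (identifier, deleted_at) pairs, optionally only those at or after from_."""
        query = "SELECT identifier, deleted_at FROM deletions"
        params: Tuple[str, ...] = ()
        if from_ is not None:
            # Fixed-format UTC timestamps compare correctly as plain strings.
            query += " WHERE deleted_at >= ?"
            params = (from_,)
        query += " ORDER BY deleted_at"
        return self.conn.execute(query, params).fetchall()


if __name__ == "__main__":
    store = DeletionStore()
    store.record("http://example.org/fcrepo/rest/obj1", "2023-03-01T00:00:00Z")
    store.record("http://example.org/fcrepo/rest/obj2", "2023-03-20T00:00:00Z")
    print(store.since("2023-03-10T00:00:00Z"))
```

A store like this could be updated either on a schedule or in response to repository events, which leaves the scheduled vs. manual question open.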
Berlin - may be able to commit to some documentation
UMD - testing and stakeholder for DPLA use case