...
September 9, 2020
Where
Online via Zoom
Meeting Notes
Agenda/Presentations
All times are CEST.
Time | Topic | Presenter |
---|---|---|
13:00 - 13:10 | Welcome and Introductions | |
13:10 - 13:30 | Fedora Program and Community Update | David Wilcox, LYRASIS |
13:30 - 13:50 | Islandora Updates | Melissa Anez, Islandora Foundation |
13:50 - 14:10 | Phaidra - University of Vienna | Raman Ganguly, University of Vienna |
14:10 - 14:30 | Break | |
14:30 - 14:50 | Memobase 2020: Using Fedora as central repository to store linked data and digital objects | Thomas Bernhart, Docuteam |
14:50 - 15:10 | A field report from the relaunch of an infrastructure: A FEDORA based framework for long-term storage and dissemination of research data | Johannes Stigler, Centre for Information Modelling, University of Graz |
15:10 - 15:30 | Break | |
15:30 - 16:20 | Lightning Talks | |
16:20 - 16:50 | Wrap-up and Discussion | |
Notes
13:10 - 13:30 Fedora Program and Community Update - David Wilcox, LYRASIS
Questions:
- What is the timeline for release of the Fedora 6 beta?
- What will the DB cache be used to store?
- All of Fedora's content and metadata are stored on disk in an OCFL storage root. For performance reasons, Fedora maintains a cache of system and user metadata in a rebuildable database: https://wiki.lyrasis.org/display/FEDORA6x/Configuring+JDBC+Object+Store
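The idea of a rebuildable cache can be illustrated with a minimal sketch. It assumes only what the OCFL specification describes (each object root holds a `0=ocfl_object_*` declaration file and an `inventory.json` with `id` and `head` fields). The table layout and the function name `rebuild_index` are hypothetical and do not reflect Fedora's actual database schema.

```python
import json
import sqlite3
from pathlib import Path


def rebuild_index(storage_root: Path, conn: sqlite3.Connection) -> int:
    """Rebuild a disposable lookup table from the inventories found on disk.

    Hypothetical schema for illustration only; not Fedora's real cache layout.
    """
    conn.execute("DROP TABLE IF EXISTS objects")
    conn.execute("CREATE TABLE objects (id TEXT PRIMARY KEY, head TEXT, path TEXT)")
    count = 0
    for inventory in sorted(Path(storage_root).rglob("inventory.json")):
        # Version directories carry their own inventory copies; index only the
        # copy that sits next to the object declaration file in the object root.
        if not any(inventory.parent.glob("0=ocfl_object_*")):
            continue
        data = json.loads(inventory.read_text(encoding="utf-8"))
        conn.execute(
            "INSERT OR REPLACE INTO objects VALUES (?, ?, ?)",
            (data["id"], data["head"], str(inventory.parent.relative_to(storage_root))),
        )
        count += 1
    conn.commit()
    return count
```

Because the table is derived entirely from the files on disk, it can be dropped and regenerated at any time without losing content.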
13:30 - 13:50 Islandora Updates - Melissa Anez, Islandora Foundation
- Islandora: Drupal + Fedora
- Islandora 7: Vertical stack (cheeseburger)
- Islandora 8: Distributed architecture (bento box)
- Using Fedora for smart storage, fixity, versioning (Memento), and OCFL (in version 6)
- Only putting preservation copies in Fedora; access copies are stored in Drupal
- Flysystem
- Fedora is treated as the Drupal filesystem from the user perspective
- Islandora community
- 320+ installations
- Backed by the Islandora Foundation, which is member supported
- Examples:
- Archives Central → probably the first digital archive using the Records in Contexts ontology in production
- Questions
- Does Islandora make assumptions about the data model used to describe the collection? → The default form is based on MODS, but Islandora is very flexible regarding the data model (use Drupal forms and map them to Fedora properties)
13:50 - 14:10 Phaidra - University of Vienna - Raman Ganguly, University of Vienna
- Phaidra - repository system based on Fedora
- Started in 2007
- Both the repository itself and the services around the repository
- Community: Austria, Italy, Western Balkans
- Architecture
- Changed from monolithic to modular
- Fedora at the centre, surrounded by Phaidra Core
- System of plug-ins and a rich API
- Multiple user interfaces and workflows
- Workflows
- Four-phase workflow model
- Pre-ingest, ingest, management, re-use
- Focused on long tail of research data
14:30 - 14:50 Memobase 2020 - Thomas Bernhart, Docuteam
- Using Fedora to store linked data and digital objects
- Memobase: audio/video collection of Swiss heritage
- http://memobase.ch
- Collaboration between University of Basel, Outermedia, and Docuteam
- Built on Fedora 5
- Data model
- Transition from XML to RDF
- Architecture
- SFTP server where users can upload data
- Transformed in import pipeline and stored in Fedora
- Processed via messages (Kafka cluster)
- Fedora integration
- Fedora ingest service
- Messages (Activity Streams) to event handler
- Metadata extractor
- Challenges
- Mapping external URIs to internal URIs
- Handling references between resources (cannot create relationships to resources that don’t exist yet - this is no longer an issue in Fedora 6)
- When to use different LDP container types
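The first two challenges above can be sketched in a few lines. This is only an illustration under stated assumptions, not the Memobase implementation: the base URL, the hashing scheme, and the function names `internal_uri` and `ingest_order` are all hypothetical. The sketch derives a stable internal URI from each external one, and orders resources so that anything referenced is created before the resource that points to it.

```python
import hashlib
from graphlib import TopologicalSorter

# Hypothetical repository base URL, used only for illustration.
BASE = "http://localhost:8080/rest"


def internal_uri(external_uri: str, base: str = BASE) -> str:
    """Map an external URI to a deterministic, opaque internal URI."""
    digest = hashlib.sha256(external_uri.encode("utf-8")).hexdigest()[:16]
    return f"{base}/{digest}"


def ingest_order(references: dict[str, set[str]]) -> list[str]:
    """Order resources so that every referenced resource is ingested first.

    `references` maps each resource to the resources it points to. Targets
    that are not part of the batch are ignored. A cycle raises
    graphlib.CycleError, in which case a two-pass ingest would be needed.
    """
    known = set(references)
    graph = {res: {t for t in targets if t in known} for res, targets in references.items()}
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    batch = {"record-1": {"institution-1"}, "institution-1": set()}
    for resource in ingest_order(batch):
        print(resource, "->", internal_uri(resource))
```

Ordering the batch up front avoids creating relationships to resources that do not exist yet, which the notes indicate is only a concern before Fedora 6.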
14:50 - 15:10 A field report from the relaunch of an infrastructure: A FEDORA based framework for long-term storage and dissemination of research data - Johannes Stigler, Centre for Information Modelling, University of Graz
- Migrated from Fedora 3 to 4.7
- Using Fedora since version 2
- New versions of Fedora represent a paradigm shift
- Infrastructure
- Fedora is a storage layer in a complex system that stores research data generated by a variety of digital humanities projects
- Based on XML structure
- 114,000 digital objects in Fedora 3, many of which are complex
- Services are triggered by user uploads
- Content models
- Structural definitions for object types
- Manuscript transcripts based on TEI
- Content models trigger workflows
- Content models support specific XML schemas
- Migration project
- Need to migrate Fedora 3 REST API functionality to Fedora 6
- Java-based thin desktop client to handle ingest workflows
- Using Docker and Kubernetes
- Advantage: isolation
- Disadvantages: complexity, difficulty identifying errors, need to learn new practices
- Using Kubernetes for clustering and orchestration
Questions:
- From Gunter Vasold: What software is used for the Media Server?
- Response: I think it is a custom implementation from the University of Basel
15:30 - 16:20 Lightning Talks:
Fedora Commons in the CLARIN-D infrastructure - Thomas Eckhart, Leipzig University
- CLARIN - Common Language Resources and Technology Infrastructure
- CLARIN-D - National consortium in Germany
- Distributed infrastructure with standardized interfaces
- Using OAI-PMH for metadata exchange - most important endpoint
- CLARIN-D (10 centres): majority Fedora 3.x + ProAI
- Migration
- Issues migrating to Fedora 4.x
- Majority still using Fedora 3.x
- Created a requirements document
eHumanities - Jaime Penagos, University Library Ludwig-Maximilian University Munich
- Research data from the digital humanities
- Lexical information from the Alpine region
- Approximately 500,000 datasets
- Many relations between datasets
- Infrastructure
- Used to transform XML to specific file formats
- Fedora 5, Solr, Blacklight
- Apache Camel
- Storing CSVs
- Challenges
- Scalability: issues with messaging queue
- Many concurrent operations can lead to performance degradation
- Questions
- Cluster vs. standalone?
- Impacts of eventual migration to Fedora 6?
Past problems - Fine Future - Oliver Schöner, Berlin State Library
- Many Fedora 4.x repositories at the Berlin State Library
- ITR (East Asian department): Many millions of objects (mostly pages)
- Stabirep: Born digital materials
- RGZ: Prussian judgments
- DC: Bibliographic materials (no binaries)
- Fedora 4.5.1 server shut down due to a power outage
- One file in the Infinispan layer had the wrong OS file owner
- Fedora restarted but Tomcat did not
- Problem was difficult to discover and solve
- If migrations were easier, the data could have been transferred to a new instance with an upgraded Fedora
- Believe that OCFL is the future
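A file-ownership problem like the one described above can be hard to spot by hand. As a minimal sketch (POSIX only; the function name and the idea of scanning the whole data directory are assumptions, not part of the talk), the following lists every entry under a directory whose owner differs from the expected user ID:

```python
import os
from pathlib import Path


def files_with_unexpected_owner(root, expected_uid: int) -> list[Path]:
    """Return all files and directories under `root` not owned by `expected_uid`."""
    unexpected = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = Path(dirpath) / name
            # lstat() inspects the entry itself rather than following symlinks.
            if path.lstat().st_uid != expected_uid:
                unexpected.append(path)
    return unexpected
```

Running such a check against the repository's data directory with the UID of the service account would list any files that the service may be unable to read or write after an unclean shutdown.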
Phaidra / Islandora - Dragana Stolić, University Library "Svetozar Marković"
- Using Phaidra for almost a decade
- Started using Islandora 2 years ago
- Website: https://phaidrabg.bg.ac.rs
- Started in 2011
- PhD theses
- Permanent archiving
- Looking for a more adjustable tool
- Serbian Literary Criticism
- Using Islandora 7
- Self-archiving of literary criticism texts
- Phaidra is very user-friendly and simple
- Islandora is more adjustable
- Archiving ethnographic data
- Ethnographic data is very compatible with Fedora / Phaidra
- Which repository to choose?
- Decided to return to Phaidra when the time comes (probably next year)
Wrap-up and Discussion
- Nice to have an event scheduled around European timezones to hear from the European Fedora community
- Issue with Fedora: too many major version upgrades, hard to absorb
- Many smaller institutions have a hard time installing and setting up Fedora; it is not out-of-the-box enough
- ISLE provides a Docker deployment of Islandora
- Phaidra also looking into Docker deployment
- Some institutions choose Fedora over an out-of-the-box solution in order to achieve a lot of flexibility
- How often should user group meetings be held?
- An open forum to ask questions
- Meetings should occur at least once per year, or perhaps once every 6 months
- More discussion on problems and open issues, difficulties
- Hold another call to discuss migrations once institutions have some initial experience with them