Fedora Group Discussion

Attendees:

  • David Wilcox
  • Andrew Woods
  • Robin R.
  • Jim Tuttle
  • Rob Cart.
  • xx
  • xx
  • xx
  • Jon Dunn
  • Brian Schott
  • Tim McGeary
  • xx
  • Susan Lafferty
  • Joan (from Hong Kong)
  • Tom Cramer
  • Michael Klein
  • Peter Murray
  • xx
  • Debra
  • Tom (from ICPSR)
  • xx
  • John Doyle
  • Karim Boughida

Fedora 4 Pilot Projects

  • context: running pilots for several months now, second set
  • first set were beta prior to Fedora 4 release with 4 institutions
  • goal of the next major Fedora 4 release, due in June: support Fedora 3 to 4 migrations
  • new features will be fewer and matter less than supporting the migration efforts
  • “upgration” = software upgrade plus content migration
  • pilots are focused on taking a small set of collections and migrating them to Fedora 4 in order to generate feedback, document bugs, help develop migration tools, and help create documentation
  • there are other installations happening that aren’t upgration pilots but are still doing interesting things
  • results of survey
    • survey drew responses from 54 institutions (only half were members)
    • mixed bag in terms of when/timeline for upgrading/migrating among respondents
    • everyone wants documentation of path from Fedora 3 to 4
    • everyone wants a tool to assist automating migration (configurable command line tool)
    • training for this was also of high interest (allocated 10k in budget to develop training curriculum)
    • survey helped validate current efforts: the features already in Fedora 4 or on the immediate roadmap, plus support, are what respondents are trying to achieve
    • Andrew and David had conducted four 1-hour training sessions/modules in the previous year to give a sense of the type of training that was needed (3 workshops: DC user group meeting, Islandora camp, Australia e-research conference)
    • 30-40 attendees to each
    • continue to do these 1-day workshops as they are a great way to reach out and bring in new interest
  • right now there are 5 pilot institutions: Columbia, UNSW, Simon Fraser, other, National Library of Wales
  • pilots kicked off in January and wrap up in April
  • the Hydra and Islandora RDF way of modeling data (content modeling) is the backbone of migrating content from Fedora 3 to Fedora 4; it is now called the Portland Common Data Model (anyone doing data modeling around linked data could really use this)
    • big win for interoperability, as everyone is modeling data in the same way under the hood
    • very bottom up, community driven approach for this
  • Question: is a version of the data model available that a non-developer could consume?
  • Question: who is ready to move now? vs. who is waiting?
    • move objects as-is and then consider moving to the Portland data model (do the migration, then remodel the data)
    • Fedora 3 objects map to Fedora 4 containers; Fedora 3 datastreams map to Fedora 4 binaries
    • Rob notes that they don’t want to do two data modeling phases in two years, so they will do a good bit of the modeling during the migration and then just update the model post-migration (i.e. take the hit now).
    • Susan notes that they are likely doing flat data migration and then focus on integrations of the repository later this year (for what the repository already integrates with).
    • Joan from Hong Kong is remodeling data now anyway, so are in a good position to adopt a lightweight, more accurate approach.
  • not sure how to handle books, an older version of Hydra, and use of disseminators; need to keep current with Avalon; lots of projects/needs/timelines may not be able to make the move together, despite the desire to keep everything in the same repository
    • need to minimize the period of time running both Fedora repository versions (“chasing tails until achieve escape velocity”)
  • Lyrasis is running multiple 3.x repos for its members (would like to consolidate infrastructure into one Fedora using a hierarchical structure); doesn’t have bandwidth as it is focused on the CollectionSpace project, so is looking to others for experience in moving/consolidating
  • generally, Fedora is moving to a standards-based approach so that plugging in “repository services” is easier, creating standard connections so that there are synergies between Hydra and Islandora, etc.
  • consensus is that the PCDM (Portland Common Data Model) might be the most important development in the Fedora community
    • PCDM = relationship model that helps define how objects relate to one another, with RDF assertions on how they can relate
  • part of the goal of the pilot projects is to document processes, goals, what data looks like in Fedora 3 vs. 4, etc.; a summary document will be made afterward for each
  • pilots have been successful in identifying gaps, creating documents, and generating reports, so it is likely that pilot phases will continue as seems appropriate
  • John Doyle asks: the Portland model exists on the mailing list; for someone like NLM, who is not Hydra/Islandora, how can they consider this and know how to apply it?
  • Susan asks: this sounds very North American; how can we make sure this is international?
  • Tom notes that there is a distillation of this ongoing discussion and mailing list details on the wiki
    • additionally making sure an international perspective is shared on the mailing list to build consensus will be vital to make sure that what is developed is international
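
The as-is mapping noted above (Fedora 3 objects to Fedora 4 containers, Fedora 3 datastreams to Fedora 4 binaries) can be sketched roughly as follows. This is only an illustration in Python: the class names, the path scheme, and the `migrate_as_is` function are hypothetical and are not part of Fedora or of any migration tool discussed at the meeting.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Fedora3Datastream:
    dsid: str        # datastream ID, e.g. "DC" or "OBJ"
    mime_type: str
    content: bytes


@dataclass
class Fedora3Object:
    pid: str         # persistent ID, e.g. "demo:1"
    datastreams: List[Fedora3Datastream] = field(default_factory=list)


@dataclass
class Fedora4Binary:
    path: str
    mime_type: str
    content: bytes


@dataclass
class Fedora4Container:
    path: str
    binaries: List[Fedora4Binary] = field(default_factory=list)


def migrate_as_is(obj: Fedora3Object) -> Fedora4Container:
    """Map one Fedora 3 object to one container, one datastream to one binary."""
    # Assumed path scheme (hypothetical): swap the PID's colon for an underscore.
    container_path = "/" + obj.pid.replace(":", "_")
    container = Fedora4Container(path=container_path)
    for ds in obj.datastreams:
        container.binaries.append(
            Fedora4Binary(
                path=f"{container_path}/{ds.dsid}",
                mime_type=ds.mime_type,
                content=ds.content,
            )
        )
    return container
```

Nothing is remodeled here; any move toward the Portland model would be a separate, later step, which matches the "migrate first, then remodel" option raised in the discussion.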

Fedora for Future

  • originally conceived Fedora 4 as three year plan
  • time to think about what comes next, years 4-9
  • Tom from ICPSR notes that data analytics and curation processes would be of interest, as would ways to integrate with data in other Fedoras, etc.
  • deposit, enrich, and enhance over time
  • Brian asked: to what extent has versioning been discussed (as in his repo he has 4-5 versions of data)
  • Andrew responds: Fedora 4 does support versions. The other question is what to do if you have Fedora 3 or a homegrown DAMS with versions: the migration path is recreating the versions in Fedora 4 (rolling through them). Presumably you want to keep the original timestamps, but the real creation timestamp can’t be modified, so you need to add additional description to capture the original timestamp
  • no versions with federated external content (also known as projection) is a limitation
  • IIIF is being discussed as being baked in, and Brian thinks this is a good thing for moving forward with international interoperability
  • Tom from ICPSR asks how closely Fedora 4 complies with the OAIS framework (TDR ISO standard)
  • can we build a reference implementation that ties to the OAIS model? this is an interesting opportunity to map out this standard
  • best practices for compliance with this would be a good community collaborative effort
  • Jon Dunn notes that they have some users who have a requirement for a certified repository (typically archives that had previously had physical content), something to reassure this audience
  • Brian notes that this is good for internal users, but will be much more important for external users (such as feds, etc.)
  • 2% of annual budget once (18-month process)
  • Jon notes things like an audit service, alongside more common tools
  • UMich went through entire process of OAIS with their repository which was a long process just to perform this analysis
  • Peter notes synchronizing resource capability (take advantage of federation tools) would be nice to have
  • content transportation/sharing?
  • Rob notes that Columbia is going to be working with the local DPLA service hub, so propagating content out will become a regular process
  • Robin notes that the tooling that we use to disseminate content won’t be part of Fedora core
  • performance, throughput, and scaling will be what matters at the Fedora core/clustering
  • messaging, notifications, etc.
  • other things that people are hearing/needing
  • Susan notes metrics, metrics, metrics, prove the impact of the repository, ways for people to enhance academic profile, justify investment in repository
  • enhance formal metrics, perhaps push out to systems like VIVO
  • Brian is hearing a lot more about open access within state of CA; corollary is hearing from users that the barrier to entry must be lowered, so working with CPL and others to solve this issue
  • if Fedora could facilitate ease of depositing content into open access
  • a possibility is SWORD and there is a project underway for developing a plug-in to Fedora 4
  • when people say repository, they mean MANY different things
  • Jon notes a big interest in how to make Fedora fit better into research storage infrastructure (backend), but more on front-end when research is more in “scratch” environment
    • projection, asynchronous storage support, models for how to make that frictionless
  • Tim McG. notes that asynchronous is key
  • Rob notes access control issues, with the curatorial aspect layered on top (add metadata to research earlier in the process) so that it can be moved into the repository earlier
    • gray area in middle of “active storage” where two sides are meeting
  • SIDORA seemed to be at edge of this problem
    • it has been heavily customized, and they are now working to standardize on the current version of Islandora over Fedora 3
    • this system is one that researchers love and has gotten a lot of traction
  • Robin notes that Seed/cede project is doing similar work
  • Fedora 4 native linked data capacity and the triple store discussion
    • Stanford is looking for open annotation (converting MARC to BIBFRAME, loading it into Fedora, and seeing what happens)
  • some conversations for using Fedora as linked data platform at UVA as well
  • opportunity for Fedora working group around linked data
  • Fedora is an RDF store and in Stanford’s case there are a lot of triples
  • Jonathan notes that the VIVO community has experience with triple stores
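
Andrew's point about migrating versions (roll through the old versions in order, and carry the original timestamp as extra description because the repository's own creation timestamp cannot be set) might look something like the sketch below. It is a plain-Python illustration only: `SourceVersion`, `MigratedVersion`, `replay_versions`, and the `original_created` property are hypothetical names, not Fedora 4 API.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, List, Optional


@dataclass
class SourceVersion:
    content: bytes
    created: datetime            # timestamp recorded by the source system


@dataclass
class MigratedVersion:
    content: bytes
    system_created: datetime     # assigned by the target at write time
    original_created: datetime   # extra description preserving the source timestamp


def replay_versions(
    versions: List[SourceVersion],
    now: Optional[Callable[[], datetime]] = None,
) -> List[MigratedVersion]:
    """Recreate versions oldest-first, keeping each original timestamp alongside."""
    clock = now or (lambda: datetime.now(timezone.utc))
    migrated = []
    for v in sorted(versions, key=lambda item: item.created):
        # The target stamps its own creation time; the source time rides along.
        migrated.append(MigratedVersion(v.content, clock(), v.created))
    return migrated
```

The `now` parameter exists only so the clock can be replaced in tests; a real migration would rely on whatever timestamp the repository assigns.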

Fedora Sustainability

  • what does sustainability mean?
  • more sustainable or at the appropriate level
  • what are the important values to sustain over time?
  • expediency does not foster sustainability (what are we doing or not doing that impacts sustainability?)
  • would a unified data model help with creating a sustainable product over time?
  • are we making expedient choices in software development, and how do they impact the long term of the platform?
  • Robin notes that a long-term choice at UVA has been dedicating resources to the project over the long term; UVA has gained experience and been able to lead development, has been committed to and supportive of evolving the project to UVA’s benefit, and has improved the soft skills of those participating
  • Carl notes the need to segment users, identify needs/requirements, and ask what we (Fedora) do to meet those needs (align product with market needs); sustainability is a bigger user base
  • Tom from ICPSR notes changing the mindset a bit within large institutions about “moving people resources”: focus on supporting the community project vs. supporting local homegrown systems
  • Jon D. notes lots of projects competing for internal resources, better messaging from DuraSpace at higher levels of leadership around what is really needed to sustain Fedora 4 would help
  • membership comes with benefits and responsibilities, what are our collective responsibilities to sustain the project?
    • define the services that Fedora provides (come to agreement), align implementation to existing standards, reference implementation, take pieces of implementation and swap out pieces supported by other communities (example given is databases), so that limited development resources can move up the stack to create more services that make Fedora more useful
  • Brian posits the idea of meeting the needs of the market (non-members) who need migration services (likely even to have the migration handled for them), generate revenue, and feed that back to open source side
  • how to deal with pipeline of tasks with limited development resources
  • Tom notes expanding market vs. losing control/gaining feedback
  • need to have key stakeholders to sustain project
  • Carl notes possibility to rally around a project that he can pitch to administration (with visible outcomes)
  • sustainability is a marathon not a sprint
  • people gather together around an identified problem
    • Fedora has two efforts: specific feature development sprints and more ongoing maintenance sprints
  • the leadership group has to assist in providing direction and insight for the future roadmap (as resources are likely mostly coming from this group and a bit from outside)
  • Susan notes the need to keep an eye on what stakeholders want, so that the project remains sustainable and whatever is built is usable and aligned with national/international trends
  • is there a way to contribute other than money? need to revisit messaging
  • governance was non-expedient effort, took time, but was absolutely necessary, journey of governance has just begun, still in transition period
  • keep an eye on how the organization is being led, ways governance model could be updated, revised, to make sure representation is correct, BUILD TRUST (do we trust this system)
    • openness, transparency, representation, so governance is part of that process
  • survey was very well received/responded to; these sorts of specific, targeted surveys are worthwhile where there is a need (and David’s position handles some of this)