Governance/Committers Call

17/03/2022


Discussion Topic: How to best begin collecting information on Fedora user installations


Attendees:

Daniel Bernstein

Arran Griffith

Tim Shearer

Jennifer Gilbert

Jared Whiklo

Demian Katz

Robin Ruggaber

Calvin Xu

Mike Ritter

Ben Pennell

Thomas Bernhart

Additional input sent to Arran from:

Scott Prater

Jakov Vezic

Notes:

Known concerns:

  • Privacy, specifically in Europe and the UK
  • Community: we want to build the community, not damage it by doing something members don’t agree with
  • Definitions of what counts as data
    • E.g. is an IP address personal data?
  • Additional stakeholder groups
    • Fedora sits deep in the stack, so how do Islandora/Samvera installs know we’re there, and how do we engage with them to get the information?
  • Who has access to this data?
    • Can sister communities come and ask for this data?
    • Where does it live?


  • Involving the tech community in this conversation is a key priority, so that we know what is possible and what is not


  • Jakov’s notes
    • From a technical point of view: totally doable
    • If it were “opt-in” we could just provide a pop-up
    • Dockerized instances make this more difficult because 
      • Could provide a flag to override
    • Ben: Our Fedora instances are on servers which are probably blocked by firewalls from phoning home
    • Could end up with a large number of entries that are random because it’s a self-reporting system
      • Maybe need to put parameters on data collection over time to track active installations
    • IP address possibly not as useful as we think nowadays
      • User supplied contact and organizational information and making it easy for people to know where to go to join the community seem more useful than IPs?
    • Sounds like a good suggestion because we are simply inserting ourselves into the installation process
    • Upgrade process - would you need to re-opt-in? Or could we just carry it over?


  • Fedora instance reporting tool - separate .jar file
    • Stands alone and you run it independently


  • Jared - had this discussion with Tech group and with Islandora
    • Many institutions will not want something running without notice
    • Firewalls will block getting this information in the first place
    • Language - places where English is not a first language may not know
      • Would want to provide explicit reasoning behind why 
    • Maybe have levels of information that people can offer
      • Ie. a base level of info and then people can offer up additional levels as they feel comfortable (more than one person liked this idea)


  • Thomas - want to have an option for people who WANT to provide this information


  • Can we leverage downloads?
    • Put in something less intrusive, maybe something form-based with user input
    • We can tie a link or opt-in prompt to new downloads (like a reminder)


  • Maybe have a form in the Fedora admin UI that displays on the landing page, and as a prompt in the header, until it is filled out the first time (it could link out to a form hosted by LYRASIS if Fedora can’t communicate with the internet). There could then be an easy-to-use, optional data release option in the UI and API.


  • Click-through would be easier to get people to do than manual input
    • Pre-filled information


  • How would we be able to tell the difference between someone’s personal web browser and where Fedora is running?


  • Fedora distributed through Maven
  • People also download it off of the releases page in GitHub


  • Having prompts in the Fedora UI would be a long-term reminder. Maybe there could be something that displays on GET requests to the root of the repository too.


  • We do realize that we are looking toward the future and we recognize that we may never know what we have in past versions


  • What data does not present risk?
    • Version
    • Continent


  • Scott’s notes
    • Many, if maybe not a majority, of the Fedora 3 repositories out in the wild have been inherited by IT groups, digital archivists and collections managers who may not be plugged into the community via email lists, etc.
    • So I'd recommend an outreach effort that focused on publicizing Fedora 6 and migration help outside the usual channels
      • posting to forums like code4lib, the DSpace forum (many institutions that run DSpace also run Fedora), NDSA lists, PASIG lists, Samvera, Islandora, etc.
      • any place where a repo manager, of any skill level, might be listening.
    • Digital curation listserv as well


  • End question to consider: what is the risk to the program if we don’t collect information?