Date:

Attendees: Huda, Lynette, Tim, Jason, Simeon

Regrets: Steven, Greg

Discovery (WP3)

  • https://github.com/LD4P/discovery/projects/2 for issues etc. 
  • Draft of a discovery plan: https://docs.google.com/document/d/1zKYW7FQVVNvyd0XjjW0qWznX9PC3jbmOE6Kz_yygPjs/edit?usp=sharing
  • Research: how to go from knowledge graph to an index
  • DASH! (Displaying Authorities Seamlessly Here)
    • Dashboard design meeting kickoff notes
    • User reps D&A meeting: Expect next follow-up in August (slides are from the user reps meeting of 2021-04-09; the result was "not no")
    • https://docs.google.com/document/d/1PgQi3xobsPhr9DUHU_YGeimL1OjNiiTdkiNWb36r3Gg/edit
    • Usability testing and followup for DASH: Usability results
      • Usability results, a few little things to finish up
      • GitHub issues
      • 2021-09-24: Huda emailed Lenora again and mentioned the option for folks to email her feedback directly. Have not heard back yet (email was sent yesterday)
      • 2021-10-01: Last email discussion: Lenora said she would try to get feedback from user reps. We have requested feedback by 10/11 since the next D&A sprint starts 11/1.
    • 2020-10-01: Separately, need to finish documentation of work done so far and set up demo video.
  • BANG! (Bibliographic Aspects Newly GUI'd)
    • Jamboard link
    • Expect to include Works. Need to do something beyond what we already have live from the OCLC concordance data.
    • 2021-10-01: Experimented with various queries. To retrieve sets of related ISBNs, queried for the following relationship: an Opus that has two separate works, where each work has an instance with an ISBN. Parsed the results to create a CSV where each line starts with an ISBN and is followed by all the other related ISBNs (based on the query above). Set up front-end code that takes an ISBN from the catalog, looks for any line with that ISBN and returns the entire set, then does an ISBN field "OR" query against the Solr index to return any matches with their titles.
      • Questions to explore:
        • What is the goal of an Opus? And is this conceptually useful for users?
        • What is the data quality (both of the Opus data and connection via ISBN)?
        • What is the gap in data for CUL?
        • What is a good UI for display of this data? And should there be different UI for translations vs. based on, etc.?
      • Next steps: Huda will look for more specific relationships in the data (e.g. LCCN matching). Huda/Tim to explore UI options. Also look for definitions of opuses and hubs
    • 2021-10-08:
      • Steven: I began to gather properties in the PCC data for different instances of the same work. Shared with Huda, and will work to fine tune it next week. Re: PCC non-RDA Entities, the spreadsheet is almost ready to send to Kevin for feedback; we're mostly working on examples to help clarify the scope/range of our definitions.
      • Tim looked at SPARQL queries for different types of work
      • Asked on the ShareVDE Slack about an example of instances grouped together that don't look equivalent (different works by Tony Oursler).
        • Answer: "When the bibliographic records do not have the tag 240 (uniform title) the matching criteria for Opus is made using tag 245 $a. So, this is the reason why your bibliographic records are grouped under the same opus "Tony Oursler" ... The matching criteria during Opus creation does not take into account values inside $b subfield of tag 245."
        • Generally, they are meant to be "functionally equivalent"
        • Our conclusion is that this grouping is an artifact of lax matching criteria (e.g. ignoring the subtitle in $b); the MARC itself is appropriate. We likely need to understand how often misleading groupings like this occur. Huda will follow up with additional questions about the cases where there isn't a uniform title (240)
      • LCCN data processing and display:
        • Processed LCCNs the same way as ISBNs: retrieved all LCCNs where an Opus has two works and the works have instances with LCCNs. Incorporated the result into the application in the same way
      • To do: Make the UI for the "related works" section look similar in the types of information returned for ISBNs, LCCNs, and maybe related works. Also look at problematic cases (e.g. "Geography" shouldn't be related to "Hamlet") to see why this is happening.
  • DAG Calls
    • 2021-10-01: Next two meetings will be usability/user research focused: 10/12 with LINCS, 10/26 with Harvard image research/IIIF D4H user research.
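
A minimal sketch of the ISBN lookup described in the 2021-10-01 BANG! notes, assuming a CSV in which each line holds one set of related ISBNs. The function names, the sample ISBNs, and the Solr field name `isbn_t` are hypothetical; the notes do not give the actual front-end code or index schema.

```python
import csv
import io


def load_isbn_sets(csv_text):
    """Map every ISBN to the union of all CSV lines it appears on."""
    index = {}
    for row in csv.reader(io.StringIO(csv_text)):
        isbns = [cell.strip() for cell in row if cell.strip()]
        for isbn in isbns:
            index.setdefault(isbn, set()).update(isbns)
    return index


def related_isbns(index, isbn):
    """Return the entire set for an ISBN (including itself), sorted."""
    return sorted(index.get(isbn, set()))


def solr_or_query(field, values):
    """Build a fielded OR query string; None when there is nothing to query."""
    if not values:
        return None
    joined = " OR ".join(f'"{value}"' for value in values)
    return f"{field}:({joined})"


# Hypothetical sample data: two sets of related ISBNs.
sample_csv = "111,222,333\n444,555\n"
isbn_index = load_isbn_sets(sample_csv)
query = solr_or_query("isbn_t", related_isbns(isbn_index, "222"))
```

The resulting string could be sent as the `q` parameter of a Solr select request. The same approach would apply to the LCCN sets, with a different field name.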

Linked-Data Authority Support (WP2)

  • QA Sinopia Collaboration - Support and evolve QA+cache instance for use with Sinopia
    • 2021-10-08 - Met with Stanford with topics...
      • There have been a number of updates in support of Sinopia. These include returning tagged literals as tagged literals (e.g. "milk"@en) instead of moving the tag into the string (e.g. "milk@en"); updating Sinopia's auth config to expose the LOCVOCABS Script subauth (Steven's PR #3129 adds this to Sinopia); and testing Jim Hahn's publisher city select list in lookup-int. There was an issue with all direct-access auths due to a problem with the Let's Encrypt root certificate. This was an AWS issue; a hotfix was applied, which cleared the error.
      • Remaining work requested by Sinopia includes case insensitivity which is ready and will be deployed with a reindex of the cache; cache search documentation; common naming of OCLC direct and cache subauth names to reduce user confusion; explore and fix results coming from multiple authorities. 
      • With the release of ShareVDE 2.0, we spent time discussing the impact, questions, and what is next in moving it forward.  The immediate first step is for Dave to ingest the update into the cache.
  • Best Practices for Authoritative Data working group (focus on Change Management)
    • 2021-10-08
      • Working on describing the Notification system.  Reviewing IIIF Activity Streams documentation to see what style we want to use for the final document.  I find this document easier to read than the vocabulary organization.
      • Reviewed LOC activity streams
      • Input requested:  I would like to have a domain for the documentation and the context definitions (e.g. IIIF context).  I'm hesitant to put this under LD4P because the audience and participants go beyond LD4P.  Potential name: Authoritative Data Frameworks.  This seems general enough to cover change management and other tools.
        • Simeon will ask LD4 Steering about possible use of the LD4 GitHub (https://github.com/ld4) as a home for vocabulary terms associated with the WG AS terms.
      • Steven: This week I worked with Lynette to clarify the concern with RWOs. I wrote something up that took into account what LOC and Getty are doing, while trying to make the distinction that the RDF is what is updated, not the RWOs. My understanding of the Getty implementation isn't perfect, and after working with Lynette, it seems they aren't bothering to worry about the ActivityStream declaring that a Person was updated. Their URIs for the objects and concepts for places/people/orgs/etc. look jumbled to my eye. here: https://github.com/LD4P/sinopia_editor/pull/3129/commits/2275a08a31867c20af269cd76557ef090ee38e32.
  • Cache Containerization Plan - Develop a sustainable solution that others can deploy
    • 2021-10-08
      • Removed AWS ECR repos that we won't be using. Added private repos for -int and -stg. The GitHub Actions scripts that auto-update these images on push to the dev and main branches, respectively, are functional. The script to push to the public repo on release is pending resolution of login to the public ECR repo, which uses a different login system.
      • Identified several items that need to be extracted from the container to allow for site-specific configuration of the app (e.g. show/hide graphs, dis/enable performance stats gathering, etc.) and additions of new non-config-driven authorities (e.g. Wikidata, Sinopia's property auths, publisher cities select list). I am currently working on extracting the configurations into environment variables, which fits into the current architecture of the container setup. I will work with Greg on how best to address the extensions. Part of it will fit into the current authorities volume. The part that drives the listing of the non-config-driven authorities in the AuthorityList requires overriding a class. We don't have this set up.
      • We have a plan in place for moving the containerization process forward.  The state of the plan so far is...
        • DONE - Lynette will clean up the images in ECR
        • PARTIALLY DONE - Lynette will get GitHub Actions deploying images.
          • DONE for the private repos. Still working on the public repo.
        • IN PROCESS - Lynette will update the env file to allow for initializers to draw their values from that file.
        • Then (likely next week) Greg can walk Lynette through setting up -int.  This should identify…
          • what can be done by a moderately privileged user and what has to be done by a sys ops user
          • what changes/additions to documentation are needed
        • Then Lynette will set up -stg except where sys ops privileges are required to proceed.
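
A minimal sketch of the environment-variable approach described under the Cache Containerization Plan, written in Python purely for illustration. The variable names `SHOW_GRAPHS` and `GATHER_PERFORMANCE_STATS`, the defaults, and the accepted truthy values are all assumptions; the actual env file and initializers are not shown in these notes.

```python
import os

# Assumed spellings that count as "enabled"; anything else is treated as off.
TRUE_VALUES = {"1", "true", "yes", "on"}


def env_flag(name, default=False, environ=None):
    """Read a boolean site setting from the environment, with a fallback."""
    environ = os.environ if environ is None else environ
    raw = environ.get(name)
    if raw is None or raw.strip() == "":
        return default
    return raw.strip().lower() in TRUE_VALUES


def load_site_config(environ=None):
    """Collect the site-specific switches (hypothetical names) in one place."""
    return {
        "show_graphs": env_flag("SHOW_GRAPHS", default=True, environ=environ),
        "gather_performance_stats": env_flag(
            "GATHER_PERFORMANCE_STATS", default=False, environ=environ
        ),
    }


# With no overrides set, the defaults apply.
defaults = load_site_config({})
# A site that hides graphs and turns on stats gathering.
custom = load_site_config({"SHOW_GRAPHS": "false", "GATHER_PERFORMANCE_STATS": "1"})
```

Keeping the switches outside the image in this way would let -int, -stg, and other sites share one container build while differing only in their env files.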

Other Topics

  • Sinolio - Sinopia-FOLIO
    • 2021-10-01 Sprint review this afternoon. Recent meeting to try to better describe Sinopia-FOLIO connection in a diagram
  • OCLC Linked Data / Entities Advisory Group
    • 2021-09-24 Steven will follow up on testing for this round
  • PCC 
    • 2021-10-01: Task Group on Non-RDA Entities - definitions firmed up, agreed on non-URI codes, reflecting those in the spreadsheet over the next week, and working with Kevin on specific issues converting to RDF
  • Authorities in FOLIO
    • Hope to include URIs as part of Cornell FOLIO migration, possible LD4P work
    • 2021-09-17: Met last week. Frances has a database she is using for discovery. She is going to start looking at a strategy for parsing the weekly Peter Ward files and updating her database, and will then test whether that database can serve as a source. Meeting every other week.

Upcoming meetings

  • https://kula.uvic.ca/index.php/kula/announcement/view/1 - Call for Proposals - Special Issue: "The Metadata Issue: Metadata as Knowledge".
  • SWIB (11/29-12/3) virtual again this year - No proposals submitted
  • Virtual Blacklight Summit on 8-10 November. Last year we gave an institutional update from Cornell; we can do this again and/or propose a session on the LD4P discovery work. Informal CFP has gone out.
    • Huda Khan has filled out the form to offer an institutional demo and possibly to demo work
    • Demo videos due by 11/3

Next Meeting(s), anyone out?:

  • 2021-10-15 Simeon will be out, Jason will run