Date:

Attendees:  Huda, Lynette, Jason, Steven, Simeon

Regrets:  Greg

Discovery (WP3)

  • https://github.com/LD4P/discovery/projects/2 for issues etc. 
  • Draft of a discovery plan: https://docs.google.com/document/d/1zKYW7FQVVNvyd0XjjW0qWznX9PC3jbmOE6Kz_yygPjs/edit?usp=sharing
  • Research: how to go from knowledge graph to an index
  • DASH! (Displaying Authorities Seamlessly Here)
    • 2022-01-12
      • Work and video done. Huda is finishing up some documentation, including how the index is set up and where connections are made
    • 2022-01-21 
      • Only things left are formatting of writeup and finishing off documentation
  • BANG! (Bibliographic Aspects Newly GUI'd)
    • Jamboard link
    • Expect to include Works. Need to do something beyond what we already have live from the OCLC concordance data.
    • 2021-12-10
      • To dos: Still need to use the work to work relationships data (which is based on a partial query run) (i.e. related, etc.) and bring that to the page.  Will try to get that done by end of year. 
        • Using Wikidata relationships between item and IMDB to get appropriate IMDB link 
        • Design work/review options for displaying information to users (works/instances and how/whether to modify the current catalog view focus on item)
    • 2022-01-21
      • Starting data source writeup
        • Overview tries to get at counts for information. 
        • Started some diagrams
        • Deep thought seems to perform better for some of these queries so used that for some of the counts
      • Decided to take work to work relationship ISBNs to see which pairs had matches in our catalog (i.e. work1 and work2 have a relationship, work1 has an instance i1, work2 has an instance i2, result = ISBNs for i1 and i2). Saw overlap between ISBNs from different works and am tracking that down with Steven.
      • Also: ran work to work relationship query again (where first work has instance with ISBN) to get complete result. Used this approach because trying to run a full query was taking forever. Second script tries to get instances with ISBNs for second works to create two groups of ISBNs and the predicate relating the two works
        • 33,395 lines in result (i.e. potential sets of related ISBNs)
        • But: Need to review to see how many of these include the same ISBN on both sides of the relationship (i.e. run yet another script)
        • And: How many of these matching groups will have catalog results
      • Will also run comparable queries on Stanford CKB data (the latest ShareVDE data)
      • Besides the report, will work on bringing into the demo the work to work relationships above, the Wikidata queries Tim and Steven had reviewed (for properties not usually found in our catalog), and the IMDB connections
      • Have focused on DASH recently
      • Curious about OpenAlex KB https://openalex.org/ and https://blog.ourresearch.org/openalex-launch/. "We’ve now built around a simple new five-entity model: works, authors, venues (journals and repositories), institutions, and concepts."
      • Possibly create a report describing all of the data sources examined, their affordances, limitations, overlap, etc. Steven notes the significance of describing relationships and where useful data is found in the model.
      • Design issues for works data - e.g. how to show to users different types of instance without one type dominating, how to build indexes, cataloging practices
  • DAG Calls
    • 2022-01-14 - Next call next Tuesday; will focus on topic roadmap and then talk about data (where does it come from, how does it show in discovery, and what other sources could be used?)
    • 2022-01-21 - Roadmap review and slide to talk about library systems (very high level) and what data we get where (last slide has comments)
    • Upcoming (in progress): Crossover episodes (DAG/WAG) (announcements forthcoming). Also session with archival focus: 3/1.
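
The BANG! review step noted above (checking how many related-ISBN lines carry the same ISBN on both sides of the relationship) could be sketched roughly as follows. This is a minimal sketch, not the script actually used: the tab-separated line layout, the pipe separator, and the names `RelatedIsbns`, `parse_line` and `overlapping` are all assumptions made for illustration.

```python
from typing import Iterable, Iterator, NamedTuple, Tuple


class RelatedIsbns(NamedTuple):
    """One result line: ISBNs for work 1, the relating predicate, ISBNs for work 2."""
    left: frozenset
    predicate: str
    right: frozenset


def normalize(isbn: str) -> str:
    # Drop hyphens and spaces and uppercase a trailing "x" so equal ISBNs compare equal.
    return isbn.replace("-", "").replace(" ", "").upper()


def to_isbn_set(field: str) -> frozenset:
    # Assumed separator: ISBNs within one side are joined with "|".
    return frozenset(normalize(i) for i in field.split("|") if i.strip())


def parse_line(line: str) -> RelatedIsbns:
    # Assumed (hypothetical) layout: "isbn|isbn<TAB>predicate<TAB>isbn|isbn"
    left, predicate, right = line.rstrip("\n").split("\t")
    return RelatedIsbns(to_isbn_set(left), predicate, to_isbn_set(right))


def overlapping(rows: Iterable[RelatedIsbns]) -> Iterator[Tuple[RelatedIsbns, frozenset]]:
    """Yield (row, shared ISBNs) for rows with the same ISBN on both sides."""
    for row in rows:
        shared = row.left & row.right
        if shared:
            yield row, shared


if __name__ == "__main__":
    # Made-up sample lines; the predicates are placeholders, not real query output.
    sample = [
        "978-0-13-110362-7|0131103628\tex:relatedTo\t9780131103627",
        "0262033844\tex:translationOf\t0262533057",
    ]
    rows = [parse_line(line) for line in sample]
    for row, shared in overlapping(rows):
        print(row.predicate, sorted(shared))
```

Note that this only compares normalized strings, so an ISBN-10 and the ISBN-13 for the same instance would not be counted as a match without an extra conversion step.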

Linked-Data Authority Support (WP2)

  • Qa Sinopia Collaboration
    • 2022-01-21 
      • Met this week to talk about new vocab requests. General consensus is that there are unlikely to be many requests because we've already hit the big ones. Known current requests that are a priority include the new ISNI RDF download and Homosaurus. There are existing requests that we probably should check in on to see if they are still needed and then determine priority (i.e. RBMS, ORCID, DCMI, CCL, GAMECIP, OLAC's Videogame Genre)
        • Steven will review additional suggestions and discuss with Nancy
      • Dave has already created a triple store of the new ISNI RDF download, including a Fuseki endpoint. Steven did the review for extended context, https://github.com/LD4P/qa_server/issues/14#issuecomment-1018554436. It is in Dave's court for generating the index. Want support for searching by URI.
      • Dave is looking to switch indexing from Lucene to Elasticsearch. He wants to do this before starting the containerization process.
        • Lynette to discuss ongoing maintenance issues for ES vs. Lucene; we have mostly Solr expertise in CUL
      • Dave believes he fixed a race condition that was causing the 500 errors. I'm not seeing the EOFError in the logs, but I am seeing RDF::Graph#load failure: IOError. So something else may be going on.
      • I reported suspicious access to SYSOPTS (e.g. requests to wp, python, .env, etc.). More info at DLITSYS-4557. No response from SYSOPTS yet.
  • Best Practices for Authoritative Data working group (focus on Change Management) 
    • 2022-01-21
      • We meet next Monday.
      • There was discussion in Slack about the usage of Add vs. Create.  I believe we have settled on Create indicating the entity is brand new and Add indicating that the entity wasn't available and now it is (e.g. permission change, temporary removal reinstated, etc.)  I expect Remove and Delete to follow a similar pattern.
  • Containerization
    • 2022-01-21 
      • Greg out for a month. 
      • I checked in with Justin about progress.  He stated... "I never got it to work correctly.  I think the problem came down to networking."
      • Next up is containerization of cache.  Probably starting in Feb.

...

  • Github branch renaming
    • See Renaming of LD4P Repositories – cul/blacklight-cornell now uses main but still has the old master branch too
    • Expect to be able to switch this in LD4P as part of D&A work
  • Sinolio - Sinopia-FOLIO
    • 2021-12-17 - Work Cycle finished, sprint video out
  • OCLC Linked Data / Entities Advisory Group
    • 2021-12-10 OCLC presented at bigheads meeting this week, in testing
    • ISNI - made a soft launch of LOD this week. Shhh... They are providing data dumps now, with dereferenceable LOD sometime in 2022. https://isni.org/page/linked-data/
  • PCC 
    • 2021-10-27 Definitions and non-RDA examples in MARC examples continued to shift, but should be able to finalize the spreadsheet that generates the RDF today.
    • 2022-01-21 Final report to POCO (hopefully) to be submitted next week
    • 2022-01-14 Nothing new to report.
  • Authorities in FOLIO
    • 2022-01-14 Working on "deletes" workflow (actually deprecation with replacement process for references). Current workflow uses browse in Blacklight and benefits from links into FOLIO 
  • Upcoming presentations: Discussion on knowledge panels and Cornell Wikidata integration; 1/24 ARLIS/NA x Wikidata; and catalog form and function IG presentation March 11, 3-4pm (hope to also get Steven (smile))

...

Next Meeting(s), anyone out?:

  • 2022-01-28 - Lynette probably out with Samvera Dev Congress