Date:

Attendees: Huda, Jason, Steven, Simeon

Regrets: Greg, Lynette

Discovery (WP3)

  • https://github.com/LD4P/discovery/projects/2 for issues etc. 
  • Draft of a discovery plan: https://docs.google.com/document/d/1zKYW7FQVVNvyd0XjjW0qWznX9PC3jbmOE6Kz_yygPjs/edit?usp=sharing
  • Research: how to go from knowledge graph to an index
  • BANG! (Bibliographic Aspects Newly GUI'd)
    • Jamboard link
    • Expect to include Works. Need to do something beyond what we already have live from the OCLC concordance data.
    • References/bibliography list (beginning)
    • 2022-03-04
      • ** Huda will add notes re: 10K sample
        • Sampling method: Using the first 10,000 hubs returned from the id.loc.gov search API. 

        • Overarching question: Can hubs provide relationships between catalog items?
        • Analysis: For every hub that has > 1 work, get ISBN groups.  For groups which have > 1 ISBN, record how many times querying the Cornell catalog with those ISBNs provides > 1 catalog result.

        • Results:

          • Total: 202 sets of ISBNs, each containing > 1 ISBN, comprising 750 unique ISBNs in total.

          • Catalog matches: 26 ISBN groups resulted in > 1 catalog result. These matching ISBN groups comprise 130 unique ISBNs.
      • Scripting hub-to-hub analysis, uni-directional. Almost done with process of getting ISBNs back... and will then query against catalog to allow us to say something like "out of 10K, there were ## ISBN groupings one can find using translation property of which ## yielded cataloging hit"
      • Since doing so many ajax queries, thinking about doing visualization
      • Need to go to LC Hub analysis – to look at LCCNs in addition to ISBNs
      • Hoping to be done with analysis in 2 weeks
      • Lots of presentation prep
      • To do: look at OCLC WorkId relationships to compare / identify groupings and results (in 2015 we had N number of clusters over the entire catalog).
    • 2022-03-11
      • Had 200-300 people attend presentation on Wednesday, lots of questions (people fascinated by the infrastructure and our ability to make changes, FOLIO interest, role of cataloger in wikidata, fact vs contestable information, questions about mechanics)
      • Doing another presentation today (6th since January!), links on LD4 presentation page
      • BANG! work: Used 10,000 hub sample to look at relationships between hubs. Have added preliminary tables to data sources doc/report
        • Total: 110 ISBN sets related via hub to hub relationships, 22 result in >1 catalog matches (i.e. at least 2 catalog records related b/c two hubs are related)
          • 110 ISBN sets cover 381 ISBNs, 22 matching sets have 78 total ISBNs
          • By relationship: hasTranslation accounted for 92 of the total 110 ISBN sets, relatedTo accounted for 18
      • BANG! preliminary design-ish/data questions link
  • DAG Calls
    • 2022-03-11:
      • Archives discovery will be presented on 3/15
  • Document started re: Comments, Questions and Suggestions offered during the myriad of presentations provided. Huda will add link here
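
The ISBN-group counting described in the 10K hub sample notes above can be sketched roughly as follows. This is a minimal illustration, not the script Huda used: the function names, the input shape (hub id mapped to one ISBN set per work), and the `catalog_lookup` callable are all hypothetical, and it assumes one ISBN group per hub formed by pooling the ISBNs of that hub's works. A real run would replace the in-memory dictionary with queries against id.loc.gov and the Cornell catalog.

```python
from typing import Callable, Dict, Iterable, List, Set


def isbn_groups_for_hubs(hub_works: Dict[str, List[Set[str]]]) -> Dict[str, Set[str]]:
    """Keep hubs with > 1 work whose pooled ISBN group has > 1 ISBN.

    Assumption: one ISBN group per hub, pooling ISBNs across its works.
    """
    groups: Dict[str, Set[str]] = {}
    for hub_id, works in hub_works.items():
        if len(works) <= 1:
            continue  # only hubs with more than one work are analysed
        isbns = set().union(*works)
        if len(isbns) > 1:
            groups[hub_id] = isbns
    return groups


def summarize(groups: Dict[str, Set[str]],
              catalog_lookup: Callable[[str], Iterable[str]]) -> Dict[str, int]:
    """Count ISBN groups and how many of them yield > 1 distinct catalog record."""
    matching: Dict[str, Set[str]] = {}
    for hub_id, isbns in groups.items():
        records: Set[str] = set()
        for isbn in isbns:
            records.update(catalog_lookup(isbn))
        if len(records) > 1:
            matching[hub_id] = isbns
    return {
        "isbn_sets": len(groups),
        "unique_isbns": len(set().union(*groups.values())),
        "matching_sets": len(matching),
        "matching_isbns": len(set().union(*matching.values())),
    }


# Made-up sample data standing in for hub and catalog responses.
sample_hubs = {
    "hub1": [{"111"}, {"222"}],          # 2 works, 2 ISBNs
    "hub2": [{"333"}],                   # single work: skipped
    "hub3": [{"444"}, {"444"}],          # 2 works, but only 1 ISBN: skipped
    "hub4": [{"555"}, {"666", "777"}],   # 2 works, 3 ISBNs
}
sample_catalog = {"111": ["bibA"], "222": ["bibB"], "555": ["bibC"], "666": ["bibC"]}

summary = summarize(isbn_groups_for_hubs(sample_hubs),
                    lambda isbn: sample_catalog.get(isbn, []))
print(summary)
```

The same `summarize` step could be reused for the hub-to-hub analysis by building the groups from pairs of related hubs (for example via hasTranslation or relatedTo) instead of from the works of a single hub.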

Linked-Data Authority Support (WP2)

  • Qa Sinopia Collaboration
    • 2022-03-04 
      • No meeting with Stanford this week.
      • Homosaurus - Dave has it in the triple store. Steven defined context and validations. Lynette configured QA. Still getting a graph read error, so I don't think the index has been built yet. I have a message out to Dave. Once that is done, it should just work. The final step is to configure it in Sinopia.
      • Dave still plans to try using LOC or Getty activity streams to update the cache.  This is proof of concept.  It may prove insufficient as none of the feeds include patches.  But makes for a good exploration.
      • Dave still needs to fix total_number_found to get pagination working again in Sinopia. Once that is done, it will just feed through to Sinopia without any additional work.
      • End of March, Huda and I will be meeting with OCLC to discuss the API and how it fits with our work.
  • Best Practices for Authoritative Data working group (focus on Change Management) 
    • 2022-03-04
      • Updates in the recommendations document include steps for producers to create an activity stream for all 3 use cases.  Will be looking for feedback on that at the next meeting.
      • I'm working to incorporate feedback given so far.
      • There is still a question about date handling and what we want to recommend.  Options are endTime, startTime, published, updated.  Once this is resolved, it will be fairly easy to expand the notifications examples out to the partial and full cache examples.
      • Feedback wanted via GH Issues
  • Containerization
    • 2022-03-04
      • Dave, Greg, and Lynette met to plan next steps for containerizing the cache.
      • Dave is exploring putting the war file in a container.  
      • Dave handed some Lucene index files to host; he will get prototype working with his war file and those index files; will take from there.
      • Work on templates? Possibly some work can be done ahead of time.
      • Longer-term concern: generating the indexes... not containerizing the cache
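
To make the open date-handling question from the change management notes above more concrete, here is a small sketch of a single change notification that carries its timestamp in whichever candidate property is chosen. It assumes an Activity Streams 2.0 style JSON object; the function name, the `Update` activity type, and the example URI are illustrative assumptions and are not taken from the recommendations document.

```python
import json
from datetime import datetime, timezone

# The four candidate date properties listed in the notes.
DATE_PROPERTIES = ("endTime", "startTime", "published", "updated")


def build_update_activity(entity_uri: str, changed_at: datetime,
                          date_property: str = "published") -> dict:
    """Build a minimal Update activity, dated with the chosen property.

    `changed_at` is expected to be timezone-aware; it is normalised to UTC.
    """
    if date_property not in DATE_PROPERTIES:
        raise ValueError(f"unsupported date property: {date_property}")
    timestamp = changed_at.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "@context": "https://www.w3.org/ns/activitystreams",
        "type": "Update",
        "object": {"id": entity_uri},  # hypothetical authority URI
        date_property: timestamp,
    }


activity = build_update_activity(
    "https://example.org/authority/123",
    datetime(2022, 3, 4, 12, 0, tzinfo=timezone.utc),
    date_property="endTime",
)
print(json.dumps(activity, indent=2))
```

Whichever property is recommended, a consumer would sort and filter activities on that one field, so producers would need to apply it consistently across the notification, partial cache, and full cache use cases.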

Other Topics

  • Sinolio - Sinopia-FOLIO
    • 2021-12-17 - Work Cycle finished, sprint video out
  • OCLC Linked Data / Entities Advisory Group
    • 2021-12-10 OCLC presented at bigheads meeting this week, in testing
  • PCC 
    • 2021-01-21 Definitions and non-RDA final report to POCO (hopefully) to be submitted next week
    • 2022-01-14 Nothing new to report.
  • Authorities in FOLIO
    • 2022-02-18 Mary met with the team and is making progress with deletes. Frances is getting experience building out the index, with the hope of having an API Nick can work against by the end of February. Steven has a diagram of the vision.

Upcoming meetings/presentations

Next Meeting(s), anyone out?:

2022-03-18 - Lynette may not attend as she is on another sprint.
