Date:

Attendees: Jason, Simeon, Steven, Huda

Regrets: 

Discovery (WP3)

  • https://github.com/LD4P/discovery/projects/2 for issues etc. 
  • Draft of a discovery plan: https://docs.google.com/document/d/1zKYW7FQVVNvyd0XjjW0qWznX9PC3jbmOE6Kz_yygPjs/edit?usp=sharing
  • Research: how to go from knowledge graph to an index
  • DASH! (Displaying Authorities Seamlessly Here)
    • Dashboard design meeting kickoff notes
    • User reps D&A meeting: Expect next follow-up in August (Slides from user reps meeting 2021-04-09; result was "not no")
    • https://docs.google.com/document/d/1PgQi3xobsPhr9DUHU_YGeimL1OjNiiTdkiNWb36r3Gg/edit
    • Usability testing and followup for DASH: Usability results
      • Usability results, a few little things to finish up
      • GitHub issues
      • 2021-10-15: Feedback from User Reps, shared by Lenora: "In general, the catalog and LCSH content feels reliable, whereas some of the external data appears arbitrary and unbalanced, which makes us uncomfortable, especially for reference and instruction librarians. Exceptions to this are the photos and biographical information - everyone seemed fine with that content."  Slides with comments here: https://docs.google.com/presentation/d/12raOmGmBpG3DdLG-xqRYkcFe7HkPeehB8z9ozNSyokc/edit#slide=id.p.  Final slide has recommendations. Comments in slides discuss what was liked, not liked, and problematic. Two classes of problems: timeline doesn't load properly; unclear whether zoomed in - but wouldn't likely change overall recommendations. Other concerns: cool, but is this all of the works? Is this misleading? What's up with influences? Liked: images and biographical information. Cannot solve data problem. Four years ago, they were not alright with images, so that change is nice, and implementing would be fairly easy, albeit needing discussion with Tim and D&A.
        • Next step: next D&A sprint: 1st 2 weeks of November, incl. 2 meetings with user groups. Show this as part of D&A sprint during the first sprint meeting. Chunk of work can get done before the sprint begins. Tim does not believe there is a need to do another round of review. Start to move this into production and let them comment in sprint. Knowledge panel and author page where we add image and biographical info, which includes library holdings page. Bottom wouldn't have tabs but would have general browsing. Same layout but without the tabs with unliked content.
        • Need to work on a way to turn off individual data points/images for specific problematic data/images; on a per-property / per-URI basis for now. Possible approach: exclusions YAML file
    • 2020-10-1: Separately, need to finish documentation of work done so far and set up demo video.
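A minimal sketch of how the per-property / per-URI exclusions idea noted above for DASH might work, assuming the exclusions file has already been parsed into a mapping. The file shape, the "*" wildcard, the example URIs, and the function name `filter_panel_data` are all hypothetical; nothing here reflects an agreed design.

```python
# Hypothetical shape of the exclusions file once parsed (e.g. from YAML):
#   "<entity URI>": ["<property to hide>", ...]
# A "*" entry (an assumption of this sketch) hides every data point for that URI.
EXCLUSIONS = {
    "http://example.org/entity/1": ["image"],
    "http://example.org/entity/2": ["*"],
}


def filter_panel_data(uri, data, exclusions=EXCLUSIONS):
    """Return a copy of `data` without the properties excluded for `uri`."""
    blocked = set(exclusions.get(uri, []))
    if "*" in blocked:
        return {}
    return {prop: value for prop, value in data.items() if prop not in blocked}
```

A URI that is not listed passes through unchanged, so the file would only need entries for the specific problematic cases.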
  • BANG! (Bibliographic Aspects Newly GUI'd)
    • Jamboard link
    • Expect to include Works. Need to do something beyond what we already have live from the OCLC concordance data.
    • 2021-10-01: Experimented with various queries.  For retrieving sets of related ISBNs, queried for the following relationship: an Opus that has two separate works, where each work has an instance with an ISBN.  Parsed the results to create a CSV where each line starts with an ISBN and is followed by all the other related ISBNs (based on the query above).  Set up front-end code that takes the ISBN from the catalog, looks for any line with that ISBN and returns the entire set, then does an ISBN field "OR" query to the Solr index to return any matches with their titles. 
      • Questions to explore:
        • What is the goal of an Opus? And is this conceptually useful for users?
        • What is the data quality (both of the Opus data and connection via ISBN)?
        • What is the gap in data for CUL?
        • What is a good UI for display of this data? And should there be different UI for translations vs. based on, etc.?
      • Next steps: Huda will look for more specific relationships in the data (e.g. LCCN matching). Huda/Tim to explore UI options. Also look for definitions of opuses and hubs
    • 2021-10-08:
      • Steven: I began to gather properties in the PCC data for different instances of the same work. Shared with Huda, and will work to fine-tune it next week. Re: PCC non-RDA Entities, the spreadsheet is almost ready to send to Kevin for feedback; we're mostly working on examples to help clarify the scope/range of our definitions.
      • Tim looked at SPARQL queries for different types of work
      • Asked ShareVDE Slack about an example of instances that don't look equivalent (different works by Tony Oursler).
        • Answer: "When the bibliographic records do not have the tag 240 (uniform title) the matching criteria for Opus is made using tag 245 $a. So, this is the reason why your bibliographic records are grouped under the same opus "Tony Oursler" ... The matching criteria during Opus creation does not take into account values inside $b subfield of tag 245."
        • Generally, they are meant to be "functionally equivalent"
        • Our conclusion is that this grouping is an artifact of a lax query (e.g. ignoring subtitle in $b); the MARC is appropriate. We likely need to understand how common misleading groupings like this are. Huda will follow up with additional questions about the cases where there isn't a uniform title (240)
      • LCCN data processing and display:
        • Did processing similar to that for ISBN for LCCN: Get all LCCNs where an Opus has two works and the works have instances with LCCNs.  Have incorporated that into the application in the same way
      • To do: Make the UI look similar for the "related works" section as far as types of information returned for ISBNs, LCCNs, and maybe related works. Also look at problematic cases (e.g. "Geography" shouldn't be related to "Hamlet") to see why this is happening.
    • 2021-10-15:
      • Info display for related ISBNs and LCCNs is mostly working. Two pieces of confusion: online/at-library. Huda will reach out to Frances
      • Had examples where Work was not the object of any statements. Not many - fewer than other Opera-related examples. Need to see what shows up when approaching query with other properties. Biggest challenge right now is how long the queries take and whether they'll time out. Huda has been asked to post something to a SHARE-VDE forum; might seek help to form the question. In data_feedback SVDE Slack channel, Steven asked questions re: Stanford data for QA work.
      • Robustness of data? In Stanford data, had to jump through a few nodes to get to data that should have been low-hanging. Opus was aggregating all labels from Instances; mostly Instance had a title... Work and Opus had rdfs:label. Didn't investigate label coverage much but was looking at relationships. Possibly we use the links/relatedness but go to the Solr index once we have the match.
  • DAG Calls
    • 2021-10-1: Next two meetings will be usability/user research focused: 10/12 with LINCS, 10/26 with Harvard image research/IIIF D4H user research.
    • 2021-10-15: Kim Martin & her students gave an interesting presentation. Reaching out to ask whether there is any archival linked data. Huda will ask Elizabeth Russey Roke.
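The ISBN lookup described in the 2021-10-01 BANG notes (CSV of related ISBNs, find any line containing the catalog ISBN, then an "OR" query against the ISBN field) could be sketched roughly as below. This is an illustration only: the function names, the `isbn_t` field name, and the sample ISBNs are assumptions, and the actual front-end code may differ.

```python
import csv
import io


def load_isbn_sets(csv_text):
    """Map each ISBN to the union of all ISBNs on any CSV line it appears in."""
    index = {}
    for row in csv.reader(io.StringIO(csv_text)):
        isbns = [cell.strip() for cell in row if cell.strip()]
        for isbn in isbns:
            index.setdefault(isbn, set()).update(isbns)
    return index


def solr_or_query(field, values):
    """Build a query string such as field:("a" OR "b"); None if no values."""
    if not values:
        return None
    return f"{field}:(" + " OR ".join(f'"{v}"' for v in values) + ")"


# Made-up placeholder ISBNs; "isbn_t" is a hypothetical Solr field name.
SAMPLE_CSV = "1111111111,2222222222,3333333333\n4444444444,5555555555\n"
index = load_isbn_sets(SAMPLE_CSV)
query = solr_or_query("isbn_t", sorted(index.get("2222222222", ())))
```

The same shape would apply to the LCCN processing from 2021-10-08, with a different input file and field name.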

...

Next Meeting(s), anyone out?:

  • 2021-10-22: Lynette; Steven & Jason will drop off at 10 for the communities of practice meeting. Check whether we can meet at 9am again.