Date: 12 Jun 2020

Attendees: John, Jason, Lynette, Tim, Simeon, Huda

Regrets: Steven

Agenda & Notes

Review actions from 2020-06-05 Cornell LD4P2 Meeting notes

Simeon Warner - Need to update Zooms for future meetings
- 2020-05-29 do they show for everyone? – except Simeon!
- 2020-06-12 links seem OK
Steven Folsom - To work on prioritizing pending authority requests in the QA Issues
- 2020-06-05 Still to do. Steven will reach out to the requesting folks to check in and review prioritization, then give guidance and check in with Dave if there should be a change in his work order
- 2020-06-12 Still to do
John Skiles Skinner - To move the nectarguide code into the LD4P GitHub organization
- 2020-06-12 Still to do
Huda Khan - Schedule usability work. Will forward tasks draft to group and to Kevin/Usability Working group today.
- 2020-06-12 Met with Kevin and usability group on Monday. Agreement to do think-aloud and have reps interested in participating, might also get folks from access services. Plan to get tests done by June 25 giving time to create a report
Huda Khan - To make slides and such to go through indexing approach, compared with client-side only, lessons learned, etc.
- 2020-06-12 slides, but not these slides yet

Status updates and planning

Enhanced Discovery - WHAM! (see also https://wiki.duraspace.org/x/sJI7Bg and https://github.com/LD4P/discovery/projects/1)
- See: Organizing doc and Pseudonym thought. Updates also on running notes page
- Updates from Dev chat meeting
- Usability - Document with testing plans. Some initial brainstorming. Plan on sharing tasks (perhaps with related screenshots) with Kevin to get some feedback. Will remind regarding putting our usability work on the agenda of the Usability Working Group for next week (week of 2020-06-08).
- Indexing - Huda tried out the index-heavy approach for handling pseudonym/see also matching and display on the dev VM, using a two-phase approach (i.e. the index is populated first with information and then updated in a second pass). Tim tried out the front-end with the dev VM index and it appears to work. Currently populating the main suggestion index with pseudonym info (there were only ~100 there before); after checking the status of that, will run the index approach script. Once the main suggest index is set up, Tim can try out the front-end code.
- Tim had to update the front-end controller code b/c some of the content of the fields changed and the result set for a query has changed (i.e. what matches/doesn't). Just restarted work on RSpec tests for expected results
- Slides with tasks and examples
- Huda: Got to work on those other slides that explain indexing. (To Do above). Will also check errors at the Solr level which means errors when trying to ingest Solr documents generated by the indexing processing scripts.
- Tim did work to update for index changes and some code consolidation from JS to controller, will now get back to the tests
- John has been porting Tim's work into Blacklight. Hoping to maintain the same experience, but some things might change a little when using native Blacklight features
- Next steps: Usability, Updating suggest index with index-only approach scripts, Finishing up RSpec tests (depend on suggest index updates being completed), Packaging.
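As a rough illustration of the two-phase indexing idea noted above (populate the index first, then update it in a second pass), here is a minimal sketch in Python. It uses an in-memory dict in place of the Solr suggest index. The function names, the field names (`label`, `pseudonyms`, `pseudonym_labels`), and the sample URIs are hypothetical and are not taken from the actual indexing scripts.

```python
def build_index(records):
    """Phase 1: populate the index with each record's own information."""
    index = {}
    for rec in records:
        index[rec["uri"]] = {
            "label": rec["label"],
            "pseudonyms": list(rec.get("pseudonyms", [])),
            "pseudonym_labels": [],  # filled in during phase 2
        }
    return index


def resolve_pseudonyms(index):
    """Phase 2: revisit every document and copy in the labels of linked entries."""
    for doc in index.values():
        doc["pseudonym_labels"] = [
            index[uri]["label"] for uri in doc["pseudonyms"] if uri in index
        ]
    return index


# Hypothetical sample data: two entries that point at each other.
records = [
    {"uri": "ex:1", "label": "Author A", "pseudonyms": ["ex:2"]},
    {"uri": "ex:2", "label": "Pen Name B", "pseudonyms": ["ex:1"]},
]
index = resolve_pseudonyms(build_index(records))
```

The second pass is needed because a record may refer to an entry that has not been indexed yet when the record is first processed; once every entry exists, the labels can be looked up and stored for display.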
Authority Lookups for Sinopia (Lookup infrastructure: https://github.com/LD4P/qa_server/projects/2, Authority requests: https://github.com/LD4P/qa_server/projects/1)
- QA performance
  - 2020-05-22 Dave is working through LoC authorities with new caching scheme, expect to hear about now
  - 2020-06-12 Taking a back seat to accuracy work
- QA accuracy
  - 2020-05-29
    - Pretest results (see image in QA/Sinopia weekly notes). Summary: 60 tests run with 26 failing. Of the 26, 4 were close to passing with the position off by 1 or 2. For the other 22, the expected subject URI wasn't in the results at all. Geography data is especially prone to returning unexpected results, regardless of whether it is coming from the cache or direct from the authority. 10 of the 22 failing tests were searching for a place.
    - Dave has updated the indexing for all LOC, AGROVOC, NALT, Getty, dbpedia, MeSH, RDA, CERL, and Ligatus. These are ready for a post test run to see what impact it has on the test results.
  - 2020-06-05
    - Organized issues related to accuracy groups into a project board. Separated issues tagged as Cache Indexing into those that are related to new authorities and those that have examples of queries that are requesting accuracy tests or describe queries that did not return results as expected.
    - Used Issue #201 to create new accuracy tests specifically for queries that have an expected result that has diacritics.
    - Ran accuracy tests for LOC_NAMES_RWO:person. Comparison of Results for current production indexing and proposed indexing scheme. → Clearly shows that the proposed indexing scheme is not an improvement overall
  - 2020-06-12
    - E. Lynette Rayle - Worked with Dave on refining the indexing for LCNAF. Comparison of Results for current production indexing and proposed indexing scheme updated to include intermediate and latest results. At this point, there are 3 out of 24 failing tests for LCNAF: 1) same failure as original due to other legitimate matches ranking higher, 2) a special character that is considered an alpha is missing, so it doesn't match, 3) a match moved down in the ranking past the expected position (not sure why)
    - Steven Folsom - I’ve worked on some tests and moved some issues to completed here https://github.com/LD4P/qa_server/projects/3, where I thought the only hangup was tests. I’ve added comments to the issues that remain in the Failing Queries columns. Also, I finally have a first pass at a draft for the workshop for the LD4 Conference here: https://github.com/sfolsom/LD4_2020_Conference_Linked_Data_WorkShop. Comments welcome, but no pressure. I’m still working with the Conference Organizers to firm up workshop Slack, email, and office hour communication.
- Authorities
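The accuracy notes above distinguish tests that pass, tests that are close (position off by 1 or 2), and tests where the expected URI is not in the results at all. A minimal sketch of that kind of classification, in Python, might look like the following. The function name, the `near_margin` parameter, and the sample URIs are assumptions for illustration only; this is not the qa_server test code.

```python
def classify(results, expected_uri, max_position, near_margin=2):
    """Classify one accuracy test by where the expected URI ranks (1-based)."""
    if expected_uri not in results:
        return "missing"  # expected URI not returned at all
    position = results.index(expected_uri) + 1
    if position <= max_position:
        return "pass"
    if position <= max_position + near_margin:
        return "near"  # off by 1 or 2 positions
    return "fail"


# Hypothetical ranked result list for a single query.
results = ["ex:a", "ex:b", "ex:c", "ex:d"]

outcomes = [
    classify(results, "ex:a", max_position=1),  # "pass"
    classify(results, "ex:c", max_position=1),  # "near"
    classify(results, "ex:d", max_position=1),  # "fail"
    classify(results, "ex:z", max_position=1),  # "missing"
]
```

Separating "near" from "missing" matches how the pretest summary breaks down the 26 failures (4 close, 22 absent), which helps tell ranking problems apart from indexing or matching problems.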
Linked Data API Working Group
- Starting first Monday in June and then every other week for 4 months, logistics put into place
- 2020-05-29 - Added initial working group documents...
  - Current Approaches to Providing Search APIs
  - Needs of Consumers and Applications
- 2020-06-05
  - First meeting is next Monday
- 2020-06-12
  - First meeting was last Monday (meeting notes). Brainstorming fell into several categories of topics: Caching (and management of cached data), Reconciliation, Accuracy (of results and of selection) (including extended context and ordering of results), API Approach, Data Structure, Scalability.
  - Creating a list of terminology with definitions. Discussions are happening in Slack about terminology definitions, including a question about the definitions for Reconciliation and Entity Resolution. I used definitions from the first LD4L workshop, which defined Reconciliation as connecting Things-to-Things and Entity Resolution as connecting Strings-to-Things, but it was suggested that these are actually the same thing.
LD4P3 Planning
- Stanford functional requirements document: https://docs.google.com/document/d/18H6zYGwKuCg3SZqm9Q_cxkZThcdmBjknE6HdtQ-RRzk/edit#heading=h.4fu64x8jzm6e
  - Possible relationship to entity management work in FOLIO
  - How much full LD vs linky-MARC?
  - CUL LTS working group is hoping to add URIs to MARC before FOLIO migration as a substitute for some of the heading management we do in Voyager in other ways. Perhaps won't be done ahead of switch but as part of the data migration. Considering how to update/maintain going forward
  - 2020-06-05 Agree that doing some analysis/write-up for the Cornell context would be good, ACTION for July
- Greg has been working on setting up a copy of Cornell Blacklight
Meetings (see LD4P2 Cornell Meeting Attendances)
- LD4 Conference 2020
  - Expecting presenters to be contacted soon
- SWIB20 online
  - Deadline for proposals 13 July 2020
- rdfs:seeAlso Conferences Related to Linked Data in Libraries
Next meetings
- ...
