Will revisit the question in January; Chicago was supposed to have a report by now. Regardless, we will better understand the collaboration in January, once we better understand resources and what to test
2019-12-20 Expecting update in January
Issues/Blockers (covering the time since the call at 10)
Lynette: Connection timeouts were happening due to increased requests (fixed); the cache layer is returning a lot of 500s, which Dave is working on. No blocker - the work itself is happening. Adding IP addresses to logging to identify whether the same source is hitting us with the same request.
John: getting documentation from the Internet Archive. Needs to know whether there is an API to full-text search the books. Recommend looking into HathiTrust (Michelle Paolillo is the CUL rep for Hathi and can point you to documentation).
Tim: no blockers or issues
Huda: looking at semantic search to define data requirements. No blockers.
Status updates and planning
BAM!
We watched it and it was awesome!
Plan to have an open meeting (originally targeted for mid-February) to which we encourage D&A and other folks to come – now set for March 3, 2-3:30pm in Mann 102; we should Zoom it too
John is looking at how we might do better with searches that currently yield zero results. Experience to date suggests that full-text approaches are the most promising. Have looked at archive.org, HathiTrust, and Google Books. So far, Google Books is the only service with an available API. In discussion with Hathi; we might prefer that, or perhaps mix sources. Open question of how Google vs. Hathi results compare – certainly different, but we don't have a good sense of how
Estimate is that less than 1% of searches yield zero results; most are legitimate queries on obscure topics rather than simple typos
Blacklight has a "did you mean?" facility, but it isn't good, so we (and everyone else) turn it off
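The zero-results fallback above could be sketched as follows. The Google Books volumes endpoint and its `q` parameter are the real public API; the fallback wiring around it (function names, the `catalog_hits` check) is a hypothetical illustration, not our implementation.

```python
# Sketch: fall back to a Google Books full-text search when the catalog
# search returns zero results. Only the endpoint and "q" parameter are
# taken from the real API; everything else is illustrative.
import json
import urllib.parse
import urllib.request

GOOGLE_BOOKS_URL = "https://www.googleapis.com/books/v1/volumes"

def build_fulltext_query(terms: str) -> str:
    """Build a Google Books full-text query URL for the given search terms."""
    return GOOGLE_BOOKS_URL + "?" + urllib.parse.urlencode({"q": terms})

def fulltext_fallback(terms: str, catalog_hits: int) -> list:
    """If the catalog search yielded zero hits, try Google Books (network call)."""
    if catalog_hits > 0:
        return []  # catalog already has results; no fallback needed
    with urllib.request.urlopen(build_fulltext_query(terms)) as resp:
        data = json.load(resp)
    # Return just the titles of any matching volumes
    return [item["volumeInfo"]["title"] for item in data.get("items", [])]
```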
Tim is looking at autocomplete functionality against VIAF, VIAF via Dave, and FAST
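A minimal sketch of the kind of autocomplete lookup Tim is exploring, against VIAF's public AutoSuggest endpoint. The endpoint name and `query` parameter reflect VIAF's AutoSuggest service; the response-field names used here (`result`, `term`) are assumptions to be verified against the actual JSON.

```python
# Sketch: autocomplete against VIAF AutoSuggest. Response-field names
# ("result", "term") are assumed, not confirmed.
import json
import urllib.parse
import urllib.request

VIAF_AUTOSUGGEST_URL = "https://viaf.org/viaf/AutoSuggest"

def build_suggest_url(prefix: str) -> str:
    """Build an AutoSuggest URL for a partially typed heading."""
    return VIAF_AUTOSUGGEST_URL + "?" + urllib.parse.urlencode({"query": prefix})

def suggest(prefix: str) -> list:
    """Return suggested headings for the prefix (network call)."""
    with urllib.request.urlopen(build_suggest_url(prefix)) as resp:
        data = json.load(resp)
    # Pull the display term out of each suggestion record
    return [r["term"] for r in (data.get("result") or [])]
```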
Huda is looking at results from searching subject terms (including alternate labels) in Dave, then considering broader and narrower terms in order to offer more subject categories to explore. Thinking of looking at subject headings obtained from catalog search results
STARTING with items not in Discogs but AWAITING more work in Sinopia to import data. Sinopia work cycle 2 (through December 6) will, we hope, include the ability to read RDF back from Trellis. We hope we can leverage this to import RDF from a lookup in Discogs or ShareVDE.
Work is ongoing. Questions remain about where to put data and roles, and how many abstractions of a work are required (e.g., for performance and for composition). Lookups are too slow and too buggy