...

Attendees: Lynette, Greg, Huda, Simeon, Jason

Regrets: Tim, Steven, Jason

Last meeting: 2021-02-19 Cornell LD4P3 Meeting notes

...

  • QA Sinopia Collaboration – Support and evolve QA+cache instance for use with Sinopia
    • 2021-02-26:
      • There were a couple of updates on ShareVDE. Dave has just about completed the indexing of the first of 6 parts. Some URIs include characters, such as double quotes, that are not allowed in URIs. Dave is fixing these during ingest and providing feedback to ShareVDE so they can fix them on their end. Once this is done, he will ingest the other 5 parts. Steven is exploring the data to determine what extended context we want to extract from it. Once that is done, Lynette will create the QA configuration. Sinopia wants to access the data by searching for the desired entity. This would be done using QA. Once an entity is selected, they want to populate an entity in Sinopia with the "full" ShareVDE record. "Full" is in quotes because where the edges of the graph that defines "full" lie is open to interpretation. Retrieving the full entity can be done in one of three ways: 1) through a fetch call to QA by passing in the URI, 2) a direct call to the cache to fetch the graph related to a single URI, 3) a direct call to ShareVDE to fetch the graph related to a single URI. Which approach we will use is TBD.
      • We explored future topics: where we are with containerization and the next steps (see discussion below), potential topics for the next working group charter (see discussion below), and the Sinopia lookup modal UI. The Sinopia team is still working on prioritization for the next work cycle.
  • Search API Best Practices for Authoritative Data working group
    • 2021-02-26: 
      • The potential topics for the second working group include:  change management, language processing, linked data approaches, and moving user stories to specific recommendations.
      • The announcement of the ending of the first charter, with links to the cataloger user stories prioritization survey and categorized user story summary documents, went out last Monday. There were also links to a survey for general feedback and a second for prioritizing the next charter's topic. Each has a two-week window for completion. The feedback survey has 1 response, which discusses the importance of extended context. The topic survey has 4 responses. Currently, language processing and moving user stories to specific recommendations are tied for first. Change management is a close second. But with only 4 responses, it is too early to tell. I will send out a reminder on Monday to the same communities as the announcement to try to get more responses.
  • Cache Containerization Plan - Develop a sustainable solution that others can deploy
    • 2021-02-19 Greg completed a CloudFormation template that allows someone to spin up a QA service in AWS easily. About 500 lines of template code bring this very close to being a turnkey solution (in services-ci branch). Greg notes pre-reqs for spinning this up: S3 bucket for configs etc., which could be added to another template.
      • When complete Lynette will test, then ask Dave to test, then ask Stanford folks. Greg will also create a demo screencast.
      • What about replacing the current QA setup with this new approach? Would need to check authority configuration and correct setup for load. Lynette notes need to copy over the DB to retain history
      • Next steps
        • start to look at containerizing Dave's setup. Two steps: 1) code to serve from cache, 2) indexing process
        • think about instructions for a vanilla linux server setup
    • 2021-02-26
      • Cache containerization discussion in QA-Sinopia meeting: We mostly talked about the next steps for the cache, which are to create two containers: 1) a container for API requests to retrieve cached data, 2) a container to ingest data downloads and create the Lucene index. This is fairly straightforward in the current approach of a full-data dump and ingest. It is expected that there will be some complexities to resolve in how to update indices once authority providers deploy change management techniques that allow for incremental updates. We punted that discussion until later, when the format of change management streams is defined. Stanford was asked for their preferred deploy platform and they indicated that AWS was preferred.
      • Greg will work with Dave when he starts work on containers, acting as tester and sounding board
      • CloudFormation - Greg has written templates and Lynette is going to test these out (will document time taken). Hope to find anything missing in template or documentation, perhaps some permissions issues will be revealed too that will allow documentation of critical permissions
      • Next Greg will look at prerequisites that need to be set up and work to template these in a helper template
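
The notes above mention ShareVDE URIs that contain disallowed characters such as double quotes, which Dave is fixing during ingest. The notes do not say how that fix is done. As a minimal sketch of one possible approach, assuming percent-encoding is acceptable for the cache, the hypothetical helper below escapes any character that RFC 3986 does not allow to appear literally in a URI. The function name and the example URIs are illustrative only and do not come from the ShareVDE data or the cache code.

```python
from urllib.parse import quote

# Characters RFC 3986 allows literally in a URI besides letters and digits
# (unreserved + reserved), plus "%" so that existing percent-escapes are
# left alone rather than double-encoded.
SAFE_CHARS = "-._~:/?#[]@!$&'()*+,;=%"


def sanitize_uri(uri: str) -> str:
    """Percent-encode characters that may not appear literally in a URI.

    Hypothetical helper for illustration; not the actual ingest code.
    """
    return quote(uri.strip(), safe=SAFE_CHARS)


if __name__ == "__main__":
    # Made-up example URI containing double quotes.
    print(sanitize_uri('http://example.org/agent/"Smith"'))
```

Because "%" is treated as safe, running the helper twice gives the same result as running it once, so it could be applied to data that has already been partly cleaned upstream.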

Developing Cornell's functional requirements in order to move toward linked data

  • Cf. Stanford functional requirements document: https://docs.google.com/document/d/18H6zYGwKuCg3SZqm9Q_cxkZThcdmBjknE6HdtQ-RRzk/edit#heading=h.4fu64x8jzm6e
  • What does success look like? And then how do we get there? 
  • Miro board (diagramming): https://miro.com/app/board/o9J_lfXUUj8=/ 
  • Notes space: https://docs.google.com/document/d/1TVPBFak7DkfjBptKl-pCMWQnOaiWHB0XCHswiB3Fr9g/edit?usp=sharing
  • 2021-02-05 discussion
    • Purpose? Vision for mid-term (3-5 years) transition to support linked-data at Cornell. May include things we don't yet have or cannot yet do, but not long-term vision of post-MARC environment
    • Important to understand sources of truth (primary data) and where there is derivative data
    • Imagine landscape with items described in multiple formats including at least MARC, BF, DC (eCommons), JSTOR
    • Imagine all items indexed and discoverable via D&A
    • Functions of "Aggregated index, allowing pivoting & ETL"
      • Includes current functionality of Frances' indexing
      • Does it include any editing?
      • Is there interaction with CULAR?
      • Includes indexing associated with DCP
    • What interfaces or functionality do we expect for the connecting lines?
    • Do we need a diagram for now (or at least July 1, 2021 with Voyager gone)?
  • 2021-02-26 Jason plans to update diagram and create narrative around it, hope to discuss next week

Other Topics

  • OCLC Linked Data / Entities Advisory Group
    • Request for UI and API testing from Jan 25
    • Lynette has Cornell key (a WSKEY) for testing
    • Call discussed seeding of data. Data for person includes VIAF and other sources;  place includes geonames. Steven, Huda, Jason and Lynette signed up for user testing
    • 2021-02-19 Huda finished UI testing (Seymour Schwartz for the win). Involved assessment of amount of information presented. Lynette hasn't got a response to her query about the access key; interested in testing the new search in the API as well as CRUD facilities
    • 2021-02-26 Lynette didn't get response with key for testing, will ping again to see whether this is still possible
  • PCC - Sinopia collaboration
    • 2021-02-05 Charge to form a new group for documentation, mentoring etc. is under review
  • PCC Task Group on Non-RDA Entities
    • 2021-01-15 PCC reviewed proposal but no decisions made yet, looking at description wrt cataloger use, discussion will continue
  • Default branch name - Working through repositories in Renaming of LD4P Repositories 2021-02-19
    • Created Renaming of LD4P Repositories page to identify Cornell repos, provide instructions, and track progress.
    • ACTION - Huda to look at blacklight and discovery repos - 2021-02-26 not yet
    • ACTION - Steven/Jason to look at HipHip. Lynette notes that Stanford have already dealt with their repositories - 2021-02-26 agreed to change branch name and archive the repo; E. Lynette Rayle will do this

Upcoming meetings

  • https://kula.uvic.ca/index.php/kula/announcement/view/1 .  Call for Proposals - Special Issue: "The Metadata Issue: Metadata as Knowledge".  Due January 31, 2021 (abstract 300-500 words).  Includes "The use of linked open data to facilitate the interaction between metadata and bodies of knowledge" and "Cultural heritage organization (libraries, archives, galleries, and museums) and academic projects that contribute to or leverage open knowledge platforms such as Wikidata"
  • code4lib - Expecting to attend: Huda, Steven, Lynette
  • Lynette doing a QA presentation at Samvera partner call in June

...