Date: 2021-02-26

Attendees:  Lynette, Greg, Huda, Simeon

Regrets: Tim, Steven

Last meeting: 2021-02-19 Cornell LD4P3 Meeting notes

...

  • QA Sinopia Collaboration – Support and evolve the QA+cache instance for use with Sinopia
    • 2021-02-26:
      • There were a couple of updates on ShareVDE.  Dave has just about completed indexing the first of the 6 parts.  There is an issue with URIs that include characters such as double quotes, which are not allowed.
      • No meeting.  Dave is working on the ShareVDE PCC data.  "Share-VDE release is in six parts, each with numerous nq (quad) files, each with numerous records. Translating Part1 to a single nt (triple) file results in a 2.7GB file, ready for loading into the triplestore. Parts 2-6 are currently in conversion."  We are checking with Dave to see what was in the quad position that is being lost in the conversion to n-triples (see the nq-to-nt sketch after these notes).
      • We still have 5 issues under exploration.  Dave says he is close to fixing the GETTY_TGN and ULAN issue, where subject URIs should be the RWO version with `-place` and `-agent` appended, respectively.
      • Dave is fixing issues during ingest and providing feedback to ShareVDE so they can fix them on their end.  Once this is done, he will ingest the other 5 parts.  Steven is exploring the data to determine what extended context we want to extract from it.  Once that is done, Lynette will create the QA configuration.  Sinopia wants to access the data by searching for the desired entity; this would be done using QA.  Once an entity is selected, they want to populate an entity in Sinopia with the "full" ShareVDE record.  "Full" is in quotes because knowing the edges of the graph that define "full" can be open to interpretation.  Retrieving the full entity can be done in one of three ways: 1) a fetch call to QA, passing in the URI; 2) a direct call to the cache to fetch the graph related to a single URI; 3) a direct call to ShareVDE to fetch the graph related to a single URI (see the retrieval sketch after these notes).  Which approach we will use is TBD.
      • We explored future topics: where we are with containerization and the next steps (see discussion below), potential topics for the next working group charter (see discussion below), and the Sinopia lookup modal UI.  The Sinopia team is still working on prioritization for the next work cycle.  Still waiting for a reply from ShareVDE to Vivian's request to restart meetings.
  • Search API Best Practices for Authoritative Data working group
    • 2021-02-26:
      • The potential topics for the second working group include:  change management, language processing, linked data approaches, and moving user stories to specific recommendations.
      • The announcement of the end of the first charter, with links to the cataloger user stories prioritization survey and the categorized user story summary documents, went out last Monday.  There were also links to a survey for general feedback and a second survey for prioritizing the next charter's topic; each has a 2-week window for completion.  The feedback survey has 1 response, which discusses the importance of extended context.  The topic survey has 4 responses.  Currently, language processing and moving user stories to specific recommendations are tied for first, with change management a close second, but with only 4 responses it is too early to tell.  I will send out a reminder on Monday to the same communities as the announcement to try to get more responses.
      • Group is officially ended and documents tidied up
      • Announcement includes survey for feedback and survey for topics for next group.  Will give 2 weeks to respond 
      • ACTION: Steven Folsom and E. Lynette Rayle will send out the announcement on Monday.  The plan is to send to the PCC list, the LD4P3 list, LD4 #general Slack, the Samvera Community list, Samvera #general Slack, ShareVDE #aims_sg Slack, and several authorities (e.g. Getty, MeSH Bio-portal, etc.)
  • Cache Containerization Plan - Develop a sustainable solution that others can deploy
    • 2021-02-19: Greg completed a CloudFormation template that allows someone to spin up a QA service in AWS easily.  It is about 500 lines of template code that brings this very close to being a turnkey solution (in the services-ci branch).  Greg notes pre-reqs for spinning this up: an S3 bucket for configs, etc., which could be added to another template (see the CloudFormation sketch after these notes).
      • When complete, Lynette will test, then ask Dave to test, then ask the Stanford folks.  Greg will also create a demo screencast.
      • What about replacing the current QA setup with this new approach?  We would need to check the authority configuration and the correct setup for load.  Lynette notes the need to copy over the DB to retain history.
      • Next steps
        • start to look at containerizing Dave's setup.  Two pieces: 1) code to serve from the cache, 2) the indexing process
        • think about instructions for a vanilla Linux server setup
    • 2021-02-26
      • Cache containerization discussion in the QA-Sinopia meeting: We mostly talked about the next steps for the cache, creating two containers: 1) a container for API requests to retrieve cached data, and 2) a container to ingest data downloads and create the Lucene index.  This is fairly straightforward in the current approach of a full-data dump and ingest.  It is expected that there will be some complexities to resolve in how to update indices once authority providers deploy change management techniques that allow for incremental updates.  We punted that discussion until later, when the format of change management streams is defined.  Stanford was asked their preferred deploy platform, and they indicated AWS.
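
A minimal sketch of the nq-to-nt issue mentioned in the ShareVDE notes above. It uses rdflib, which is an assumption; Dave's actual conversion tooling is not described in these notes. The point it illustrates is that flattening quads to N-Triples keeps only subject/predicate/object, so whatever sits in the fourth (graph) position is dropped, which is what we are checking on with Dave.

```python
# Sketch only (not Dave's conversion script): shows how the graph name in the
# fourth position of an N-Quads record is lost when flattening to N-Triples.
from rdflib import Dataset, Graph

nquads = (
    '<http://example.org/work/1> '
    '<http://purl.org/dc/terms/title> '
    '"Example title" '
    '<http://example.org/graph/part1> .\n'
)

ds = Dataset()
ds.parse(data=nquads, format="nquads")

# Each quad carries a graph/context name in the fourth position.
for s, p, o, g in ds.quads((None, None, None, None)):
    print("graph position in the quad:", g)

# Flattening to triples copies only subject/predicate/object; the graph name
# is silently discarded, which is the information at risk in the conversion.
flat = Graph()
for s, p, o, g in ds.quads((None, None, None, None)):
    flat.add((s, p, o))

print(flat.serialize(format="nt"))
```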
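A minimal retrieval sketch for the first two of the three options discussed under the QA Sinopia item (a fetch call through QA vs. a direct call to the cache). The host names, the `sharevde` authority name, the fetch path, and the example URI are placeholders, not an agreed configuration; which approach Sinopia will actually use is still TBD.

```python
# Sketch only: URL shapes, authority name, and the example URI below are
# placeholders, not the real QA configuration or cache endpoint.
import requests

SHARE_VDE_URI = "http://share-vde.org/sharevde/rdfBibframe/Work/12345"  # placeholder URI

# Option 1 (assumed shape): ask QA to fetch the full graph for a URI.
qa_resp = requests.get(
    "https://lookup.example.edu/qa/fetch/linked_data/sharevde",
    params={"uri": SHARE_VDE_URI, "format": "json"},
    timeout=30,
)

# Option 2 (assumed shape): go straight to the cache/triplestore and request
# everything related to the URI, e.g. via a SPARQL CONSTRUCT query.
sparql = f"CONSTRUCT {{ <{SHARE_VDE_URI}> ?p ?o }} WHERE {{ <{SHARE_VDE_URI}> ?p ?o }}"
cache_resp = requests.get(
    "https://cache.example.edu/sparql",
    params={"query": sparql},
    headers={"Accept": "application/n-triples"},
    timeout=30,
)

print(qa_resp.status_code, cache_resp.status_code)
```

Option 3 (a direct call to ShareVDE for the graph of a single URI) would look like option 2 but pointed at a ShareVDE endpoint; how the edges of a "full" record are defined remains open in any of the three cases.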
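A minimal CloudFormation sketch for spinning up Greg's template from the services-ci branch using boto3. The stack name, template location, and parameter key are assumptions for illustration; the real pre-req S3 bucket for configs and the template's actual parameters would come from Greg's instructions and the upcoming demo screencast.

```python
# Sketch only: stack name, template URL, and parameter keys are placeholders;
# consult Greg's template in the services-ci branch for the real values.
import boto3

cf = boto3.client("cloudformation", region_name="us-east-1")

response = cf.create_stack(
    StackName="qa-lookup-service",  # placeholder stack name
    TemplateURL="https://my-config-bucket.s3.amazonaws.com/qa-service.yaml",  # assumed S3 location
    Parameters=[
        # Placeholder parameter; the template's actual parameters may differ.
        {"ParameterKey": "ConfigBucket", "ParameterValue": "my-config-bucket"},
    ],
    Capabilities=["CAPABILITY_NAMED_IAM"],  # often required when a template creates IAM roles
)

# Wait for the stack to finish creating before testing the QA endpoint.
cf.get_waiter("stack_create_complete").wait(StackName="qa-lookup-service")
print(response["StackId"])
```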

Developing Cornell's functional requirements in order to move toward linked data

...