Date

29 Jun 2021

Call-in Information

Time: 11:00 am, Eastern Time (New York, GMT-04:00)

To join the online meeting:

Go to: https://lyrasis.zoom.us/my/vivo1?pwd=a2Q3RUVKVkN2dkNHV3FUaFRtLzhGdz09
- Passcode: 351860
One tap mobile:
- US: +16699006833,,9358074182# or +19292056099,,9358074182#
Or Telephone:
- US: +1 669 900 6833 or +1 929 205 6099 or 877 853 5257
- Meeting ID: 935 807 4182
International numbers available: https://zoom.us/u/aeANHanzED

Slack

https://vivo-project.slack.com
- Self-register at: http://bit.ly/vivo-slack

Attendees

Indicating note-taker

Agenda

Conference debrief
JIRA/GitHub issues
Reviewing 2019-01 Architectural Fly-in Summary (Ingest section)
Moving Scholars closer to the core: continuing discussion from last committers' meeting
1. "win/win" opportunity: Scholars and VIVO both eliminate some complexity
2. converting Scholars SPARQL queries to VIVO DocumentModifiers
3. replacing URIFinders with fast, reliable Solr lookups
Prioritizing future development items:
1. quick wins / items for a more rapid release
2. collaborative items for future sprints
3. (Add/edit at will) spreadsheet: https://docs.google.com/spreadsheets/d/103P9P4v6yUBSb5BnVaK40NoGx1fIYyL8uaHKUubZWbE/edit?usp=sharing
VIVO in a Box current document for feedback:

Future topics

Prioritizing and planning post-1.12 development
Forward-looking topics:
1. frameworks: Spring / Spring Boot / alternatives
2. Horizontal scalability
3. Deployment
4. Configuration : files / environment variables / GUI settings
5. Editing / form handling
6. Adding custom theming without customizing build
Post-release priorities
1. Ingest / Kafka
2. Advanced Role Management
3. Moving Scholars closer to core - next steps
Vitro JMS messaging approaches - redux
1. Which architectural pattern should we take?
2. What should the body of the messages be?
Incremental development initiatives
1. Integration test opportunities with the switch to TDB: requires startup/shutdown of external Solr via Maven

Tickets

Status of In-Review tickets


Notes

Draft notes on Google Drive

Item 0: Release of 1.12

The release is half ready. Ralph will try to publish to the Sonatype repositories today.
Official announcement: is there any difference from the alpha announcement?
Ralph: Pretty much the same
Wiki needs to be updated so the 1.12 page says it is the current release rather than future
Also need to go to Jira and close out 1.12 release and open up next release number. Reassign anything tied to 1.12 to the new release. Ralph will do this as part of his routine.
William: A Dependabot merge into main broke the tests 4 days ago.
After Sonatype is sorted out, should CI on GitHub work? Ralph: Yes
Ralph: Dependabot identified an old version of JUnit as having a security vulnerability. https://github.com/vivo-project/Vitro/pull/191
Error from build:
1. Tests in error: testGetAllPossiblePropInstForIndividual(edu.cornell.mannlib.vitro.webapp.dao.jena.PropertyInstanceDaoJenaTest)
  getVClassesForPropertyTest(edu.cornell.mannlib.vitro.webapp.dao.jena.VClassDaoTest)
  modelIsolation(edu.cornell.mannlib.vitro.webapp.dao.jena.VClassDaoTest)
  testPreventInvalidRestrictionsOnDeletion(edu.cornell.mannlib.vitro.webapp.dao.jena.JenaBaseDaoTest)
  correctValues(edu.cornell.mannlib.vitro.webapp.dao.jena.VClassJenaTest)
  testTBoxModel(edu.cornell.mannlib.vitro.webapp.dao.jena.OntModelSegementationTest)

Conference debrief
1. Second highest attended VIVO conference of all time. Go us!
2. Brian: Keynote about OpenAIRE was eye opening.
  1. There’s a ton of data. Scope is worldwide. They are deduplicating, disambiguating all that stuff.
  2. Are their disambiguation tools available? Yes!
  3. Relevance to VIVO in a Box? OpenAIRE could be a good candidate to offload some of the work.
  4. Microsoft Academic is closing down, but they should be in a position to replace them with minimal impact
  5. Don: They have their own ontology but they also use others. Any collaboration from the ontology group? Group: No, unfortunately
  6. Ralph: There really haven’t been any groups that show they can crawl the data Microsoft Academic was crawling
  7. Don: Would be nice to not have to learn every data structure for every open data source out there, e.g. DataCite and Unpaywall.
  8. Don: Q, does OpenAIRE have a worldwide perspective, or is it euro-centric?
  9. Brian: Seems like plenty of coverage of US universities
  10. Brian: Haven’t used SPARQL endpoint, but their search API results included disambiguation work. Looks like some interesting infrastructure behind the scenes that would be good to take advantage of.
  11. Don: Enjoyed the ETL track. The common denominator is the SPARQL transform. Proposal: can we standardize the SPARQL transforms for first-class objects (e.g. people), along the lines of the shapes concept?
    1. Michel: That is what VIVO proxy tries to do. It can communicate directly with the VIVO UI. He reverse engineered the HTML communication with the browser and automated it.
    2. If you put something like Swagger in front, you can communicate with normal REST concepts, such as a JSON transform.
    3. William: Is this proxy dependent on the current user interface? Is that adding a coupling?
    4. Michel: Yes, true. The end user will not be affected, but the proxy must be updated.
Reviewing 2019-01 Architectural Fly-in Summary (Ingest section)
1. Closest thing we have to a roadmap at this point.
2. Still seems accurate.
3. We have multiple people attacking from different directions.
4. How do we take advantage of current work and also avoid redundant effort?
5. Ingest is one of the main topics. The document has a set of 10 requirements, including that import must support both RDF and JSON.
6. Note on line number 5: could use the models used in the Freemarker UI (i.e. what Michel is doing).
7. Scroll down, next point is the UI. Freemarker will (supposedly) be deprecated by the VIVO project. The idea being that the front end would be generated using the same models used by the ingest side.
8. Q from Don: is TAMU using GraphQL? William: No, that was Duke’s initiative. TAMU’s approach allows lazy loading of large datasets (GraphQL requires that be done on the back end, so maybe some performance issues with that idea).
9. Brian: Should we still have GraphQL? William: We still have the API. Don: I think Duke is using it and CU is interested but we aren’t trailblazers.
10. William: I really like the idea of entity-centric import vs triple-centric. But it’s more complicated than that. Say you ingest authors and pubs. Then there’s another process that matches pubs to authors… is that manual, or triple-centric?
11. Michel: Large number of triples required for a person. If a person has a position, that’s ~88 statements in RDF. VIVO is generating 8-10 individuals to describe that. It requires 1-2 days of work to create that ETL.
12. Don: All the context nodes… What the hell is that?! We need a high-level doc for the non-ontologists. He must re-learn the ontology every couple of years when he revisits it.
13. William: Would be nice to have a one-stop shop for all the triples necessary to create everything we need to talk about in VIVO, but it’s a massive effort.
14. Ralph: We should have a definitive statement of ‘this is what VIVO supports’ to limit organizations expanding their own structure.
15. Brian: Extending the ontology has always been tricky. It’s not desirable, and often results in things being created that don’t make sense semantically. But one of the early appeals of Vitro was that it is flexible. There is a power in that we don’t want to lose.
16. Don: For this reference spec, we all know the target (the VIVO ontology). Somewhere before that, for ingest and extract, how about a JSON document that fits it and still covers the robustness of the VIVO ontology (e.g. author order)?
17. William: JSON schema describes the documents holding the data. It must be extensible.
18. Huda: I think in addition to shapes for entities, we'd need shapes for relationships (if such a thing exists)
19. What William seems to be referring to speaks to two areas: (a) defining the relationships that need to be made and (b) the workflow or process for making them after the main entities have been created
20. William: Yeah, that and versioned working document(s) of the RDF for these entities and relationships.
21. Brian: The real win would be hiding all the high-level ontological stuff that nobody but ontologists understands.
22. Brian: Can potentially start this week trying to draft some JSON schemas we can begin from.
23. Michel: It’s not useful to have something in the ontology if it’s not visible in the UI. We don’t want to replicate the structure of the ontology in JSON. We must replicate the UI.
24. William: Producing the UI JSON makes sense. The next step, I’m not envisioning yet. We need to create API endpoints that accept this JSON. Where does the mapping to RDF exist? Is it editable?
25. Michel: We can take a string and convert it to Java objects.
26. William: Yes, but where is the mapping? Is it a file? Group: It has to be.
27. Is the proxy in a place that the community can contribute? Michel: Yes. It is easy to do a sprint to improve/expand because it was done incrementally. It could also be used as a consumer for Kafka streams. Also, it’s a decoupled service (i.e. we could eventually remove Freemarker).
28. William: Swagger UI is not the ingest tool.
29. Brian: Should we schedule a sprint to work on this? Michel: Yep, sure. First step, open the code.
  1. How about availability?
  2. Michel: August will be out
  3. Brian: Unavailable 2nd half of July
  4. William: can’t give a time, but would like to participate
30. Data ingest group will be interested in this.
31. William: Should there be a separate repository for the JSON transformation documents?
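
Illustration of the entity-centric import idea (items 10 to 26): a minimal sketch, assuming a hypothetical mapping from JSON field names to predicates, of how one compact JSON document for a person could expand into several RDF statements, including a context node for a position. Nothing here comes from VIVO or the VIVO proxy: the `http://example.org/` namespace, the field names (`id`, `name`, `positions`, `org_id`), the `relatedBy`/`relates` predicates under that namespace and the function names are all placeholders, and the mapping is inlined only to keep the sketch self-contained (the group's view was that a real mapping would have to live in a file).

```python
import json

# Placeholder namespace: NOT the VIVO ontology, used only for illustration.
EX = "http://example.org/ontology#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

# Hypothetical mapping: entity kind -> RDF type and JSON field -> predicate.
MAPPING = {
    "person": {
        "type": EX + "Person",
        "fields": {"name": RDFS_LABEL, "email": EX + "email"},
    },
    "position": {
        "type": EX + "Position",
        "fields": {"title": RDFS_LABEL, "start": EX + "startYear"},
    },
}


def entity_triples(uri, kind, data):
    """Return the type triple plus one triple per mapped field present in data."""
    rule = MAPPING[kind]
    triples = [(uri, RDF_TYPE, rule["type"])]
    for field, predicate in rule["fields"].items():
        if field in data:
            triples.append((uri, predicate, data[field]))
    return triples


def person_to_triples(doc, base="http://example.org/individual/"):
    """Expand one entity-centric JSON document into (subject, predicate, object) triples."""
    person = base + doc["id"]
    triples = entity_triples(person, "person", doc)
    for i, pos in enumerate(doc.get("positions", [])):
        # Each position becomes its own node linking the person to an organization.
        node = f"{person}-position-{i}"
        triples += entity_triples(node, "position", pos)
        triples.append((person, EX + "relatedBy", node))
        triples.append((node, EX + "relates", base + pos["org_id"]))
    return triples


doc = json.loads(
    '{"id": "n1", "name": "Ada Example",'
    ' "positions": [{"title": "Professor", "org_id": "org1", "start": 2019}]}'
)
triples = person_to_triples(doc)
for triple in triples:
    print(triple)
```

Even this toy document yields several statements for a single position, which is the gap between the entity-centric view (one JSON record) and the triple-centric view raised in the discussion. Matching publications to authors after both have been ingested is not covered by this sketch.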

2021-06-29 - VIVO Development IG
