...
Attendees
Indicating note-taker
- Ralph O'Flinn
- Jim Blake
- Huda Khan
- Tim Worrall
- Don Elsborg
- Kitio Fofack
- Andrew Woods
- Mike Conlon
- Christian Hauschke
- Alex Viggio
- Brian Lowe
Agenda
- Report from the field: ElasticSearch instead of Solr (Jim Blake)
- Mailing list queries
- Documenting ingest approaches
- CU Boulder: VIVO-Harvester (direct to Jena)
- Cornell: home-grown (sparql-update)
- Brian Lowe: home-grown, or extensions (sparql-update)
- Sept sprint planning
- Active tickets:
- VIVO-1501 (pending response - Benjamin Gross)
- VIVO-1524 (Muhammad Javed - to review)
- VIVO-1451 (Kitio Fofack - where does this stand?)
- Planning for a demo and walk-through of:
- VIVO-1436 - Modularizing VIVO (needs reviewers)
- Search index
- Triplestore
- Frontend UI
- VIVO-1443
Notes
ElasticSearch instead of Solr
- Swapping in ElasticSearch 6.3 for Solr.
- Jim demonstrated VIVO 1.8 running ElasticSearch.
...
- Downloaded ElasticSearch last week and spent a couple of days creating a VIVO driver.
- Needs:
- Docs
- Smoke test
- Functional testing
- Improved snippets
- Code improvements
- Unit tests
- Automatic initialization of the index
- Why Elasticsearch
- Create options for sys admins.
- Some sites are already bought in to ElasticSearch. They love it.
- Certainly should not run in parallel. Should be implemented as part of VIVO.
- Put more data in the index
- ElasticSearch has nested fields that keep their relationships, e.g. publication URIs and publication names stored on the author’s record.
- Does the current version of Solr have these features?
- Ownership characteristics – both seem okay? VIVO is rather small by ElasticSearch standards?
- ElasticSearch has an Apache license
- Both ElasticSearch and Solr are based on underlying Lucene technology
- See, in case it helps (not sure whether the article is 'good' or not): https://www.searchtechnologies.com/blog/solr-vs-elasticsearch-top-open-source-search
- See also https://sematext.com/blog/solr-vs-elasticsearch-differences/ and https://db-engines.com/en/ranking/search+engine (which ranks Elasticsearch #1 by its own ranking methodology)
- Perhaps not a search for “best” but which is best in a particular environment and for a particular application
- Since Solr 5 (current is Solr 7), Solr is a free-standing application, like ElasticSearch.
- Explore nested documents in ElasticSearch. Current VIVO interface
- Developed in 1.8 because the Ant environment is so much more productive than the 1.9/1.10 Maven environment.
- Are there touch points with product evolution?
- Does the nested doc capability, and the ability to have ElasticSearch in the architecture resonate with product evolution?
- Product Evolution is looking at GraphQL for its API capabilities.
- How do we see applications related to VIVO being installed? By the installer?
- For try out, a jar?
- For try out, a VM?
- For production, apps must be installed? As we require MySQL and Tomcat now? Solr or ElasticSearch in the future?
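The nested-field idea noted above (publication URIs and names kept on the author's record) could look roughly like the sketch below. It only builds the JSON bodies that would be sent to ElasticSearch; it is not the driver Jim demonstrated. The field names (`uri`, `name`, `publications`) are assumptions for illustration, not VIVO's actual index schema.

```python
import json


def author_properties():
    """Field definitions for a hypothetical author document.

    The "nested" type keeps each publication's uri and name together,
    so a query cannot match the uri of one publication against the
    name of another. All field names here are illustrative assumptions.
    """
    return {
        "uri": {"type": "keyword"},
        "name": {"type": "text"},
        "publications": {
            "type": "nested",
            "properties": {
                "uri": {"type": "keyword"},
                "name": {"type": "text"},
            },
        },
    }


def publications_query(title_words):
    """Search body: find authors who have a publication matching title_words."""
    return {
        "query": {
            "nested": {
                "path": "publications",
                "query": {"match": {"publications.name": title_words}},
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(author_properties(), indent=2))
    print(json.dumps(publications_query("linked data"), indent=2))
```

How the properties are wrapped in an index mapping differs between ElasticSearch versions, so that part is left out here.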
Ingest Approaches
- Add ingest tools to the table in the apps and tools catalog
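For reference when documenting the sparql-update approaches listed in the agenda, a minimal sketch of what such an ingest step might build is shown below. It is not Cornell's or Brian Lowe's tooling. The graph URI, subject URI, and the form parameter names (`email`, `password`, `update`) are assumptions to check against the VIVO SPARQL Update API documentation for the installed version. Nothing is sent over the network.

```python
from urllib.parse import urlencode

RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def insert_label_update(graph_uri, subject_uri, label):
    """Build a SPARQL 1.1 INSERT DATA request that adds one rdfs:label."""
    # Escape backslashes and double quotes so the literal stays well-formed.
    escaped = label.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "INSERT DATA {\n"
        f"  GRAPH <{graph_uri}> {{\n"
        f'    <{subject_uri}> <{RDFS_LABEL}> "{escaped}" .\n'
        "  }\n"
        "}"
    )


def form_body(update, email, password):
    """URL-encode the update as a form body (parameter names are assumed)."""
    return urlencode({"email": email, "password": password, "update": update})


if __name__ == "__main__":
    update = insert_label_update(
        "http://example.org/graph",      # hypothetical graph
        "http://example.org/person/n1",  # hypothetical individual
        "Example Person",
    )
    print(update)
    print(form_body(update, "admin@example.org", "not-a-real-password"))
```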
Topics for September Sprint
- abox/tbox topic
- ElasticSearch
- Internationalization
- Decoupling?
Planning for walk-through of large pull-request
- Graham’s pull request is a big one. May need some additional hands.
Previous Actions
- Don Elsborg to document "firsttime" resolution in the CU Boulder wiki, circulate this doc to the email list, and discuss as a team how to integrate it into the VIVO documentation
- Moved initialTBoxAnnotations.n3 back to firsttime. Need to edit this to change a few labels, e.g. authors → "CU Boulder Authors"
- Had some changes in propertygroups.rdf, in 1.7 this was moved to firsttime. Left it there now
Actions
- Alex Viggio will bring news of Elasticsearch instead of Solr up with Product Evolution. Might there be consequences for the September sprint?