Agenda

Discussion about sprints' topics
1. https://docs.google.com/document/d/1hJSWAa3ENoFOYyp0GyvDqBdehra3AmFBAD9X2dX3cSo/edit?usp=sharing

Notes

Discussion about sprints' topics

https://docs.google.com/document/d/1hJSWAa3ENoFOYyp0GyvDqBdehra3AmFBAD9X2dX3cSo/edit?usp=sharing

Dragan: Let’s discuss the sprint topics. We started this discussion last time. One idea is to start with the Dynamic API, but other topics could also be interesting. I am not sure i18n is a good candidate for the next sprint topic, although it should be finished. If you have a VIVO instance in French and not all labels are 100% aligned (with the local dialect of French), I hope that does not discourage you from using VIVO. Language completion would be a nice feature. So for me i18n is functional, but not complete.

Brian: I think the most critical incomplete item is related to autocompletion. The search index is not internationalized. There is also a problem with sorting in Solr: sorted results come back out of order. It is not a big task, but it would be nice to have an official solution to this i18n-related problem.

Dragan: That doesn’t sound like enough material for a sprint.

Brian: It is not on the list and could be done in parallel; it doesn’t require a sprint.

Dragan: For searching you can define different preprocessing rules. With the German interface, should Solr apply German rules?

Brian: For now we have only English-specific filters. In addition to appropriate processing in Solr, we added a custom DocumentModifier that stores values in the appropriate per-language fields instead of storing only English. We then need to be able to sort on the appropriate field when browsing. We have one stock Solr schema from the VIVO Solr project and have put off dynamic updates using the Solr API, but we may need to look at dynamic Solr API updates to add configurations dynamically. We need to review the design to see how best to handle that.
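
A minimal sketch of the "sort on the appropriate field when browsing" idea, not VIVO's actual implementation. It assumes a hypothetical naming scheme in which each indexed language has its own sort field called `label_sort_<language>`; those field names and both function names are illustrative only. The `q` and `sort` request parameters are standard Solr query parameters.

```python
def sort_field_for_locale(locale, indexed_languages, default_language="en"):
    """Pick the (hypothetical) per-language sort field for a UI locale."""
    # Normalize "de_DE" or "de-DE" to the bare language code "de".
    language = locale.replace("_", "-").split("-")[0].lower()
    # Fall back to the default language when no field exists for this one.
    if language not in indexed_languages:
        language = default_language
    return "label_sort_" + language


def build_browse_params(locale, indexed_languages):
    """Build Solr request parameters for a browse page sorted by label."""
    field = sort_field_for_locale(locale, indexed_languages)
    return {"q": "*:*", "sort": field + " asc"}
```

With this scheme, a German interface would sort on the German field, while a language without its own field (for example Serbian, if it has not been indexed separately) would fall back to the English one.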

Dragan: Tokenizing and filtering may be easier for some languages (e.g. English, French, German, Spanish) but perhaps not as easy for languages like Serbian. Conclusion: it is important to define a strategy and a roadmap, but it may be possible to implement what we need without a sprint.

Data ingest tasks are always important and some part of VIVO in a box also covers data ingest. Georgy mentioned that dynamic API may also be useful for data ingest.

Georgy: Also added Solr queries to dynamic API document because there may be use cases for this.

Dragan: Data ingest may be a good topic for a sprint. It seems more realistic to have the Dynamic API as the first sprint topic, with data ingest later. When is it feasible to have a roadmap for data ingest?

Ralph: Directions have changed recently. Went back and reviewed what the Data Ingest task force had been doing. Plan is to start back at beginning of year and look directly at VIVO in a box and some of the new directions we may want to take with Solr, etc.

Dragan: Can expect progress in first half of next year, and could be topic for sprint in May?

Ralph: April/May makes sense

Dragan: Dynamic API may take us two sprints.

Dragan: Many requests from institutions regarding customization.

Brian: Might be interesting sprint topic if we get new people on board. Implementing it right now under old front-end.

Dragan: A few lines about documentation for VIVO installation. Might be good to spend one sprint on this documentation, and on updating and improving the tests. But this may not be a popular sprint. The work needs to be done, but not sure if this is a sprint.

Decoupling VIVO ontology and platform: Separate ontology into own repository and to make platform independent of changes in the ontology. Might be that dynamic API could also help us. Might be an interesting topic. Would be a candidate after Dynamic API, which would help with custom entry forms. Would localize ontology for local needs.

Brian: Dynamic API seems like it would go a long way toward decoupling ontology-specific editing logic from the editing and getting rid of Java generator classes. Later, could look through list views, templates, etc. Those look like more easily definable tasks. Also have a standard way of adding data to the triple store.

Dragan: Cross-site linking and searching. Seems like it could be a primary motivation for using VIVO. Candidate for a sprint, but question is when would this happen.

Brian: Strategic high-level planning questions to resolve here. Thorny issue. At the beginning of the NIH project, this was a topic that was discussed. Now, need to consider what should be addressed strategically.

William: Need to decompose into search, discovery, queries. Decentralization is good, but not fan of harvesting since it seems counter to the linked data idea.

Dragan: Harvesting from multiple sites and then generating a new search, but that seems like it’s not using semantic web paradigms.

William: Michel had workflow/process ideas on being able to query multiple endpoints.

Dragan: May have some dilemmas regarding federated search. Centralized or decentralized? Privacy concerns? Public interface for reasoning, searching, etc. in addition to public interface. This topic seems like a candidate but may not be a sprint candidate in the nearer future.

Also there is a Box folder with ideas: “VIVO post 1.12 development priorities” (https://docs.google.com/spreadsheets/d/103P9P4v6yUBSb5BnVaK40NoGx1fIYyL8uaHKUubZWbE/edit#gid=0&range=64:72). Does this list include a lot of boring tasks or those that require high expertise from implementers? Section in this document called “semantic web features”.

Huda: What does line 70 “Indexer module to weight and index the concepts from the SKO’s reasoner” refer to?

Brian: SKOS. Prior row talks about being able to take advantage of SKOS narrower/broader relationships. This row talks about modifying search ranking/relevance using SKOS information.
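
One way to picture how SKOS narrower/broader relationships could feed into search ranking is sketched below. This is an illustration only and not the design referred to in the spreadsheet: the function name, the plain-dictionary representation of narrower links, and the `decay` and `max_depth` parameters are all assumptions made for the example.

```python
from collections import deque


def expand_with_narrower(concept, narrower, decay=0.5, max_depth=2):
    """Walk narrower-concept links breadth-first, assigning decaying weights.

    `narrower` maps a concept to the list of its directly narrower concepts.
    The searched concept gets weight 1.0; each step down multiplies by `decay`.
    """
    weights = {concept: 1.0}
    queue = deque([(concept, 0)])
    while queue:
        current, depth = queue.popleft()
        if depth == max_depth:
            continue  # Do not expand below the depth limit.
        for child in narrower.get(current, []):
            if child not in weights:  # Keep the first (shallowest) weight.
                weights[child] = weights[current] * decay
                queue.append((child, depth + 1))
    return weights
```

The resulting weights could then be used as per-term boosts, so that a search for a broad concept also matches documents tagged with narrower concepts, ranked lower the further they are from the searched term.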

Dragan: Are any of these topics in this document urgent enough that they cannot wait until next year?

Brian: The TDB bug in row 67 may be a good improvement to work on in the short term and would reduce the questions on Slack. The other rows don’t seem like super high priority for the majority of the community.

Dragan: Complexity not too high?

Brian: For row 67, not high. Row 68 (hard-coded SPARQL queries in list views vs. ontology-driven) is medium complexity.

Dragan: Working with DSpace. Would not be a sprint, but a pilot project. Needs to be presented at leadership group 12/15. Realistic that it would be work we could look at in January.

Integration with repositories task: Not sprint candidate and would wait to work on it.

Exporting data in different formats: OpenAIRE. Any other requests for other formats? Christian said that some institutions in Eastern Europe would want to adopt VIVO but need different formats. First step, would be CERIF to VIVO mapping. Not sure if this is a sprint candidate, if only OpenAIRE. If there are additional formats and we want to make an architecture for enabling people to specify which format they want, would be useful to consider as increases interoperability of system and data.

Brian: Hard to evaluate importance in vacuum. If large number of institutions need a solution to make their data available to OpenAIRE, could be an important driver of adoption even if just one format.

Dragan: Testing? Need explicit sprint? Not a perfect candidate.

VIVO in a box: Focus has moved a little in the last couple of months.

Brian: Benjamin Gross articulated that the initial VIVO in a box idea was conceived as minimal work to bring the different portions together, but scope creep happened. Not sure whether we wait to fix everything we want to fix and then that becomes VIVO in a box, or take what we have and put it together.

Dragan: Decoupling front and back end. Not sure it needs to be the first sprint next year, probably later. Need to define more concrete tasks in this area.

Expert finder: some VIVO installations actually center “looking for expert” on the front page.

Completion of i18n tasks: Should be implemented but not in a sprint

Sprint schedule: 4 per year

February: Dynamic API based on ontology. Good to have something prepared for sprint. Have plan for implementation before the sprint. Making code for proof of concept.

Georgy: Discussed with Brian. Have good thoughts about implementation approach. Make it extensible and not rigid. Make it testable.

Please feel free to add your comments and thoughts: https://docs.google.com/document/d/1hJSWAa3ENoFOYyp0GyvDqBdehra3AmFBAD9X2dX3cSo/edit

Draft notes on Google Drive

Task List

Dragan Ivanovic to align topics section of the Roadmap 1.x document with the meeting discussion
Dragan Ivanovic to align VIVO development priorities with the meeting discussion
All - Add comments to Georgy's document