Deprecated. This material represents early efforts and may be of interest to historians. It does not describe current VIVO efforts.

Attendees

Jonathan Corson-Rikert
Brian Caruso
Brian Lowe
Jim Blake
Jonathan Markow
Andrew Woods 

Call-in Details

Call-in toll-free number (US/Canada): 1-855-244-8681 
Call-in toll number (US/Canada): 1-650-479-3207
Access code: 646 242 414

Agenda

  1. Identifying the technical tasks required to bring a new implementation of the vivosearch.org prototype online as a cloud service
  2. Estimating the developer resources and time required

Discussion

Wiki updates
  1. Brian has done some wiki gardening
  2. Moved content from another wiki space into VIVOSearch
Goals and Approach Review

wiki page

  1. Review in light of stages/phases
  2. If we do the same thing as before, users will notice disambiguation issues
  3. How to resolve the disambiguation issue?
    • Come up with a community approach
      • Rely on partner organizations to map URLs of the same user
  4. Last VIVO management meeting...
    • VIVOSearch is promising, but involves considerable work to get there
    • This will need to be explained to people
    • Existing VIVOSearch should be considered a prototype
    • This communication could encourage CTSAs to contribute
  5. Need to create estimates of the effort it will take to get to the first phase
    • Indicate what functionality falls into each phase
  6. Features of search
    • full text - yes
    • semantic - future
    • faceted - yes
    • complex queries - future
    • people - yes
    • publications, organizations, etc - yes
  7. Description of approach
    • ok
  8. Approach to building index
    • Same as what is currently in place
    • Current VIVO does not have a direct way to get institutional URIs
    • VIVO used to get RDF for each URI, then make subsequent requests as needed
      • Can investigate new approaches
    • Policy questions
      • How much data do we want to get from each resource (e.g. people)?
      • This is the kind of thing that needs to be asked of the institutions
      • Suggestion to collect these tasks in a spreadsheet
        • Include time estimates, and outstanding questions
      • How to determine when external resources have changed?
  9. Keeping the index up-to-date
    • The hope is that the approach is the same as for building the index, with different input
  10. Alternatives
    • Can possibly be raised with the broader community
  11. Technology choices
    • HTTP for retrieving RDF, yes
    • What is the adoption of SPARQL in the community?
    • It may be nice to demonstrate that a SPARQL endpoint is not needed to enable interesting results
    • Solr, seems reasonable for now
      • Considering having Solr in one place versus distributed Solr (master/slaves)
    • Web interface: Drupal with solrsearch.js
      • Most work is on the client side with JS
      • This continues to be appealing
      • We have limited insight into this component
      • Suggestion to create list of default technologies, criteria, and alternatives
    • Hadoop is currently a reasonable choice
    • Ruby (Blacklight/Hydra) or Drupal?
      • The JS pattern allows for minimal reliance on Drupal
    • Need a mock-up of the UI to inform design of the Solr index
    • Bootstrap is an interesting JS framework to consider
    • Drupal upgrade cycle can be onerous
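
Illustration of the index-building approach noted under item 8 (get the RDF for each URI, then make subsequent requests as needed). This is only a minimal sketch, not the existing VIVOSearch code. The names `crawl`, `fetch`, `max_depth`, and the example.org data are all hypothetical, and the HTTP retrieval is replaced by an in-memory stand-in so the sketch runs without a network. The `max_depth` parameter is one possible way to express the policy question of how much data to collect per resource.

```python
from collections import deque
from typing import Callable, Dict, Iterable, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object), all as plain strings


def crawl(seeds: Iterable[str],
          fetch: Callable[[str], List[Triple]],
          max_depth: int = 1) -> Dict[str, List[Triple]]:
    """Fetch RDF for each seed URI, then follow linked URIs up to max_depth hops."""
    seen: Dict[str, List[Triple]] = {}
    queue = deque((uri, 0) for uri in seeds)
    while queue:
        uri, depth = queue.popleft()
        if uri in seen:
            continue  # already retrieved; avoid duplicate requests
        triples = fetch(uri)
        seen[uri] = triples
        if depth < max_depth:
            for _subj, _pred, obj in triples:
                # Assumption: objects that look like HTTP URIs are worth following.
                if obj.startswith("http") and obj not in seen:
                    queue.append((obj, depth + 1))
    return seen


# Hypothetical in-memory data standing in for RDF served over HTTP.
FAKE_RDF: Dict[str, List[Triple]] = {
    "http://example.org/person/1": [
        ("http://example.org/person/1", "ex:name", "Ada"),
        ("http://example.org/person/1", "ex:authorOf", "http://example.org/pub/9"),
    ],
    "http://example.org/pub/9": [
        ("http://example.org/pub/9", "ex:title", "A paper"),
        ("http://example.org/pub/9", "ex:venue", "http://example.org/journal/3"),
    ],
}


def fake_fetch(uri: str) -> List[Triple]:
    """Stand-in for an HTTP request that returns the RDF for one URI."""
    return FAKE_RDF.get(uri, [])


result = crawl(["http://example.org/person/1"], fake_fetch, max_depth=1)
```

With `max_depth=1` the sketch retrieves the person and the linked publication but does not go on to the journal; raising or lowering the value changes how much is pulled in per resource.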
Tasks Review

wiki page

  1. Brian Caruso has limited availability, in 2-week windows
  2. Need to clearly define what goes into phase 1
  3. Request to create comprehensive planning spreadsheet
    • Columns for time estimates, phase, issues, etc.
  4. Need to meet more regularly
Next steps
  1. ...