...

  1. Some parts of the technology stack follow directly from the goal of indexing data from VIVO.
    • Using HTTP requests for RDF to gather data from the sites is the most direct approach.
    • Most other options for gathering data from the VIVO sites would need additional coding.
  2. In general we would go with Solr for the search index because we have experience with it and because it is well documented, mature, and has distributed features.
  3. As of 2012, vivosearch.org uses Drupal and the solrsearch JavaScript libraries.  The JavaScript libraries allow the search UI to be developed with only client-side interaction ( https://github.com/evolvingweb/ajax-solr ).
    • This choice could be revisited for the multi-site VIVO search project. 
  4. In order to scale the process out, we were planning to use Hadoop to manage parallel tasks.
    • Many approaches to the problem of indexing linked data from VIVO sites would be embarrassingly parallel.
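
As a rough illustration of points 1 and 4, the sketch below requests RDF over HTTP by sending an Accept header, then fetches a list of URIs as independent tasks. It is only a sketch under assumptions: that a VIVO site answers content negotiation for application/rdf+xml, that the function names (build_rdf_request, fetch_rdf, fetch_all) are hypothetical helpers invented here, and that a local thread pool stands in for Hadoop purely to show that no task depends on another.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import Request, urlopen

# Assumption: the VIVO site serves RDF/XML when asked via the Accept header.
RDF_ACCEPT = "application/rdf+xml"


def build_rdf_request(uri):
    """Build an HTTP GET request asking for an RDF representation of `uri`."""
    return Request(uri, headers={"Accept": RDF_ACCEPT})


def fetch_rdf(uri, timeout=30):
    """Fetch the RDF for one URI and return the response body as bytes."""
    with urlopen(build_rdf_request(uri), timeout=timeout) as response:
        return response.read()


def fetch_all(uris, fetch=fetch_rdf, workers=4):
    """Fetch every URI as an independent task and map each URI to its result.

    The thread pool is a stand-in for a real job scheduler such as Hadoop;
    `fetch` is injectable so the fan-out can be exercised without a network.
    """
    uris = list(uris)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(zip(uris, pool.map(fetch, uris)))
```

Calling fetch_all with a list of individual URIs from a site (for example a hypothetical http://vivo.example.edu/individual/n1234) would return a dictionary of URI to RDF bytes, which a later step could parse and turn into Solr documents. Because each fetch is independent, the same per-URI function could in principle be handed to any parallel framework.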

...