Deprecated. This material represents early efforts and may be of interest to historians. It does not describe current VIVO efforts.


| Task Name | Time Est. (hours) | % Done | Assignee | Link to section |
|---|---|---|---|---|
| Setup Development Environment | 8 | 0 | | 1 |
| Get URIs for Institution | 32 | 0 | | 2 |
| Get data for an individual URI | 32 | 0 | | 3 |
| Mockup of Search UI | 24 | 0 | | 4 |
| Create Solr Doc from data for URI | 40 | 0 | | 5 |
| Working search UI prototype | 40 | 0 | | 6 |
| Basic multi-node Hadoop cluster on IaaS | 40 | 0 | | 7 |
| Automated and scripted cluster on IaaS | 40 | 0 | | 8 |
| Data validation code for Institution's data | 80 | 0 | | 9 |

1 Set up development environment

Set up development server.

Document single node Hadoop setup.

Development Solr service setup.

Ant/Ivy build

Git repository

Wiki/git README documentation.

2 Develop code to build list of URIs to index for Institution from standard 1.5.1 VIVO instance

The prototype has code that extracts URIs from Catalyst pages (CatalystPageToURIs.java) and code that parses the JSON from VIVO (ParseDataSErviceJson.java). URI discovery for Catalyst and VIVO is implemented in VivoUriDiscoveryWorker.scala and CatalystDiscveryWorker.scala, under LinkedDataIndexer/src/main/scal/edu/cornell/indexbuildere/discovery. These files can serve as examples, but they depend heavily on the Akka framework, which we'd like to move away from.
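As a rough illustration of discovery without Akka, the sketch below fetches pages of URIs in parallel with a plain java.util.concurrent thread pool and merges the results. It is not taken from the prototype: the class name UriDiscovery, the fetchPage callback, and the assumption that the page count is known up front are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.IntFunction;

/** Hypothetical Akka-free discovery: fetch pages of URIs in parallel and merge them. */
final class UriDiscovery {

    /**
     * @param pageCount number of result pages to request (assumed known up front)
     * @param fetchPage hypothetical callback returning the URIs found on one page
     * @param threads   size of the worker pool
     * @return de-duplicated URIs in sorted order
     */
    static Set<String> discover(int pageCount, IntFunction<List<String>> fetchPage, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submit one task per page.
            List<Future<List<String>>> pending = new ArrayList<>();
            for (int page = 0; page < pageCount; page++) {
                final int p = page;
                Callable<List<String>> task = () -> fetchPage.apply(p);
                pending.add(pool.submit(task));
            }
            // Merge results; the TreeSet drops duplicates and sorts the URIs.
            Set<String> uris = new TreeSet<>();
            for (Future<List<String>> f : pending) {
                uris.addAll(f.get());
            }
            return uris;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("URI discovery interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A page fetch failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

In practice the callback would wrap whatever parsing the institution's site needs, for example the logic in CatalystPageToURIs.java or ParseDataSErviceJson.java.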

3 Develop code to gather data required for an individual URI

See UrisForDataExpansion.java for an example of how this was done in the prototype.
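One possible way to gather the data for a single URI is a linked data request with content negotiation. The sketch below is an assumption, not a description of what UrisForDataExpansion.java does: it assumes the site returns RDF/XML when asked for application/rdf+xml, and the class and method names are hypothetical.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

/** Hypothetical helper that requests the RDF description of one individual. */
final class LinkedDataFetcher {

    /** Builds a GET request asking for RDF/XML (assumed to be supported by the site). */
    static HttpRequest rdfRequest(String individualUri) {
        return HttpRequest.newBuilder(URI.create(individualUri))
                .header("Accept", "application/rdf+xml")
                .timeout(Duration.ofSeconds(30))
                .GET()
                .build();
    }

    /** Linked data servers often redirect to the document URL, so follow redirects. */
    static HttpClient newClient() {
        return HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
    }

    /** Sends the request and returns the response body, failing on any non-200 status. */
    static String fetchRdf(HttpClient client, String individualUri)
            throws IOException, InterruptedException {
        HttpResponse<String> response =
                client.send(rdfRequest(individualUri), HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected status " + response.statusCode()
                    + " for " + individualUri);
        }
        return response.body();
    }
}
```

The returned RDF would then be parsed into a model, and any related URIs needed for the search document could be fetched the same way.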

4 Mockup of search UI

5 Develop code to build and index Solr document from data for URI

This task depends on the mockup of the search UI (task 4), which is needed to develop the schema for the Solr index.

SolrDocWorker.scala uses the DocumentModifier from the Vitro code to generate a Solr document from a model for a URI. We may want to reuse this approach. Much of this code is in LinkedDataIndexer/src/main/java/edu/cornell/mannlib/vitro/webapp/search/solr. MultiSiteIndexToDoc.java contains a new translator that works well without the webapp context, and there are new DocumentModifiers that are needed for multi-site indexing.
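To make the target of this step concrete, here is a minimal sketch that turns a set of field values into a Solr XML update message. It does not use Vitro's DocumentModifier or SolrJ, and the field names in the usage note are placeholders, since the real schema depends on the search UI mockup.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical serializer from field values to a Solr XML add message. */
final class SolrXmlDoc {

    /** Escapes the characters that are significant in XML text and attribute values. */
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    /** Emits one field element per value, so multi-valued fields simply repeat. */
    static String toAddXml(Map<String, List<String>> fields) {
        StringBuilder xml = new StringBuilder("<add><doc>");
        for (Map.Entry<String, List<String>> entry : fields.entrySet()) {
            for (String value : entry.getValue()) {
                xml.append("<field name=\"").append(escape(entry.getKey())).append("\">")
                   .append(escape(value))
                   .append("</field>");
            }
        }
        return xml.append("</doc></add>").toString();
    }
}
```

For example, a map with a placeholder id field holding the individual's URI and a placeholder name field would produce a single doc element that can be posted to the Solr update handler.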

6 Working Prototype of Search UI

Decide how the search UI will be served and how the UI client will communicate with the Solr service.
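One of the options to weigh is having the client, or a thin proxy in front of it, query Solr's select handler directly. The sketch below only shows what such a request URL might look like. The core URL in the usage note and the choice of JSON responses are assumptions, not decisions recorded here.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Hypothetical builder for a Solr select request URL. */
final class SolrQueryUrl {

    /**
     * @param solrCoreUrl base URL of the Solr core, without a trailing slash
     * @param userQuery   raw query text typed by the user
     * @param rows        maximum number of results to return
     */
    static String selectUrl(String solrCoreUrl, String userQuery, int rows) {
        // URL-encode the user's text so spaces and punctuation are safe in the query string.
        String q = URLEncoder.encode(userQuery, StandardCharsets.UTF_8);
        return solrCoreUrl + "/select?q=" + q + "&rows=" + rows + "&wt=json";
    }
}
```

Calling selectUrl("http://localhost:8983/solr/multisite", "stem cells", 10) yields http://localhost:8983/solr/multisite/select?q=stem+cells&rows=10&wt=json, where the core name multisite is a placeholder.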

7 Explore multi-node Hadoop cluster deployed to IaaS   

8 Scripted deploy of multi-node Hadoop cluster on IaaS

9 Data Validation code for institution's data
