...

  • Change directory to example-scripts/example-pubmed
  • Edit the pubmedfetch.config.xml file
    • Set the email parameter to your email address
    • Set the termSearch parameter to your search query. The search term uses the same syntax as the search on pubmed.org
    • For more information on these parameters and their use, please see PubmedFetch
  • Edit the vivo.model.xml file
  • Edit the changenamespace-authors.config.xml, changenamespace-authorship.config.xml, changenamespace-journal.config.xml, and changenamespace-publication.config.xml files and set the namespace parameters in each one to your VIVO namespace
    • For more information on these parameters and their use, please see ChangeNamespace
  • Edit the run-pubmed.sh file and set HARVESTER_INSTALL_DIR= to the directory where you unpacked the harvester
  • Run bash run-pubmed.sh
  • Restart tomcat and apache2. You may also need to force the search index to rebuild before the new data appears. To rebuild it, open the following URL in a browser: http://your.vivo.address/vivo/SearchIndex. This requires site admin permission and prompts you to log in if you are not already logged in.
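
The run-pubmed.sh edit above can also be scripted. The following is a minimal sketch, not part of the harvester: set_install_dir is a hypothetical helper, and it assumes the HARVESTER_INSTALL_DIR= assignment starts at the beginning of a line and that the directory path contains no `|` or `&` characters.

```shell
# Sketch only: set_install_dir is a hypothetical helper, not shipped with
# the harvester. It assumes the assignment starts at the beginning of a line.
set_install_dir() {
    local script
    local dir
    local tmp
    script="$1"
    dir="$2"
    tmp="$(mktemp)" || return 1
    # Rewrite the HARVESTER_INSTALL_DIR= line; every other line is copied as-is.
    if ! sed "s|^HARVESTER_INSTALL_DIR=.*|HARVESTER_INSTALL_DIR=${dir}|" "$script" > "$tmp"; then
        rm -f "$tmp"
        return 1
    fi
    # Write the result back into the original file so it keeps its permissions.
    cat "$tmp" > "$script"
    rm -f "$tmp"
}
```

For example, `set_install_dir run-pubmed.sh /path/to/harvester` followed by `bash run-pubmed.sh`, where /path/to/harvester is a placeholder for your own install directory.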

...

Once you are ready to run a large dataset, it is advisable to switch the record storage from files to a database. Although this makes individual records harder to find, it improves speed and performance during the fetch and translate stages. To do so:

  • Edit the raw-records.config.xml file to use TDB, which is a semantic data store

    <RecordHandler>
            <Param name="rhClass">org.vivoweb.harvester.util.repo.JenaRecordHandler</Param>
            <Param name="type">tdb</Param>
            <Param name="dbDir">data/raw-records</Param>
    </RecordHandler>
    
  • Edit the translated-records.config.xml to use TDB, which is a semantic data store

    <RecordHandler>
            <Param name="rhClass">org.vivoweb.harvester.util.repo.JenaRecordHandler</Param>
            <Param name="type">tdb</Param>
            <Param name="dbDir">data/translated-records</Param>
    </RecordHandler>
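
Before starting a long run, it may be worth confirming that both files were actually switched. The sketch below is an illustration under assumptions: uses_tdb and check_record_configs are hypothetical helpers, not harvester commands, and they only look for the two Param lines shown above as literal text rather than parsing the XML.

```shell
# Sketch only: both functions are hypothetical helpers, not harvester tools.
# They match the Param lines as fixed strings and do not parse the XML.
uses_tdb() {
    # Succeeds when the file names JenaRecordHandler and sets type to tdb.
    grep -qF '<Param name="rhClass">org.vivoweb.harvester.util.repo.JenaRecordHandler</Param>' "$1" &&
    grep -qF '<Param name="type">tdb</Param>' "$1"
}

check_record_configs() {
    local f
    local status
    status=0
    for f in "$@"; do
        if uses_tdb "$f"; then
            echo "ok: $f"
        else
            echo "not tdb: $f"
            status=1
        fi
    done
    # Non-zero when at least one file is not configured for TDB.
    return "$status"
}
```

For example, `check_record_configs raw-records.config.xml translated-records.config.xml` prints one line per file and returns a non-zero status if either file still uses a different record handler type.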