Overview

This procedure describes how a system administrator can install the Image Reproducible Harvest application on a Debian Lenny operating system. The utility must be installed on the same server as the Harvester in order to function. The procedure also covers configuring the Harvester to allow image harvesting.

Installation

Prerequisites

  • Debian Lenny or newer operating system (Linux OS)
  • VIVO 1.2.x installed and configured
  • Latest version of the Harvester installed and configured to communicate with the VIVO database
  • Mail transfer agent (e.g. Exim4) configured for sending outbound mail
  • ImageMagick 6.6.2-6 or newer installed

UF-Specific Prerequisites

  • Apache ActiveMQ 5.4.x installed (Note: version 5.5.x may cause incompatibility errors) [See Procedures in Addendum for instructions]
  • Gator1 public certificate imported to the JRE keystore [See Procedures in Addendum for instructions]
  • People in VIVO harvested by the PeopleSoft Harvester, or people in VIVO with UFIDs

Download the Image Harvest package from the VIVO SourceForge web site (Example Harvest Scripts Downloads).

Image Harvest Configuration

  • cd /usr/share/vivo/harvester/example-scripts/example-images
  • Ensure that the settings in /usr/share/vivo/harvester/example-scripts/example-images/vivo.model.xml match the settings in VIVO’s deploy.properties file
    • nano vivo.model.xml
    • Set the username in: USERNAME
    • Set the password in: PASSWORD
  • Move required files
    • cd /usr/share/vivo/harvester/example-scripts/example-images/
    • mv harvester-image harvester.jar /usr/share/vivo/harvester/bin
    • mv activemq-all-5.5.0.jar /usr/share/vivo/harvester/bin/dependency
  • Configure image.sh shell script
    • nano image.sh
    • EMAIL_RECIPIENT=youremailaddress@ufl.edu
    • Set the MAX_FECTHED variable to the number of images to fetch in a single harvest, or set it to 0 (zero) to fetch all images from the queue
  • Configure run-image.sh shell script
    • nano run-image.sh
    • VIVO_LOCATION_IN_TOMCAT_DIR="Provide the path of vivo located in your tomcat"
    • HARVESTER_INSTALL_DIR="Provide path to harvester installation directory"
  • Configure ActiveMQ connection
    • nano system.properties
    • Set “vivo-server-url” to the server on which VIVO is running. For example, if it is running on localhost, add “vivo-server-url=http://localhost:8080/vivo/”
    • Set the connection username and password as provided by the ActiveMQ provider.
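The system.properties edit above can also be scripted. The following is a minimal sketch, assuming system.properties holds plain key=value lines; the helper name set_property is hypothetical and is not part of the Harvester. It relies on GNU sed (the default on Debian) and on values that contain no "|" or "&" characters.

```shell
# Hypothetical helper (not part of the Harvester): set or add a key=value
# line in a properties file. Assumes one "key=value" entry per line.
set_property() {
    local file="$1" key="$2" value="$3"
    if grep -q "^${key}=" "$file"; then
        # Replace the existing line. '|' is used as the sed delimiter so
        # that URLs containing '/' do not need escaping.
        sed -i "s|^${key}=.*|${key}=${value}|" "$file"
    else
        # Key not present yet: append it to the end of the file.
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}

# Example usage (run from the example-images directory):
# set_property system.properties vivo-server-url http://localhost:8080/vivo/
```

Keeping a copy of the original file before running such a helper makes it easy to revert a mistaken value.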

Harvest Images

  • Execute pre-harvest analytics [See addendum for example SPARQL query]
  • sudo bash image.sh
  • Wait for console output to state “Successfully Harvested Images” (This may take several minutes depending on the number of people in VIVO)
  • Check console output for Harvest Execution time.
  • Review images in VIVO web application
  • Review the Harvester log file in /usr/share/vivo/harvester/example-scripts/example-images/logs/example-images.DATETIME.log
  • Execute post-harvest analytics [See addendum for example SPARQL query]
  • Review the email log. Individual users can be verified against their associated image by clicking on the URL provided in the email.
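Because each run writes a log named example-images.DATETIME.log, it can be handy to locate the newest one automatically. This is a small sketch; the helper name latest_log is hypothetical, and it assumes log file names contain no spaces or newlines.

```shell
# Hypothetical helper: print the most recently modified harvest log
# in the given logs directory (prints nothing if none exists).
latest_log() {
    local dir="$1"
    # ls -t sorts by modification time, newest first.
    ls -t "$dir"/example-images.*.log 2>/dev/null | head -n 1
}

# Example usage:
# tail -n 50 "$(latest_log /usr/share/vivo/harvester/example-scripts/example-images/logs)"
```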

Schedule as CRON Job
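One possible way to schedule the harvest is a crontab entry that changes to the example-images directory and runs image.sh. The sketch below only builds and prints such an entry. The 02:00 daily schedule and the cron.log file name are assumptions chosen for illustration, not values from this procedure.

```shell
# Directory used throughout this procedure.
HARVEST_DIR=/usr/share/vivo/harvester/example-scripts/example-images

# Assumed schedule: every day at 02:00. The cron.log name is hypothetical.
CRON_LINE="0 2 * * * cd $HARVEST_DIR && bash image.sh >> $HARVEST_DIR/logs/cron.log 2>&1"

# Print the entry so it can be reviewed before it is installed.
echo "$CRON_LINE"

# One way to add it to root's crontab (image.sh is run with sudo above):
#   ( sudo crontab -l 2>/dev/null; echo "$CRON_LINE" ) | sudo crontab -
```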

Configuration Changes for Running the Image Harvest Without ActiveMQ

  • To run the image ingest without ActiveMQ, follow the instructions below:
  • In the image.sh script, comment out the line:

    #harvester-image -p $HARVESTER_INSTALL_DIR/example-scripts/example-images [This line pulls images from ActiveMQ]

  • Create an "images" directory in example-images and move the images into it without any file extensions, i.e. each file should be named with just the UFID (for example, 11112222 instead of 11112222.jpeg).
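The renaming step above can be sketched as a short shell function. The name stage_images is hypothetical; the sketch copies rather than moves the files so the originals are kept, and it assumes every source file is already named UFID.extension.

```shell
# Hypothetical helper: copy every regular file from a source directory into
# the "images" directory, dropping the extension (11112222.jpeg -> 11112222).
stage_images() {
    local src="$1" dest="$2" f name
    mkdir -p "$dest"
    for f in "$src"/*; do
        [ -f "$f" ] || continue        # skip sub-directories and empty globs
        name=$(basename "$f")
        cp "$f" "$dest/${name%.*}"     # ${name%.*} strips the last ".ext", if any
    done
}

# Example usage (the source path is a placeholder):
# stage_images /path/to/source/photos /usr/share/vivo/harvester/example-scripts/example-images/images
```

Files that already have no extension are copied under their existing name.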

Addendum

Procedures

Install Root Certificate from UF BSD Site

  • Open Firefox on any computer and browse to https://bsd-dev-activemq.bsd.ufl.edu:61617
  • A dialog box warning you about the certificate will probably appear. Click Add Exception --> View --> Details --> Export and save the certificate as certificate.txt
  • Create a ~/certificate.txt file on the server where you are installing the image harvester, paste the certificate text into the file, and save it.
  • cd /usr/lib/jvm/java-6-sun/jre/lib/security
  • keytool -import -keystore cacerts -file ~/certificate.txt
    • Default password is “changeit”
  • Type “yes” and Enter to import certificate
  • Verify the private root certificate has been added by executing “keytool -list -v -keystore cacerts”

Application Directory Structure

  • ../example-images/
    • image.sh - is the source directory for getting images. It is populated from the Gator1 database using JMS and ActiveMQ
    • fullImages - contains full images required by VIVO
    • thumbnails - contains thumbnail images required by VIVO
    • system.properties - contains the server URL, username/password, and ActiveMQ queue name
    • backup - contains images that need to be harvested in the next harvester run
    • data - contains data that is used by the harvester
    • images - the input to the image harvester
    • logs - contains the logs of each harvester run
    • upload - contains images that need to be uploaded into VIVO. upload contains the following subdirectories:
      • mainImages - contains full-sized images of the people that are to be ingested into VIVO
      • thumbnails - contains thumbnails of the people that are to be ingested into VIVO
    • other configuration files that are used are:
      • diff-subtractions.config.xml
      • image-to-vivo.xsl
      • model.xml
      • raw-records.config.xml
      • run-image.sh
      • xsltranslator.config.xml
      • harvested-data.model.xml
      • score-data.model.xml
      • translated-records.config.xml
      • vivo.model.xml
      • diff-additions.config.xml
      • images
      • match-roles.config.xml
      • previous-harvest.model.xml
      • score-people.config.xml
      • ufids.txt
      • vivo.Override.xml
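A quick way to confirm that an installation matches the layout listed above is to test for each directory. This is a sketch only: the helper name check_layout is hypothetical, and it assumes all of the listed directories are expected to exist at the time of the check, which may not hold before the first harvest.

```shell
# Hypothetical helper: report which of the directories listed above are
# missing under the given example-images path. Returns 1 if any is missing.
check_layout() {
    local base="$1" missing=0 d
    for d in fullImages thumbnails backup data images logs \
             upload/mainImages upload/thumbnails; do
        if [ ! -d "$base/$d" ]; then
            echo "missing: $d"
            missing=1
        fi
    done
    return "$missing"
}

# Example usage:
# check_layout /usr/share/vivo/harvester/example-scripts/example-images
```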

Analytics

  • Total number of people in VIVO

    SELECT count(?person)
    WHERE
    {
    ?person rdf:type foaf:Person .
    }

  • Total number of people with UFIDs in VIVO

    SELECT count(?URI)
    WHERE
    {
    ?URI rdf:type foaf:Person .
    ?URI ufVivo:ufid ?UFID .
    }

  • Total number of people without UFIDs in VIVO

    SELECT count(?u)
    WHERE
    {
    ?u rdf:type foaf:Person .
    OPTIONAL {?u ufVivo:ufid ?y . }
    FILTER (!bound(?y))
    }
