Harvester

Match.java takes a model generated by Score and renames matches, creates links, or removes literals based on the associated scores.

...

| Short Option | Long Option | Parameter Value Map | Description | Required |
|---|---|---|---|---|
| i | inputJena-config | CONFIG_FILE | inputJena JENA configuration filename | true |
| I | inputOverride | | override the JENA_PARAM of inputJena jena model config using VALUE | false |
| o | output-config | CONFIG_FILE | outputConfig JENA configuration filename | true |
| V | vivoOverride | | override the JENA_PARAM of vivoJena jena model config using VALUE | false |
| s | score-config | CONFIG_FILE | score data JENA configuration filename | true |
| S | scoreOverride | | override the JENA_PARAM of score jena model config using VALUE | false |
| t | threshold | THRESHOLD | match records with a score over THRESHOLD | true |
| l | link | | link the two matched entities together using INPUT_TO_VIVO_PREDICATE and VIVO_TO_INPUT_PREDICATE | false |
| r | rename | | rename or remove the matched entity from scoring | false |
| c | clear-type-and-literals | | clear all rdf:type and literal values out of the nodes matched | false |

Usage

# from the env file
Match="java $OPTS -Dprocess-task=Match org.vivoweb.harvester.score.Match"

# from the script file
SCOREINPUT="-i $H2MODEL -ImodelName=$MODELNAME -IdbUrl=$MODELDBURL -IcheckEmpty=$CHECKEMPTY"
SCOREDATA="-s $H2MODEL -SmodelName=$SCOREDATANAME -SdbUrl=$SCOREDATADBURL -ScheckEmpty=$CHECKEMPTY"
MATCHOUTPUT="-o $H2MODEL -OmodelName=$MATCHEDNAME -OdbUrl=$MATCHEDDBURL -OcheckEmpty=$CHECKEMPTY"
MATCHTHRESHOLD=1.0

$Match $SCOREINPUT $SCOREDATA $MATCHOUTPUT -t $MATCHTHRESHOLD -r -c
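
The override arguments in the usage above take the form JENA_PARAM=VALUE (for example modelName or dbUrl) and replace the matching value from the configuration file. The following is a minimal sketch of that overlay idea only, not the Harvester's own argument handling; the class name OverrideSketch, the method applyOverrides, and the sample values are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration; not part of the Harvester code base.
class OverrideSketch {

    /**
     * Returns a copy of the config-file values with each JENA_PARAM=VALUE
     * override applied on top. The input map is left unchanged.
     */
    public static Map<String, String> applyOverrides(Map<String, String> fromConfigFile,
                                                     String... overrides) {
        Map<String, String> merged = new LinkedHashMap<>(fromConfigFile);
        for (String override : overrides) {
            // Split on the first '=' only, so values such as JDBC URLs may contain '='.
            int eq = override.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("expected JENA_PARAM=VALUE, got: " + override);
            }
            merged.put(override.substring(0, eq), override.substring(eq + 1));
        }
        return merged;
    }

    public static void main(String[] args) {
        // Assumed sample values standing in for the contents of a config file.
        Map<String, String> config = new LinkedHashMap<>();
        config.put("modelName", "default");
        config.put("dbUrl", "jdbc:h2:store/default");

        // Corresponds to passing -ImodelName=scoreInput on the command line.
        System.out.println(applyOverrides(config, "modelName=scoreInput"));
    }
}
```

Parameters that are not overridden keep the value from the configuration file, which is why the same $H2MODEL file can be reused for the input, score, and output models in the usage above.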

...

The Match class runs the SPARQL query below against the score data. The same query can be adapted to read the score data for other purposes.

# ?sInput = Input URI
# ?sVivo  = Vivo URI

PREFIX scoreValue: <http://vivoweb.org/harvester/scoreValue/>
SELECT DISTINCT ?sVivo ?sInput (sum(?weightValue) AS ?sum)
WHERE {
  ?s scoreValue:InputRes ?sInput . 
  ?s scoreValue:VivoRes ?sVivo .
  ?s scoreValue:hasScoreValue ?value .
  ?value scoreValue:WeightedScore ?weightValue .
}
GROUP BY ?sVivo ?sInput 
HAVING (?sum >= threshold)  # threshold is a placeholder for the THRESHOLD value passed with -t
ORDER BY ?sInput
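
To make the aggregation in the query concrete, here is a minimal plain-Java sketch of the same logic: sum the weighted scores for each (input, VIVO) pair, then keep only the pairs whose sum reaches the threshold. It does not use Jena and is not the Match implementation; the class ScoreSumSketch, the type WeightedScore, and the sample URIs are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of the GROUP BY / sum / HAVING steps of the query.
class ScoreSumSketch {

    /** One weighted score for a candidate pair (assumed shape, for illustration). */
    public static final class WeightedScore {
        final String inputUri;
        final String vivoUri;
        final double weight;

        public WeightedScore(String inputUri, String vivoUri, double weight) {
            this.inputUri = inputUri;
            this.vivoUri = vivoUri;
            this.weight = weight;
        }
    }

    /**
     * Sums the weights per (input, VIVO) pair and drops every pair whose
     * sum is below the threshold, mirroring HAVING (?sum >= threshold).
     */
    public static Map<String, Double> matchesAtOrAbove(List<WeightedScore> scores,
                                                       double threshold) {
        // TreeMap sorts by key; keys start with the input URI, similar to ORDER BY ?sInput.
        Map<String, Double> sums = new TreeMap<>();
        for (WeightedScore s : scores) {
            sums.merge(s.inputUri + " -> " + s.vivoUri, s.weight, Double::sum);
        }
        sums.values().removeIf(sum -> sum < threshold);
        return sums;
    }

    public static void main(String[] args) {
        List<WeightedScore> scores = List.of(
            new WeightedScore("input:person1", "vivo:n100", 0.5),
            new WeightedScore("input:person1", "vivo:n100", 0.5),
            new WeightedScore("input:person2", "vivo:n200", 0.25));
        // With a threshold of 1.0, only the first pair is kept.
        System.out.println(matchesAtOrAbove(scores, 1.0));
    }
}
```

Because the comparison is "greater than or equal", a pair whose weights sum to exactly the threshold is kept, which is what the usage example relies on when it passes a threshold of 1.0.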