(See also Jon Corson-Rikert's examples.)
The ingest workflow language was added in August 2008 as a simple way of scripting actions that would otherwise require manual interaction with the Ingest Tools page. At the time, the expectation was that sequences of Ingest Tools actions could be saved as a workflow and edited through the GUI, but this functionality was never implemented.
...
RDF workflow descriptions use the following namespaces:
No Format |
---|
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix w: <http://vitro.mannlib.cornell.edu/ns/vitro/rdfIngestWorkflow#> .
@prefix s: <http://vitro.mannlib.cornell.edu/ns/vitro/0.7/sparql#> .
@prefix vitro: <http://vitro.mannlib.cornell.edu/ns/vitro/0.7#> .
@prefix ex: <http://example.org/myWorkflow#> .
|
...
Each workflow begins with a resource representing that workflow itself:
No Format |
---|
ex:MyFirstWorkflow
    a w:Workflow ;
    rdfs:label "My First Workflow" ;
    w:firstStep ex:step1 .
|
...
A workflow is just a linked list of workflow steps. Each workflow step has a label and a property pointing to the action that is to be performed at that step. This arrangement means that a particular action can be defined once and performed at multiple steps in a workflow.
No Format |
---|
ex:MyFirstWorkflow
    a w:Workflow ;
    rdfs:label "My First Workflow" ;
    w:firstStep ex:step1 .
ex:step1
    a w:WorkflowStep ;
    rdfs:label "Do something first" ;
    w:action ex:someAction ;
    w:nextStep ex:step2 .
ex:step2
    a w:WorkflowStep ;
    rdfs:label "Now do something else" ;
    w:action ex:anotherAction ;
    w:nextStep ex:step3 .
ex:step3
    a w:WorkflowStep ;
    rdfs:label "Do a third thing" ;
    w:action ex:aThirdAction .
|
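The traversal implied by this linked-list structure can be sketched in Python. This is only an illustration, not the Vitro implementation: the dictionaries are a hypothetical stand-in for the RDF above, and the cycle check is an added safeguard that the source does not describe.

```python
# Hypothetical in-memory stand-in for the RDF above: each step maps
# w:action and w:nextStep to plain Python strings.
WORKFLOW = {"firstStep": "ex:step1"}
STEPS = {
    "ex:step1": {"label": "Do something first",
                 "action": "ex:someAction", "nextStep": "ex:step2"},
    "ex:step2": {"label": "Now do something else",
                 "action": "ex:anotherAction", "nextStep": "ex:step3"},
    "ex:step3": {"label": "Do a third thing",
                 "action": "ex:aThirdAction"},  # no nextStep: end of list
}


def actions_in_order(workflow, steps):
    """Follow firstStep, then each nextStep link, collecting the actions."""
    ordered = []
    seen = set()
    current = workflow.get("firstStep")
    while current is not None:
        if current in seen:  # assumption: a cyclic workflow is an error
            raise ValueError(f"cycle detected at {current}")
        seen.add(current)
        step = steps[current]
        ordered.append(step["action"])
        current = step.get("nextStep")
    return ordered


print(actions_in_order(WORKFLOW, STEPS))
```

Because a step only points at its action, two steps may name the same action; in this sketch that action would simply appear twice in the returned list.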
...
There are eight different types of actions, which use different properties to specify their parameters:
No Format |
---|
w:ClearModelAction
w:AddModelsAction
w:SubtractModelsAction
w:ExecuteSparqlConstructAction
w:SmushResourcesAction
w:NameBlankNodesAction
w:SplitPropertyValuesAction
w:ProcessPropertyValueStringsAction
|
Actions work on one or more RDF models, which must be visible to the ingest tools either in the default database or by using the "Connect DB" page. Models are represented as resources of type w:Model:
No Format |
---|
ex:myWorkingModel
    a w:Model ;
    rdfs:label "working model" ;
    w:modelName "working model" .
|
...
When actions are not reused across multiple workflow steps, it can be convenient to describe them using blank nodes. This avoids having to assign a separate URI to each action, and also allows them to be written inline:
No Format |
---|
crw:CreateFullNetIDs
    a w:WorkflowStep ;
    rdfs:label "Create Cornell NetID property"@en-US ;
    w:action [
        a w:SPARQLCONSTRUCTAction ;
        w:sourceModel crw:cheResponseModel ;
        w:destinationModel crw:cheResponseModel ;
        w:sparqlQuery
            [ s:queryStr """ PREFIX che: <http://vitro.mannlib.cornell.edu/ns/ingest/CHE#>
                PREFIX vivo: <http://vivo.library.cornell.edu/ns/0.1#>
                PREFIX fn: <http://www.w3.org/2005/xpath-functions#>
                CONSTRUCT {
                    ?s vivo:CornellemailnetId ?o
                } WHERE {
                    ?s che:cheresponse_Netid ?o
                }""" ]
    ] .
|
...
parameters: w:sourceModel
example:
No Format |
---|
crw:ClearWorkingModel
    a w:ClearModelAction ;
    rdfs:label "clear working model" ;
    w:sourceModel ex:workingModel .
|
...
Multiple source models may be specified for this action by adding additional w:sourceModel statements.
example:
No Format |
---|
crw:CreateFullNetIDs
    a w:WorkflowStep ;
    rdfs:label "Create Cornell NetID property"@en-US ;
    w:action [
        a w:SPARQLCONSTRUCTAction ;
        w:sourceModel crw:cheResponseModel ;
        w:sourceModel crw:anotherModel ;
        w:destinationModel crw:cheResponseModel ;
        w:sparqlQuery
            [ s:queryStr """ PREFIX che: <http://vitro.mannlib.cornell.edu/ns/ingest/CHE#>
                PREFIX vivo: <http://vivo.library.cornell.edu/ns/0.1#>
                PREFIX fn: <http://www.w3.org/2005/xpath-functions#>
                CONSTRUCT {
                    ?s vivo:CornellemailnetId ?o
                } WHERE {
                    ?s che:cheresponse_Netid ?o
                }""" ]
    ] .
|
...
The value of the w:smushOnProperty statement is a literal containing the URI of the property on which to smush, for example:
No Format |
---|
ex:smushAction
    a w:SmushResourcesAction ;
    w:sourceModel ex:model1 ;
    w:destinationModel ex:model2 ;
    w:smushOnProperty "http://www.example.org/ontology/employeeID" .
|
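As a rough illustration of the effect, the following Python sketch assumes that "smushing" means merging all resources that share the same value for the given property into a single resource. That reading, the tuple representation of statements, the `Lit` wrapper, and the choice of the first resource seen as the survivor are all assumptions made for this example; the actual ingest tool may behave differently.

```python
from typing import NamedTuple


class Lit(NamedTuple):
    """Hypothetical wrapper that keeps literal values distinct from URIs."""
    value: str


def smush(triples, smush_on_property):
    """Merge subjects that share a value for smush_on_property (a sketch)."""
    canonical_for_value = {}  # property value -> first subject seen with it
    replacement = {}          # subject -> surviving (canonical) subject
    for s, p, o in triples:
        if p == smush_on_property:
            replacement[s] = canonical_for_value.setdefault(o, s)
    merged = []
    for s, p, o in triples:
        # Rewrite merged resources in both subject and object position.
        triple = (replacement.get(s, s), p, replacement.get(o, o))
        if triple not in merged:
            merged.append(triple)
    return merged


EMPLOYEE_ID = "http://www.example.org/ontology/employeeID"
source = [
    ("ex:personA", EMPLOYEE_ID, Lit("1234")),
    ("ex:personA", "rdfs:label", Lit("Smith, J.")),
    ("ex:personB", EMPLOYEE_ID, Lit("1234")),
    ("ex:personB", "ex:worksIn", "ex:dept1"),
    ("ex:dept1", "ex:hasMember", "ex:personB"),
]
for triple in smush(source, EMPLOYEE_ID):
    print(triple)
```

In this sketch `ex:personB` disappears and its statements are rewritten to refer to `ex:personA`, because both carry the same employee ID.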
...
The value of the w:uriPrefix statement is a literal containing the namespace for the URIs plus the initial non-numeric portion of the local name.
For example,
No Format |
---|
w:uriPrefix "http://example.org/individual/n"
|
will cause the action to rename blank nodes with URIs that look like
No Format |
---|
<http://example.org/individual/n23568>
<http://example.org/individual/n41>
<http://example.org/individual/n9156662>
|
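How such URIs could be produced is sketched below in Python. The source only shows that the result is the prefix followed by a number; drawing that number at random, the upper bound, and the collision check are assumptions made for this illustration, and `name_blank_nodes` is a hypothetical helper rather than part of the ingest tools.

```python
import random


def name_blank_nodes(blank_node_ids, uri_prefix, rng=None, max_n=10_000_000):
    """Map each blank node identifier to a new URI: uri_prefix + number."""
    rng = rng or random.Random()
    assigned = {}  # blank node id -> newly minted URI
    used = set()   # URIs already handed out in this run
    for bnode in blank_node_ids:
        if bnode in assigned:
            continue  # the same blank node always keeps one URI
        while True:
            uri = f"{uri_prefix}{rng.randrange(max_n)}"
            if uri not in used:  # assumption: new URIs must not collide
                break
        used.add(uri)
        assigned[bnode] = uri
    return assigned


mapping = name_blank_nodes(["_:b0", "_:b1", "_:b2"],
                           "http://example.org/individual/n")
for bnode, uri in mapping.items():
    print(bnode, "->", f"<{uri}>")
```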
...
The value of w:originalProperty is a literal containing the URI of the property that holds the delimited values. The value of w:newProperty is a literal containing the URI of the property to be used for each split value.
example:
No Format |
---|
ex:SplitValues
    a w:SplitPropertyValuesAction ;
    w:sourceModel ex:model1 ;
    w:destinationModel ex:model2 ;
    w:originalProperty "http://example.org/ontology/departmentIDs" ;
    w:newProperty "http://example.org/ontology/departmentID" ;
    w:splitRegex "," .
|
This will transform the following RDF in the source model
No Format |
---|
ex:something
    <http://example.org/ontology/departmentIDs> "CHEM, BIO, BIO-PL, FOODS" .
|
into the following RDF in the destination model
No Format |
---|
ex:something
    <http://example.org/ontology/departmentID> "CHEM" ;
    <http://example.org/ontology/departmentID> "BIO" ;
    <http://example.org/ontology/departmentID> "BIO-PL" ;
    <http://example.org/ontology/departmentID> "FOODS" .
|
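The transformation above can be approximated with a few lines of Python. This is a sketch under stated assumptions rather than the tool's code: statements are modelled as plain tuples, `split_property_values` is a hypothetical helper, and whitespace around each piece is stripped so that the result matches the destination statements shown above. Whether the real action trims values, and whether it copies unrelated statements to the destination model, is not stated in the source.

```python
import re


def split_property_values(triples, original_property, new_property,
                          split_regex, trim=True):
    """Emit one new_property statement per delimited piece of each value."""
    out = []
    for s, p, o in triples:
        if p != original_property:
            continue  # this sketch only emits the split statements
        for piece in re.split(split_regex, o):
            value = piece.strip() if trim else piece
            if value:  # skip empty pieces such as a trailing delimiter
                out.append((s, new_property, value))
    return out


source = [
    ("ex:something",
     "http://example.org/ontology/departmentIDs",
     "CHEM, BIO, BIO-PL, FOODS"),
]
for triple in split_property_values(
        source,
        "http://example.org/ontology/departmentIDs",
        "http://example.org/ontology/departmentID",
        ","):
    print(triple)
```

With `trim=False` the pieces after the first would keep their leading space (for example `" BIO"`), which is why the sketch trims by default.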
...
parameters: w:sourceModel, w:destinationModel, w:originalProperty, w:newProperty, w:processorClass, w:processorMethod
example:
No Format |
---|
crw:AppendCornellEdu
    a w:WorkflowStep ;
    rdfs:label "Append @Cornell.edu to net ids"@en-US ;
    w:action [
        a w:ProcessPropertyValueStringsAction ;
        w:sourceModel crw:cheResponseModel ;
        w:destinationModel crw:cheResponseModel ;
        w:originalProperty [
            a w:Literal ;
            w:literalValue "http://vivo.library.cornell.edu/ns/0.1#CornellemailnetId"
        ] ;
        w:newProperty [
            a w:Literal ;
            w:literalValue "http://vivo.library.cornell.edu/ns/0.1#CornellemailnetId"
        ] ;
        w:processorClass [
            a w:Literal ;
            w:literalValue "edu.cornell.mannlib.vitro.bjl23.ingest.hr.HRCornellEmailProcessor"
        ] ;
        w:processorMethod [
            a w:Literal ;
            w:literalValue "process"
        ]
    ] .
|
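The role of w:processorClass and w:processorMethod can be illustrated with a small Python analogue. It assumes that the named method takes one string and returns the processed string, and that the action applies it to every value of the original property. Both points are assumptions, as is the `AppendSuffixProcessor` class, which is hypothetical: the source does not show what `HRCornellEmailProcessor` does, so the suffix here is guessed from the step label alone.

```python
class AppendSuffixProcessor:
    """Hypothetical processor standing in for the class named by
    w:processorClass; not the real HRCornellEmailProcessor."""

    def process(self, value: str) -> str:
        return value + "@cornell.edu"


def process_property_values(triples, original_property, new_property,
                            processor, method_name):
    """Apply processor.<method_name> to each value of original_property."""
    # Look the method up by name, much as reflection would in Java.
    method = getattr(processor, method_name)
    out = []
    for s, p, o in triples:
        if p == original_property:
            out.append((s, new_property, method(o)))
    return out


NETID = "http://vivo.library.cornell.edu/ns/0.1#CornellemailnetId"
source = [("ex:person1", NETID, "abc123")]
result = process_property_values(source, NETID, NETID,
                                 AppendSuffixProcessor(), "process")
print(result)
```

The sketch returns only the newly produced statements; what happens to the original values when the source and destination models are the same, as in the example above, is not specified in the source.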