...
The directory ./DSpace-VIVO/test/org.vivoweb.dspacevivo.etlexample/script/ contains bash scripts specific to the execution of the VIVO population scenario from DemoDSpace-6 and DemoDSpace-7. These scripts can serve as a starting point for designing the scripts needed to harvest proprietary DSpace instances.
Script name nomenclature
The listing below shows the scripts, whose names are built according to the prefix-function_name.sh nomenclature.
Code Block |
---|
00-env.sh flush_data_dspace.sh func-skip-first-line.sh map-document-with-author-to-vivo.sh produce-list-of-persons.sh
clean-all-transformation-directory.sh func-capitalize-each-fisrt-caracter.sh func-sort-list.sh map-expertise-and-item-to-a-person-to-vivo.sh transformation-map-dc_type.sh
ETL-migration-DSpace-VIVO.sh func-clean-begin-ending-whitespace.sh get-vivo-bibo-label.sh map-expertise-to-vivo.sh transform-map-expertise-and-item-to-a-person-to-vivo.sh
extract-dspace6.sh func-encode_string_to_expertise.sh load-data-doc_type-to-vivo.sh map-name-to-vivo-person.sh transform-map-vivo-doc-type.sh
extract-dspace7.sh func-encode_string_to_i18n_lowercase.sh load-data-expertises-to-vivo.sh map-vivo-doc-type.sh transform-map-vivo-expertises.sh
extract-dspace.sh func-encode_string_to_uid.sh load-data-person-expertise-to-vivo.sh mvn_install_example.sh transform-map-vivo-person.sh
flush_data_dspace6.sh func-encode_string_to_vivo-URI.sh load-data-person-to-vivo.sh produce-list-of-expertise.sh
flush_data_dspace7.sh func-remove-brace-to-uri.sh load-data-to-vivo.sh produce-list-of-itemtype.sh |
The table below shows the meaning of the prefixes:
Prefix | Description |
---|---|
extract- | Scripts for the data extraction step |
transform- | Scripts for the data transformation step |
load- | Scripts for the data loading step |
func- | Generic functions |
map- | Scripts for mapping DSpace data to the VIVO vocabulary. These scripts contain the SPARQL CONSTRUCT queries needed for the mapping |
produce- | Scripts that produce the various lists needed by the ETL process |
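The prefix convention above can be sketched as a small classifier. This is only an illustration: the function name `script_role` and the role labels it prints are assumptions introduced here, not part of the repository. Names that match none of the documented prefixes (for example 00-env.sh) fall into a catch-all branch.

```shell
#!/bin/bash
# Hypothetical helper (not part of the repository): print the ETL role
# suggested by the prefix of a script file name.
script_role() {
    local name
    name="$(basename "$1")"
    case "$name" in
        extract-*)   echo "extract" ;;
        transform-*) echo "transform" ;;
        load-*)      echo "load" ;;
        func-*)      echo "function" ;;
        map-*)       echo "mapping" ;;
        produce-*)   echo "list production" ;;
        *)           echo "other" ;;   # no documented prefix
    esac
}

# Classify a few names taken from the listing above.
for script in extract-dspace6.sh map-expertise-to-vivo.sh \
              load-data-person-to-vivo.sh 00-env.sh; do
    printf '%-32s %s\n' "$script" "$(script_role "$script")"
done
```

A `case` statement with glob patterns keeps the mapping readable and makes it easy to add a branch if a new prefix is introduced.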
Specific scripts
The directory also contains scripts dedicated to specific actions:
mvn_install_example.sh
Script used to compile the Java programs.
00-env.sh
This file defines the environment variables needed to run the extract/transform/load (ETL) process of dspace2vivo. Each script sources this file.
The code block below shows the environment variables required for the scripts to run properly, together with their meaning.
Code Block | ||||||||
---|---|---|---|---|---|---|---|---|
| ||||||||
#!/bin/bash
###################################################################
# Script Name   : 00-env.sh
# Description   : This file is used to define the environment variables
#                 needed to run the extract/transform/load (ETL)
#                 process of dspace2vivo
# Args          :
# Author        : Michel Héon PhD
# Institution   : Université du Québec à Montréal
# Copyright     : Université du Québec à Montréal (c) 2022
# Email         : heon.michel@uqam.ca
###################################################################
# Scripts root directory
export LOC_SCRIPT_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" &> /dev/null && pwd -P)"
###################################################################
# Root installation directory of the different dspace2vivo packages
export INSTALLER_DIR=$(cd $LOC_SCRIPT_DIR/../../../releng/org.vivoweb.dspacevivo.installer ; pwd -P)
###################################################################
# Project root variables
source $INSTALLER_DIR/00-env.sh
###################################################################
# Executable and script path needed to run dspace2VIVO
PATH=$LOC_SCRIPT_DIR:$PATH
###################################################################
# Working directory of scripts
export WORKDIR=$(cd $LOC_SCRIPT_DIR/../; pwd -P)
###################################################################
# Directory of resources needed to configure the expected operation of the scripts
export RESSOURCESDIR=$(cd $WORKDIR/src/main/resources ; pwd -P)
###################################################################
# Directory containing the correspondence files between DSpace values and VIVO values
export MAPPING_DATA_DIR=$(cd $RESSOURCESDIR/mapping_data ; pwd -P)
###################################################################
# Resource directories after compilation.
# This directory is modified at each compilation (Do not edit)
export RESSOURCES_TARGET_DIR=$(cd $WORKDIR/target/classes ; pwd -P)
###################################################################
# Directory containing the queries necessary for the execution of SPARQL
export QUERY_DIR=$(cd $RESSOURCESDIR/query ; pwd -P)
###################################################################
# Repositories containing transient data from the extract/transform/load process
export DATA_DIR=$(cd $WORKDIR/data ; pwd -P)
export DATA_DEMO6_DIR=$(cd $WORKDIR/data_src_dspace6 ; pwd -P)
export DATA_DEMO7_DIR=$(cd $WORKDIR/data_src_dspace7 ; pwd -P)
###################################################################
# Data transition sub-directories for each step of the ETL process
export ETL_DIR_EXTRACT=$DATA_DIR/extract
export ETL_DIR_TRANSFORM=$DATA_DIR/transform
export ETL_DIR_TRANSFORM_DOC_TYPE=$(cd ${ETL_DIR_TRANSFORM}_doc_type ; pwd -P)
export ETL_DIR_TRANSFORM_PERSON=$(cd ${ETL_DIR_TRANSFORM}_person ; pwd -P)
export ETL_DIR_TRANSFORM_EXPERTISES=$(cd ${ETL_DIR_TRANSFORM}_expertises ; pwd -P)
export ETL_DIR_TRANSFORM_PERSON_EXPERTISES=$(cd ${ETL_DIR_TRANSFORM}_person_expertises ; pwd -P) |
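Since each script sources 00-env.sh, the pattern can be illustrated with a minimal, self-contained sketch. Everything here is an assumption made for the example: the sandbox directory, the reduced stub of 00-env.sh (only a few of the variables above), and the script name show-extract-dir.sh, which does not exist in the repository. The stub stands in for the real file so that the example runs outside the project tree.

```shell
#!/bin/bash
# Sketch: a step script that sources 00-env.sh and checks a variable it needs.
# The sandbox and both generated files are throwaway stand-ins (assumptions).

sandbox="$(mktemp -d)"
mkdir -p "$sandbox/script" "$sandbox/data/extract"

# Reduced stub of 00-env.sh: only the variables this example relies on.
cat > "$sandbox/script/00-env.sh" <<'EOF'
export LOC_SCRIPT_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" &> /dev/null && pwd -P )"
export WORKDIR="$(cd "$LOC_SCRIPT_DIR/.." && pwd -P)"
export DATA_DIR="$(cd "$WORKDIR/data" && pwd -P)"
export ETL_DIR_EXTRACT="$DATA_DIR/extract"
EOF

# Hypothetical step script: sources the environment file located next to it.
cat > "$sandbox/script/show-extract-dir.sh" <<'EOF'
#!/bin/bash
source "$(dirname "${BASH_SOURCE[0]}")/00-env.sh"
# Stop with an error message if the expected variable is missing.
: "${ETL_DIR_EXTRACT:?ETL_DIR_EXTRACT is not set}"
echo "$ETL_DIR_EXTRACT"
EOF

# Run the step script and capture the directory it resolved.
extract_dir="$(bash "$sandbox/script/show-extract-dir.sh")"
echo "extract directory: $extract_dir"

# Remove the sandbox once the example has run.
rm -rf "$sandbox"
```

Resolving the location of 00-env.sh from `BASH_SOURCE` rather than from the current directory lets the step script be called from anywhere, which matches how LOC_SCRIPT_DIR is computed in the file above.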
...