
The directory ./DSpace-VIVO/test/org.vivoweb.dspacevivo.etlexample/script/ contains the bash scripts that run the VIVO population scenario from DemoDSpace-6 and DemoDSpace-7. They can serve as a model when designing the scripts needed to harvest proprietary DSpace instances.

Script name nomenclature

The listing below shows the scripts in this directory. Their names are built according to the prefix-function_name.sh nomenclature:

00-env.sh                              flush_data_dspace.sh                     func-skip-first-line.sh                map-document-with-author-to-vivo.sh            produce-list-of-persons.sh
clean-all-transformation-directory.sh  func-capitalize-each-fisrt-caracter.sh   func-sort-list.sh                      map-expertise-and-item-to-a-person-to-vivo.sh  transformation-map-dc_type.sh
ETL-migration-DSpace-VIVO.sh           func-clean-begin-ending-whitespace.sh    get-vivo-bibo-label.sh                 map-expertise-to-vivo.sh                       transform-map-expertise-and-item-to-a-person-to-vivo.sh
extract-dspace6.sh                     func-encode_string_to_expertise.sh       load-data-doc_type-to-vivo.sh          map-name-to-vivo-person.sh                     transform-map-vivo-doc-type.sh
extract-dspace7.sh                     func-encode_string_to_i18n_lowercase.sh  load-data-expertises-to-vivo.sh        map-vivo-doc-type.sh                           transform-map-vivo-expertises.sh
extract-dspace.sh                      func-encode_string_to_uid.sh             load-data-person-expertise-to-vivo.sh  mvn_install_example.sh                         transform-map-vivo-person.sh
flush_data_dspace6.sh                  func-encode_string_to_vivo-URI.sh        load-data-person-to-vivo.sh            produce-list-of-expertise.sh
flush_data_dspace7.sh                  func-remove-brace-to-uri.sh              load-data-to-vivo.sh                   produce-list-of-itemtype.sh

The table below shows the meaning of the prefixes:

Prefix       Description
extract-     Scripts for the data extraction step
transform-   Scripts for the data transformation step
load-        Scripts for the data loading step
func-        Generic functions
map-         Scripts that map DSpace data to the VIVO vocabulary. These scripts contain the SPARQL CONSTRUCT queries needed for the mapping
produce-     Scripts that produce the various lists needed by the ETL processes
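
The prefix convention makes it easy to sort scripts by role. The sketch below is a hypothetical helper (script_category is not part of the repository) that returns the category implied by a file name, assuming only the prefixes listed in the table. Names that match no prefix, such as 00-env.sh, are reported as "other".

```bash
#!/bin/bash
# Hypothetical helper, not part of the repository: report the category
# implied by a script's file name, based on the documented prefixes.
script_category() {
    local name
    name="$(basename "$1")"
    case "$name" in
        extract-*)   echo "extract" ;;
        transform-*) echo "transform" ;;
        load-*)      echo "load" ;;
        func-*)      echo "func" ;;
        map-*)       echo "map" ;;
        produce-*)   echo "produce" ;;
        *)           echo "other" ;;
    esac
}

# Print "<category> <script>" for a few names taken from the listing.
for script in extract-dspace6.sh map-name-to-vivo-person.sh 00-env.sh; do
    printf '%-10s %s\n' "$(script_category "$script")" "$script"
done
```

The match is on the exact prefix followed by a hyphen, so a name such as transformation-map-dc_type.sh falls into "other" rather than "transform".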

Specific scripts

The directory also contains scripts dedicated to specific actions.

mvn_install_example.sh

Script used to compile the Java programs.

00-env.sh

This file defines the environment variables needed to run the extract/transform/load (ETL) process of dspace2vivo. Every script sources this file.
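
As an illustration of how a script might source this file, here is a minimal sketch. It is not taken from the repository: the require_env function and the SCRIPT_DIR variable are hypothetical, and the sketch assumes that 00-env.sh sits in the same directory as the calling script. The variable names WORKDIR, DATA_DIR and ETL_DIR_EXTRACT are those exported by 00-env.sh.

```bash
#!/bin/bash
# Hypothetical script skeleton; only 00-env.sh and the checked variable
# names come from the documentation.

# Locate this script's directory the same way 00-env.sh does.
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" &> /dev/null && pwd -P)"

# Source the shared environment if it sits next to this script (assumption).
if [ -f "$SCRIPT_DIR/00-env.sh" ]; then
    source "$SCRIPT_DIR/00-env.sh"
fi

# require_env NAME...: return 1 if any named variable is unset or empty.
require_env() {
    local name missing=0
    for name in "$@"; do
        if [ -z "${!name:-}" ]; then
            echo "missing environment variable: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example check before running an ETL step.
if require_env WORKDIR DATA_DIR ETL_DIR_EXTRACT; then
    echo "environment ready, extract directory: $ETL_DIR_EXTRACT"
else
    echo "environment incomplete; was 00-env.sh sourced?" >&2
fi
```

Checking the variables up front is a design choice rather than something the original scripts are known to do. It turns a missing or failed source into an explicit message instead of a later failure on an empty path.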

The code block below lists the environment variables that the scripts require, with a comment describing each one.


#!/bin/bash

###################################################################
# Script Name   : 00-env.sh
# Description   : This file is used to define the environment variables 
#                 needed to run the extract/transform/load (ETL) 
#                 process of dspace2vivo
# Args          : 
# Author        : Michel Héon PhD
# Institution   : Université du Québec à Montréal
# Copyright     : Université du Québec à Montréal (c) 2022
# Email         : heon.michel@uqam.ca
###################################################################
# Scripts root directory
export LOC_SCRIPT_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" &> /dev/null && pwd -P)"

###################################################################
# Root installation directory of the different dspace2vivo packages
export INSTALLER_DIR=$(cd $LOC_SCRIPT_DIR/../../../releng/org.vivoweb.dspacevivo.installer ; pwd -P)

###################################################################
# Project root variables
source $INSTALLER_DIR/00-env.sh

###################################################################
# Executable and script path needed to run dspace2VIVO
PATH=$LOC_SCRIPT_DIR:$PATH

###################################################################
# Working directory of scripts
export WORKDIR=$(cd $LOC_SCRIPT_DIR/../; pwd -P)

###################################################################
# Directory of resources needed to configure the expected operation of the scripts
export RESSOURCESDIR=$(cd $WORKDIR/src/main/resources ; pwd -P)

###################################################################
# Directory containing the correspondence files between DSpace values and VIVO values
export MAPPING_DATA_DIR=$(cd $RESSOURCESDIR/mapping_data ; pwd -P)

###################################################################
# Resource directories after compilation. This directory is modified at each compilation (Do not edit)
export RESSOURCES_TARGET_DIR=$(cd $WORKDIR/target/classes ; pwd -P)

###################################################################
# Directory containing the queries necessary for the execution of SPARQL
export QUERY_DIR=$(cd $RESSOURCESDIR/query ; pwd -P)

###################################################################
# Repositories containing transient data from the extract/transform/load process
export DATA_DIR=$(cd $WORKDIR/data ; pwd -P)
export DATA_DEMO6_DIR=$(cd $WORKDIR/data_src_dspace6 ; pwd -P)
export DATA_DEMO7_DIR=$(cd $WORKDIR/data_src_dspace7 ; pwd -P)

###################################################################
# Data transition sub-directories for each step of the ETL process
export ETL_DIR_EXTRACT=$DATA_DIR/extract
export ETL_DIR_TRANSFORM=$DATA_DIR/transform
export ETL_DIR_TRANSFORM_DOC_TYPE=$(cd ${ETL_DIR_TRANSFORM}_doc_type ; pwd -P)
export ETL_DIR_TRANSFORM_PERSON=$(cd ${ETL_DIR_TRANSFORM}_person ; pwd -P)
export ETL_DIR_TRANSFORM_EXPERTISES=$(cd ${ETL_DIR_TRANSFORM}_expertises ; pwd -P)
export ETL_DIR_TRANSFORM_PERSON_EXPERTISES=$(cd ${ETL_DIR_TRANSFORM}_person_expertises ; pwd -P)

...