...
Step name and description | Commands |
---|
Setting up project |
Code Block |
---|
| mkdir -p ~/dspace-vivo-prj/00-GIT
cd ~/dspace-vivo-prj/00-GIT |
|
Retrieve the DV-IP source code |
Code Block |
---|
| git clone --depth 1 --branch Beta-1.1 https://github.com/vivo-community/DSpace-VIVO |
|
Install Solr + Tomcat |
Code Block |
---|
| ./DSpace-VIVO/releng/org.vivoweb.dspacevivo.installer/00-INIT/install-tomcat-solr-app.sh |
|
Installing/compiling VIVO |
Code Block |
---|
| ./DSpace-VIVO/releng/org.vivoweb.dspacevivo.installer/01-VIVO/vivo-git-clone.sh
./DSpace-VIVO/bundles/org.vivoweb.dspacevivo/script/vivo-compile-and-deploy-for-tomcat.sh |
|
Run - Start/Stop VIVO
|
Code Block |
---|
language | bash |
---|
title | Starting VIVO |
---|
| cd ./DSpace-VIVO/bundles/org.vivoweb.dspacevivo/script
source ./00-env.sh
solr-start.sh
tomcat-start.sh
|
Code Block |
---|
language | bash |
---|
title | To show VIVO in a Web Browser (http://localhost:8080/vivo-dspace/) |
---|
| browse-vivo.sh |
Code Block |
---|
language | bash |
---|
title | For stopping VIVO |
---|
| tomcat-stop.sh
solr-stop.sh
|
|
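Tomcat can take some time to deploy VIVO after `tomcat-start.sh` returns, so it can help to wait until the application responds before opening the browser. The following is a minimal sketch under stated assumptions: `wait_until` is a hypothetical helper that is not part of DSpace-VIVO, and the retry count and delay are arbitrary.

```bash
#!/bin/bash
# wait_until TRIES DELAY CMD [ARGS...]
# Hypothetical helper (not part of DSpace-VIVO): runs CMD until it succeeds
# or TRIES attempts have been made, sleeping DELAY seconds between attempts.
wait_until() {
    local tries=$1 delay=$2 attempt=1
    shift 2
    while [ "$attempt" -le "$tries" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        attempt=$((attempt + 1))
        # Sleep only when another attempt is still to come
        if [ "$attempt" -le "$tries" ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Possible usage with the URL given in the step above (values are assumptions):
# wait_until 30 2 curl -fsS -o /dev/null http://localhost:8080/vivo-dspace/ && browse-vivo.sh
```

The helper takes the probe command as arguments, so the same function could be reused to wait for Solr or any other service.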
Setting up the necessary resources for running DSpace
...
Installing the migration utilities
...
Step name and description | Commands |
---|
Validate that the OS contains all the necessary commands to run the dspace2vivo scripts |
Code Block |
---|
language | bash |
---|
title | Run the script that validates that the required applications are installed |
---|
| ./DSpace-VIVO/releng/org.vivoweb.dspacevivo.installer/99-OTHER_TOOLS/validate-syscmd-config.sh
|
Code Block |
---|
language | bash |
---|
title | Result summary |
---|
| adduser ok!
ant ok!
as ok!
at ok!
awk ok!
basename ok!
bash ok!
cat ok!
chmod ok!
chown ok!
chroot ok!
clear ok!
convert ok!
cp ok!
curl ok!
cut ok!
... |
Code Block |
---|
language | bash |
---|
title | List the applications that still need to be installed |
---|
| ./DSpace-VIVO/releng/org.vivoweb.dspacevivo.installer/99-OTHER_TOOLS/validate-syscmd-config.sh | grep NOT |
|
To identify the package that provides a missing application, type the command name on the command line and run the install command suggested by the system |
Code Block |
---|
| $ as
Command 'as' not found, but can be installed with:
sudo apt install binutils
ubuntu@ip-172-22-10-100:~/dspace-vivo-prj/00-GIT$ sudo apt install binutils
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following additional packages will be installed: |
|
Validate that all necessary GIT projects are cloned and properly deployed |
Code Block |
---|
language | bash |
---|
title | Execute 'ls' command from $GIT_REPO |
---|
| ls -l
total 24
drwxrwxr-x 6 heon heon 4096 mai 20 14:04 data-format-translator
drwxrwxr-x 7 heon heon 4096 mai 20 11:02 DSpace-VIVO
drwxrwxr-x 9 heon heon 4096 mai 20 11:08 Vitro
drwxrwxr-x 11 heon heon 4096 mai 20 11:08 Vitro-languages
drwxrwxr-x 10 heon heon 4096 mai 20 11:08 VIVO
drwxrwxr-x 11 heon heon 4096 mai 20 11:08 VIVO-languages |
Code Block |
---|
language | bash |
---|
title | Execute 'ls' in deploy directory |
---|
| ls -dl ./DSpace-VIVO/deploy/*/
drwxrwxr-x 9 heon heon 4096 mai 20 11:07 ./DSpace-VIVO/deploy/app-solr/
drwxrwxr-x 9 heon heon 4096 mai 20 11:07 ./DSpace-VIVO/deploy/app-tomcat/
drwxrwxr-x 2 heon heon 4096 mai 20 14:05 ./DSpace-VIVO/deploy/lib/
drwxrwxr-x 7 heon heon 4096 mai 20 14:04 ./DSpace-VIVO/deploy/translator/
drwxrwxr-x 9 heon heon 4096 mai 20 11:13 ./DSpace-VIVO/deploy/vivo-home/ |
|
Test the utilities to make sure they are working |
Code Block |
---|
language | bash |
---|
title | Setting up environment variables in your session (From $GIT_REPO) |
---|
| source ./DSpace-VIVO/bundles/org.vivoweb.dspacevivo/script/00-env.sh |
Code Block |
---|
language | bash |
---|
title | Validate Solr |
---|
| solr-start.sh
Waiting up to 180 seconds to see Solr running on port 8983 [|]
Started Solr server on port 8983 (pid=1741315). Happy searching!
solr-status.sh
Found 1 Solr nodes:
Solr process 56366 running on port 8983
{
"solr_home":"xxxxxxx/00-GIT/DSpace-VIVO/deploy/app-solr/server/solr",
"version":"8.11.1 0b002b11819df70783e83ef36b42ed1223c14b50 - janhoy - 2021-12-14 13:50:55",
"startTime":"2022-05-19T15:15:10.534Z",
"uptime":"0 days, 17 hours, 25 minutes, 10 seconds",
"memory":"151 MB (%29.5) of 512 MB"} |
Code Block |
---|
language | bash |
---|
title | Validate Tomcat |
---|
| tomcat-start.sh
Using CATALINA_BASE: xxxxxxx/00-GIT/DSpace-VIVO/deploy/app-tomcat
Using CATALINA_HOME: xxxxxxx/00-GIT/DSpace-VIVO/deploy/app-tomcat
Using CATALINA_TMPDIR: xxxxxxx/00-GIT/DSpace-VIVO/deploy/app-tomcat/temp
Using JRE_HOME: /opt/jdk-11.0.9
Using CLASSPATH: xxxxxxx/00-GIT/DSpace-VIVO/deploy/app-tomcat/bin/tomcat-juli.jar
Using CATALINA_OPTS:
Tomcat started. |
Code Block |
---|
language | bash |
---|
title | Test Apache-Jena |
---|
| sparql -version 2>/dev/null
Jena: VERSION: 3.17.0
Jena: BUILD_DATE: 2020-11-25T19:40:23+0000 |
|
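The command validation step above reports each required command as `ok!` or as missing. The following is a minimal sketch of how such a check can be written with `command -v`. It is an illustration only: `check_cmds` is a hypothetical function, the real `validate-syscmd-config.sh` may be implemented differently, and the `NOT found` wording is an assumption chosen so that `grep NOT` selects the missing commands.

```bash
#!/bin/bash
# check_cmds CMD...
# Hypothetical stand-in for illustration; not the actual
# validate-syscmd-config.sh. Prints "<cmd> ok!" for each command found on
# PATH and "<cmd> NOT found" otherwise (the latter wording is an assumption).
# Returns non-zero if at least one command is missing.
check_cmds() {
    local missing=0 cmd
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            echo "$cmd ok!"
        else
            echo "$cmd NOT found"
            missing=1
        fi
    done
    return "$missing"
}

# Possible usage: list only the commands that still need to be installed
# check_cmds ant awk curl convert | grep NOT
```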
Visual confirmation in your web browser
...
Code Block |
---|
language | bash |
---|
title | Run the migration process |
---|
|
./DSpace-VIVO/test/org.vivoweb.dspacevivo.etlexample/script/mvn_install_example.sh |
...
Code Block |
---|
language | bash |
---|
title | Run the migration process |
---|
|
./DSpace-VIVO/test/org.vivoweb.dspacevivo.etlexample/script/ETL-migration-DSpace-VIVO.sh |
...
This section presents technical details for extending this example to other uses
Script directories
The directory ./DSpace-VIVO/test/org.vivoweb.dspacevivo.etlexample/script/ contains bash scripts specific to the execution of the VIVO population scenario from DemoDSpace-6 and DemoDSpace-7. These scripts can serve as a starting point for designing the scripts needed to harvest proprietary DSpace instances
Script name nomenclature
The list below shows the scripts, whose names follow the prefix-function_name.sh nomenclature
Code Block |
---|
| 00-env.sh flush_data_dspace.sh func-skip-first-line.sh map-document-with-author-to-vivo.sh produce-list-of-persons.sh
clean-all-transformation-directory.sh func-capitalize-each-fisrt-caracter.sh func-sort-list.sh map-expertise-and-item-to-a-person-to-vivo.sh transformation-map-dc_type.sh
ETL-migration-DSpace-VIVO.sh func-clean-begin-ending-whitespace.sh get-vivo-bibo-label.sh map-expertise-to-vivo.sh transform-map-expertise-and-item-to-a-person-to-vivo.sh
extract-dspace6.sh func-encode_string_to_expertise.sh load-data-doc_type-to-vivo.sh map-name-to-vivo-person.sh transform-map-vivo-doc-type.sh
extract-dspace7.sh func-encode_string_to_i18n_lowercase.sh load-data-expertises-to-vivo.sh map-vivo-doc-type.sh transform-map-vivo-expertises.sh
extract-dspace.sh func-encode_string_to_uid.sh load-data-person-expertise-to-vivo.sh mvn_install_example.sh transform-map-vivo-person.sh
flush_data_dspace6.sh func-encode_string_to_vivo-URI.sh load-data-person-to-vivo.sh produce-list-of-expertise.sh
flush_data_dspace7.sh func-remove-brace-to-uri.sh load-data-to-vivo.sh produce-list-of-itemtype.sh |
The table below shows the meaning of the prefixes:
Prefix | Description |
---|
extract- | Script prefixes for the data extraction step |
transform- | Script prefixes for the data transformation step |
load- | Script prefixes for the data loading step |
func- | Generic functions |
map- | Scripts for mapping DSpace data to the VIVO vocabulary. These scripts contain the SPARQL CONSTRUCT queries needed for the mapping |
produce- | Production scripts for the various lists needed for ETL processes |
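The prefix convention in the table above can be applied mechanically, for example to group the scripts by ETL step. The following is a minimal sketch; `etl_script_role` is a hypothetical function that is not part of the repository, and the role labels it prints are assumptions derived from the table.

```bash
#!/bin/bash
# etl_script_role SCRIPT_NAME
# Hypothetical helper (not part of DSpace-VIVO): prints the role of a script
# according to the prefix table. Names with an unlisted prefix, such as
# 00-env.sh, are reported as "other".
etl_script_role() {
    case "$1" in
        extract-*)   echo "extract" ;;
        transform-*) echo "transform" ;;
        load-*)      echo "load" ;;
        func-*)      echo "function" ;;
        map-*)       echo "map" ;;
        produce-*)   echo "produce" ;;
        *)           echo "other" ;;
    esac
}

# Possible usage from the script directory:
# for f in *.sh; do printf '%s\t%s\n' "$(etl_script_role "$f")" "$f"; done | sort
```

Note that the patterns match the prefix up to and including the hyphen, so `transformation-map-dc_type.sh` is reported as `other` rather than `transform`.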
Specific scripts
The directory also contains scripts dedicated to specific actions
mvn_install_example.sh
Script used to compile Java programs
00-env.sh
This file defines the environment variables needed to run the extract/transform/load (ETL) process of dspace2vivo. Each script sources this file
The code block below shows the list and meaning of the environment variables necessary for proper execution of the scripts
Code Block |
---|
language | bash |
---|
title | 00-env.sh content |
---|
linenumbers | true |
---|
collapse | true |
---|
|
#!/bin/bash
###################################################################
# Script Name : 00-env.sh
# Description : This file is used to define the environment variables
# needed to run the extract/transform/load (ETL)
# process of dspace2vivo
# Args :
# Author : Michel Héon PhD
# Institution : Université du Québec à Montréal
# Copyright : Université du Québec à Montréal (c) 2022
# Email : heon.michel@uqam.ca
###################################################################
# Scripts root directory
export LOC_SCRIPT_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" &> /dev/null && pwd -P)"
###################################################################
# Root installation directory of the different dspace2vivo packages
export INSTALLER_DIR=$(cd $LOC_SCRIPT_DIR/../../../releng/org.vivoweb.dspacevivo.installer ; pwd -P)
###################################################################
# Project root variables
source $INSTALLER_DIR/00-env.sh
###################################################################
# Executable and script path needed to run dspace2VIVO
PATH=$LOC_SCRIPT_DIR:$PATH
###################################################################
# Working directory of scripts
export WORKDIR=$(cd $LOC_SCRIPT_DIR/../; pwd -P)
###################################################################
# Directory of resources needed to configure the expected operation of the scripts
export RESSOURCESDIR=$(cd $WORKDIR/src/main/resources ; pwd -P)
###################################################################
# Directory containing the correspondence files between DSpace values and VIVO values
export MAPPING_DATA_DIR=$(cd $RESSOURCESDIR/mapping_data ; pwd -P)
###################################################################
# Resource directories after compilation. This directory is modified at each compilation (Do not edit)
export RESSOURCES_TARGET_DIR=$(cd $WORKDIR/target/classes ; pwd -P)
###################################################################
# Directory containing the queries necessary for the execution of SPARQL
export QUERY_DIR=$(cd $RESSOURCESDIR/query ; pwd -P)
###################################################################
# Repositories containing transient data from the extract/transform/load process
export DATA_DIR=$(cd $WORKDIR/data ; pwd -P)
export DATA_DEMO6_DIR=$(cd $WORKDIR/data_src_dspace6 ; pwd -P)
export DATA_DEMO7_DIR=$(cd $WORKDIR/data_src_dspace7 ; pwd -P)
###################################################################
# Data transition sub-directories for each step of the ETL process
export ETL_DIR_EXTRACT=$DATA_DIR/extract
export ETL_DIR_TRANSFORM=$DATA_DIR/transform
export ETL_DIR_TRANSFORM_DOC_TYPE=$(cd ${ETL_DIR_TRANSFORM}_doc_type ; pwd -P)
export ETL_DIR_TRANSFORM_PERSON=$(cd ${ETL_DIR_TRANSFORM}_person ; pwd -P)
export ETL_DIR_TRANSFORM_EXPERTISES=$(cd ${ETL_DIR_TRANSFORM}_expertises ; pwd -P)
export ETL_DIR_TRANSFORM_PERSON_EXPERTISES=$(cd ${ETL_DIR_TRANSFORM}_person_expertises ; pwd -P)
|
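Several variables in 00-env.sh are computed with the pattern `$(cd DIR ; pwd -P)`. Because the two commands are separated by `;`, `pwd -P` still runs when `cd` fails, so a missing directory yields the current directory instead of an error. The following is a minimal sketch of one possible hardening, not something the original script does; `resolve_dir` is a hypothetical helper name.

```bash
#!/bin/bash
# resolve_dir DIR
# Hypothetical helper (not part of 00-env.sh): prints the physical absolute
# path of DIR, or prints nothing and returns non-zero when DIR cannot be
# entered. The subshell keeps the caller's working directory unchanged.
resolve_dir() {
    ( cd -- "$1" 2>/dev/null && pwd -P )
}

# Possible usage (variable name taken from 00-env.sh; the error handling is
# an assumption). "return" is used because 00-env.sh is sourced, not executed:
# WORKDIR=$(resolve_dir "$LOC_SCRIPT_DIR/..") || { echo "WORKDIR not found" >&2; return 1; }
# export WORKDIR
```

Assigning first and exporting afterwards keeps the exit status of `resolve_dir` visible; with `export VAR=$(...)` the status of `export` would mask it.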
ETL-migration-DSpace-VIVO.sh
This script wraps the function calls that migrate DSpace Demo (6 & 7) data into VIVO. It is the main script of the ETL process
Code Block |
---|
language | bash |
---|
title | ETL-migration-DSpace-VIVO |
---|
linenumbers | true |
---|
collapse | true |
---|
|
#!/bin/bash
###################################################################
# Script Name : ETL-migration-DSpace-VIVO.sh
# Description : This script encapsulates the functions call allowing the migration of DSpace Demo(6&7) data into VIVO
# Args :
# Author : Michel Héon PhD
# Institution : Université du Québec à Montréal
# Copyright : Université du Québec à Montréal (c) 2022
# Email : heon.michel@uqam.ca
###################################################################
export SCRIPT_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" &> /dev/null && pwd -P)"
source $SCRIPT_DIR/00-env.sh
cd $SCRIPT_DIR
###################################################################
# Clean and setup up data directories and properties
cp $RESSOURCESDIR/*.conf $RESSOURCES_TARGET_DIR
flush_data_dspace.sh 2>/dev/null
flush_data_dspace6.sh 2>/dev/null
flush_data_dspace7.sh 2>/dev/null
###################################################################
# Extract dspace(6-7) demo data
./extract-dspace6.sh
./extract-dspace7.sh
cp -r $DATA_DEMO6_DIR/* $DATA_DEMO7_DIR/* $DATA_DIR
###################################################################
# Produce all list
echo run produce-list-of-expertise.sh
produce-list-of-expertise.sh
###########################
echo run produce-list-of-itemtype.sh
produce-list-of-itemtype.sh
###########################
echo run produce-list-of-persons.sh
produce-list-of-persons.sh
###################################################################
# Process transformation and load to VIVO
load-data-to-vivo.sh
transform-map-vivo-doc-type.sh
load-data-doc_type-to-vivo.sh ; vivo-recomputeIndex.sh &
transform-map-vivo-person.sh
load-data-person-to-vivo.sh ; vivo-recomputeIndex.sh &
transform-map-vivo-expertises.sh
load-data-expertises-to-vivo.sh ; vivo-recomputeIndex.sh &
transform-map-expertise-and-item-to-a-person-to-vivo.sh
load-data-person-expertise-to-vivo.sh ; vivo-recomputeIndex.sh
###################################################################
# Done ETL Process
echo "Done!"
|
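The main script announces some steps with `echo run ...` and then calls them, continuing even when a step fails. When adapting it to another DSpace instance, it may be useful to stop at the first failing step. The following is a minimal sketch of that variation; `run_step` is a hypothetical helper, and the original script does not behave this way.

```bash
#!/bin/bash
# run_step CMD [ARGS...]
# Hypothetical helper (not part of ETL-migration-DSpace-VIVO.sh): announces
# a step in the same "run <name>" style as the original script, executes it,
# and reports a failure on stderr with a non-zero return code.
run_step() {
    echo "run $1"
    "$@" || { echo "step failed: $1" >&2; return 1; }
}

# Possible usage, chaining steps so that the first failure stops the chain:
# run_step produce-list-of-expertise.sh \
#     && run_step produce-list-of-itemtype.sh \
#     && run_step produce-list-of-persons.sh
```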
-- End Of Document --