This page presents the procedure for migrating data from DSpace to VIVO. It addresses the use case of a read-only VIVO instance used to present the metadata held in DSpace.
- The scenario covers the steps needed to migrate the metadata of two DSpace instances (the DSpace-6 Demo instance and the DSpace-7 Demo instance) to a local VIVO instance.
- At the end of this procedure, the experimenter should have a fully operational VIVO instance containing the metadata harvested from DSpace-6-Demo and DSpace-7-Demo, both of which are available on the web.
- The experimenter will also have the information needed to harvest into VIVO the metadata of any DSpace instance of their choosing, provided it can be harvested from an OAI-PMH endpoint.
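For a DSpace instance of your own choosing, harvesting starts with an OAI-PMH request. The following is a minimal sketch that only builds and prints such a request URL. The helper name `oai_list_records_url` and the base URL are hypothetical placeholders, and the DV-IP scripts may build their requests differently; `oai_dc` is used because every OAI-PMH repository is required to support it.

```bash
#!/usr/bin/env bash
# Build an OAI-PMH ListRecords URL from standard OAI-PMH 2.0 query parameters.
# Hypothetical helper, not part of the DSpace-VIVO scripts.
oai_list_records_url() {
  local base_url="$1"            # OAI-PMH endpoint of the repository
  local prefix="${2:-oai_dc}"    # metadata format; oai_dc is mandatory in OAI-PMH
  # Strip a trailing slash, if any, before appending the query string.
  printf '%s?verb=ListRecords&metadataPrefix=%s\n' "${base_url%/}" "$prefix"
}

# Placeholder endpoint: replace with the OAI-PMH URL of your DSpace instance.
OAI_BASE="https://dspace.example.org/oai/request"
oai_list_records_url "$OAI_BASE"

# To fetch the records (requires curl and network access):
#   curl -s "$(oai_list_records_url "$OAI_BASE")" -o records.xml
```

Opening the printed URL in a browser is a quick way to confirm that a repository actually exposes OAI-PMH before attempting a harvest.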
The migration process consists of three main phases, as shown in the image:
- Extraction: harvest the DSpace repository to extract the resource metadata.
- Transformation: map the extracted data to a vocabulary compatible with the VIVO ontological model.
- Insertion: insert the data as triples into the VIVO repository through its SPARQL API.
![](/download/attachments/230828118/esquema.PNG?version=1&modificationDate=1654564086263&api=v2)
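The insertion phase can be pictured as wrapping triples in a SPARQL `INSERT DATA` request. The sketch below only builds the update string. The helper name, the graph URI and the sample triple are made up for illustration, and the commented `curl` call assumes that VIVO exposes its SPARQL Update API at `/api/sparqlUpdate` with `email`, `password` and `update` form parameters; check this against the documentation of your VIVO version.

```bash
#!/usr/bin/env bash
# Wrap N-Triples read from stdin in a SPARQL INSERT DATA statement.
# Hypothetical helper; the DV-IP scripts may proceed differently.
build_insert_data() {
  local graph="$1"
  printf 'INSERT DATA { GRAPH <%s> {\n' "$graph"
  cat                      # copy the triples from stdin unchanged
  printf '} }\n'
}

# Illustrative triple and graph URI (both invented for this example).
TRIPLE='<http://example.org/item/1> <http://www.w3.org/2000/01/rdf-schema#label> "Sample item" .'
UPDATE="$(printf '%s\n' "$TRIPLE" | build_insert_data "http://example.org/graph/dspace")"
printf '%s\n' "$UPDATE"

# Sending it to the local VIVO (assumed endpoint and parameter names):
#   curl -s -d "email=admin@vivo.org" -d "password=$VIVO_PASSWORD" \
#        --data-urlencode "update=$UPDATE" \
#        "http://localhost:8080/vivo-dspace/api/sparqlUpdate"
```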
| Title | Var Name | Var Value | Description |
|---|---|---|---|
| Project root directory | DVIP_HOME_PRJ | ~/dspace-vivo-prj | The value is a suggestion |
| Git root directory | GIT_REPO | $DVIP_HOME_PRJ/00-GIT | Directory containing extracted GIT projects |
| Default VIVO login (username - password) | admin@vivo.org | Vivo1234. | To be used to log in as a VIVO administrator |
| Local server URLs | SOLR | http://localhost:8983/solr/#/ | |
| | VIVO | http://localhost:8080/vivo-dspace/ | |
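The two directory variables from the table can be defined in the current shell session as follows. This is a convenience sketch rather than part of the DV-IP scripts; `00-env.sh` may define its own variables, and the project root is only the suggested value.

```bash
# Values taken from the table above; change DVIP_HOME_PRJ if you prefer another location.
export DVIP_HOME_PRJ="$HOME/dspace-vivo-prj"
export GIT_REPO="$DVIP_HOME_PRJ/00-GIT"

echo "Project root: $DVIP_HOME_PRJ"
echo "Git root:     $GIT_REPO"
```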
- JDK 11
- Maven 3.6.3
- Ubuntu Linux
- No Solr or Tomcat instance should be running on the computer
- Linux bash
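A quick way to check the tooling before starting is sketched below. The helper name `check_cmd` is hypothetical, and `git` is included because the procedure clones repositories even though it is not listed above. The script only checks that the commands exist; versions still need to be compared by hand.

```bash
#!/usr/bin/env bash
# Report whether a command is available on the PATH.
check_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "ok:      $1"
  else
    echo "missing: $1"
    return 1
  fi
}

missing=0
for cmd in java mvn git; do
  check_cmd "$cmd" || missing=$((missing + 1))
done
echo "$missing required command(s) missing"

# Versions still need a manual look (JDK 11, Maven 3.6.3):
#   java -version
#   mvn -version
```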
Set up the project and move into the Git root directory:

```bash
mkdir -p ~/dspace-vivo-prj/00-GIT
cd ~/dspace-vivo-prj/00-GIT
```

Retrieve the DV-IP source code:

```bash
git clone https://github.com/vivo-community/DSpace-VIVO
```

Install Solr + Tomcat:

```bash
./DSpace-VIVO/releng/org.vivoweb.dspacevivo.installer/00-INIT/install-tomcat-solr-app.sh
```

Install and compile VIVO:

```bash
./DSpace-VIVO/releng/org.vivoweb.dspacevivo.installer/01-VIVO/vivo-git-clone.sh
./DSpace-VIVO/bundles/org.vivoweb.dspacevivo/script/vivo-compile-and-deploy-for-tomcat.sh
```

Run: start and stop VIVO.

Starting VIVO:

```bash
source ./DSpace-VIVO/bundles/org.vivoweb.dspacevivo/script/00-env.sh
solr-start.sh
tomcat-start.sh
```

To show VIVO in a web browser:

```bash
browse-vivo.sh
```

Stopping VIVO:

```bash
tomcat-stop.sh
solr-stop.sh
```
Install Apache Jena and its associated tools:

```bash
./DSpace-VIVO/releng/org.vivoweb.dspacevivo.installer/99-OTHER_TOOLS/jena-git-clone-and-deploy.sh
```

Compile and install DSpace-VIVO-EXEMPLE and its code libraries:

```bash
./DSpace-VIVO/test/org.vivoweb.dspacevivo.etlexample/script/mvn_install_example.sh
```
The purpose of this step is to validate that the components required by the scenario are correctly installed. Below is a series of commands, shown with their output, so that you can compare them with the results of your own installation.
Validate that all necessary GIT projects are cloned and properly deployed.

Execute 'ls' from $GIT_REPO:

```bash
$ ls -l
total 24
drwxrwxr-x 6 heon heon 4096 mai 20 14:04 data-format-translator
drwxrwxr-x 7 heon heon 4096 mai 20 11:02 DSpace-VIVO
drwxrwxr-x 9 heon heon 4096 mai 20 11:08 Vitro
drwxrwxr-x 11 heon heon 4096 mai 20 11:08 Vitro-languages
drwxrwxr-x 10 heon heon 4096 mai 20 11:08 VIVO
drwxrwxr-x 11 heon heon 4096 mai 20 11:08 VIVO-languages
```

Execute 'ls' from $GIT_REPO on the deploy directory:

```bash
$ ls -dl ./DSpace-VIVO/deploy/*/
drwxrwxr-x 9 heon heon 4096 mai 20 11:07 ./DSpace-VIVO/deploy/app-solr/
drwxrwxr-x 9 heon heon 4096 mai 20 11:07 ./DSpace-VIVO/deploy/app-tomcat/
drwxrwxr-x 2 heon heon 4096 mai 20 14:05 ./DSpace-VIVO/deploy/lib/
drwxrwxr-x 7 heon heon 4096 mai 20 14:04 ./DSpace-VIVO/deploy/translator/
drwxrwxr-x 9 heon heon 4096 mai 20 11:13 ./DSpace-VIVO/deploy/vivo-home/
```

Test the utilities to make sure they are working.

Set up the environment variables in your session (from $GIT_REPO):

```bash
$ source DSpace-VIVO/bundles/org.vivoweb.dspacevivo/script/00-env.sh
```

Validate Solr:

```bash
$ solr-start.sh
Waiting up to 180 seconds to see Solr running on port 8983 [|]
Started Solr server on port 8983 (pid=1741315). Happy searching!
$ solr-status.sh
Found 1 Solr nodes:
Solr process 56366 running on port 8983
{
"solr_home":"xxxxxxx/00-GIT/DSpace-VIVO/deploy/app-solr/server/solr",
"version":"8.11.1 0b002b11819df70783e83ef36b42ed1223c14b50 - janhoy - 2021-12-14 13:50:55",
"startTime":"2022-05-19T15:15:10.534Z",
"uptime":"0 days, 17 hours, 25 minutes, 10 seconds",
"memory":"151 MB (%29.5) of 512 MB"}
```

Validate Tomcat:

```bash
$ tomcat-start.sh
Using CATALINA_BASE: xxxxxxx/00-GIT/DSpace-VIVO/deploy/app-tomcat
Using CATALINA_HOME: xxxxxxx/00-GIT/DSpace-VIVO/deploy/app-tomcat
Using CATALINA_TMPDIR: xxxxxxx/00-GIT/DSpace-VIVO/deploy/app-tomcat/temp
Using JRE_HOME: /opt/jdk-11.0.9
Using CLASSPATH: xxxxxxx/00-GIT/DSpace-VIVO/deploy/app-tomcat/bin/tomcat-juli.jar
Using CATALINA_OPTS:
Tomcat started.
```

Test Apache Jena:

```bash
$ sparql -version 2>/dev/null
Jena: VERSION: 3.17.0
Jena: BUILD_DATE: 2020-11-25T19:40:23+0000
```
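The directory check above can also be scripted. The sketch below lists which of the expected project directories are missing under the Git root; the function name `missing_projects` is hypothetical, the directory names are those shown in the 'ls' listing above, and the default location is the suggested project root.

```bash
#!/usr/bin/env bash
# Print the expected project directories that are missing under a root directory.
missing_projects() {
  local root="$1" name
  for name in data-format-translator DSpace-VIVO Vitro Vitro-languages VIVO VIVO-languages; do
    [ -d "$root/$name" ] || echo "$name"
  done
}

# GIT_REPO falls back to the suggested location if it is not already set.
GIT_REPO="${GIT_REPO:-$HOME/dspace-vivo-prj/00-GIT}"
missing="$(missing_projects "$GIT_REPO")"
if [ -z "$missing" ]; then
  echo "All expected projects found in $GIT_REPO"
else
  echo "Missing in $GIT_REPO:"
  echo "$missing"
fi
```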
- This scenario reads DSpace Items from the DSpace 6 and DSpace 7 demonstration sites.
- To complete the extraction in a reasonable time, the data harvesting parameters are preset to import 5 Items per demonstration site, for a total of 10 Items.
- Make sure Solr and Tomcat are running.
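One way to confirm that both services are up is to test their ports, as in the sketch below. It assumes the ports from the local server URLs (8983 for Solr, 8080 for Tomcat) and relies on bash's `/dev/tcp` redirection, so it must be run with bash; the function name `port_is_open` is hypothetical.

```bash
#!/usr/bin/env bash
# Succeed if something accepts TCP connections on localhost:<port>.
# Uses bash's /dev/tcp redirection; the subshell keeps the descriptor local.
port_is_open() {
  (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null
}

for entry in "Solr:8983" "Tomcat:8080"; do
  name="${entry%%:*}"    # text before the colon
  port="${entry##*:}"    # text after the colon
  if port_is_open "$port"; then
    echo "$name is listening on port $port"
  else
    echo "$name is NOT listening on port $port"
  fi
done
```

An open port only shows that some process is listening; `solr-status.sh` remains the more specific check for Solr.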
Run this command from $GIT_REPO to start the migration:

```bash
./DSpace-VIVO/test/org.vivoweb.dspacevivo.etlexample/script/ETL-migration-DSpace-VIVO.sh
```
![](/download/attachments/230828118/image2022-5-20_16-9-9.png?version=1&modificationDate=1653073749465&api=v2)