This page describes the procedure for migrating data from DSpace to VIVO. It addresses the use case of a read-only VIVO instance that presents the metadata held in DSpace.
- The scenario walks through the steps needed to migrate the metadata of two DSpace instances (the DSpace-6 Demo instance and the DSpace-7 Demo instance) to a local VIVO instance.
- At the end of this procedure, you should have a fully operational VIVO instance containing the metadata harvested from DSpace-6-Demo and DSpace-7-Demo, both of which are available on the web.
- You will also have the information needed to harvest into VIVO the metadata of any DSpace instance of your choice that exposes an OAI-PMH endpoint.
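Whether a DSpace instance can be harvested depends on its OAI-PMH endpoint answering standard requests. The sketch below only builds the request URLs for the `Identify` and `ListRecords` verbs of the OAI-PMH protocol. The endpoint is the DSpace 7 demo address given as an example later on this page, and the `oai_url` helper is a hypothetical name introduced for illustration.

```shell
# Build an OAI-PMH request URL from an endpoint, a verb and optional extra arguments.
oai_url() {
  local endpoint=$1 verb=$2 extra=${3:-}
  printf '%s?verb=%s%s\n' "$endpoint" "$verb" "${extra:+&$extra}"
}

# Example endpoint taken from the parameter table on this page.
ENDPOINT="https://api7.dspace.org/server/oai/request"

# Identify describes the repository; ListRecords with oai_dc returns Dublin Core records.
oai_url "$ENDPOINT" Identify
oai_url "$ENDPOINT" ListRecords "metadataPrefix=oai_dc"
```

Fetching one of these URLs, for instance with `curl -s "$(oai_url "$ENDPOINT" Identify)"`, should return an XML response if the endpoint is reachable. This assumes `curl` is installed and the demo site is online.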
Useful addresses
Title | Var Name | Var Value | Description |
---|---|---|---|
Project root directory | DVIP_HOME_PRJ | ~/dspace-vivo-prj | Suggested value; another directory can be used |
Git root directory | GIT_REPO | $DVIP_HOME_PRJ/00-GIT | Directory containing the cloned Git projects |
Default VIVO login (username - password) | admin@vivo.org | Vivo1234. | Used to log in as a VIVO administrator |
Local server URL | SOLR | http://localhost:8983/solr/#/ | |
Local server URL | VIVO | http://localhost:8080/vivo-dspace/ | |
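The two directory variables from the table can be defined in a Bash session as sketched below. This is only an illustration: the project's own `00-env.sh` script may set them differently, and the default path is simply the suggested value from the table.

```shell
# Suggested project root from the table; export DVIP_HOME_PRJ beforehand to override it.
export DVIP_HOME_PRJ="${DVIP_HOME_PRJ:-$HOME/dspace-vivo-prj}"
# Directory that will hold the cloned Git projects.
export GIT_REPO="$DVIP_HOME_PRJ/00-GIT"

echo "Project root: $DVIP_HOME_PRJ"
echo "Git root:     $GIT_REPO"
```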
- JDK 11
- Maven 3.6.3
- Ubuntu Linux
- No Solr or Tomcat instance should already be running on the computer
- Bash shell
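The prerequisites above can be checked before starting. The following is a minimal sketch, not part of the DSpace-VIVO project: the helper names `check_tool` and `port_in_use` are hypothetical, and the port test relies on Bash's `/dev/tcp` pseudo-device, so it assumes the script is run with Bash.

```shell
#!/usr/bin/env bash
# Report whether a command-line tool is available on the PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "MISSING: $1"
  fi
}

# Return 0 if something accepts TCP connections on localhost:<port>.
port_in_use() {
  (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null
}

check_tool java
check_tool mvn
check_tool git

# 8983 is the Solr port and 8080 the Tomcat port used on this page.
for port in 8983 8080; do
  if port_in_use "$port"; then
    echo "port $port is already in use: stop the running service first"
  else
    echo "port $port is free"
  fi
done
```

The script only reports what it finds and does not check tool versions. `java -version` and `mvn -version` can be run by hand to confirm JDK 11 and Maven 3.6.3.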
Step name and description | Commands |
---|
Setting up project |
Code Block |
---|
| mkdir -p ~/dspace-vivo-prj/00-GIT
cd ~/dspace-vivo-prj/00-GIT |
|
Retrieve the DV-IP source code |
Code Block |
---|
| git clone https://github.com/vivo-community/DSpace-VIVO |
|
Install Solr + Tomcat |
Code Block |
---|
| ./DSpace-VIVO/releng/org.vivoweb.dspacevivo.installer/00-INIT/install-tomcat-solr-app.sh |
|
Installing/compiling VIVO |
Code Block |
---|
| ./DSpace-VIVO/releng/org.vivoweb.dspacevivo.installer/01-VIVO/vivo-git-clone.sh
./DSpace-VIVO/bundles/org.vivoweb.dspacevivo/script/vivo-compile-and-deploy-for-tomcat.sh |
|
Run - Start/Stop VIVO
|
Code Block |
---|
language | bash |
---|
title | Starting VIVO |
---|
| source ./DSpace-VIVO/bundles/org.vivoweb.dspacevivo/script/00-env.sh
solr-start.sh
tomcat-start.sh
|
Code Block |
---|
language | bash |
---|
title | To show VIVO in a Web Browser |
---|
| browse-vivo.sh |
Code Block |
---|
language | bash |
---|
title | For stopping VIVO |
---|
| tomcat-stop.sh
solr-stop.sh
|
|
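After the start scripts return, Solr and VIVO may still need a moment before they answer requests. The sketch below polls a URL until it responds. The function name `wait_for_url` is hypothetical, and the sketch assumes that `curl` is installed, which the prerequisites do not require.

```shell
# Poll a URL until it answers or the attempts run out.
# Usage: wait_for_url URL [TRIES] [DELAY_SECONDS]
wait_for_url() {
  local url=$1 tries=${2:-30} delay=${3:-2} i=0
  while [ "$i" -lt "$tries" ]; do
    # -s: silent, -f: treat HTTP errors as failures, -o /dev/null: discard the body
    if curl -sf -o /dev/null --max-time 5 "$url"; then
      echo "up: $url"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "no answer from $url after $tries attempts" >&2
  return 1
}
```

For example, `wait_for_url http://localhost:8983/solr/` after `solr-start.sh` and `wait_for_url http://localhost:8080/vivo-dspace/` after `tomcat-start.sh` would block until each service answers or the attempts are exhausted.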
Step name and description | Commands |
---|
Install Apache Jena and its other associated tools |
Code Block |
---|
| ./DSpace-VIVO/releng/org.vivoweb.dspacevivo.installer/99-OTHER_TOOLS/jena-git-clone-and-deploy.sh |
|
Compiling/Installing DSpace-VIVO-EXAMPLE and its code libraries |
Code Block |
---|
| ./DSpace-VIVO/test/org.vivoweb.dspacevivo.etlexample/script/mvn_install_example.sh |
|
The purpose of this step is to validate that the components required by the scenario are correctly installed. The commands below are shown with their expected output so that you can compare them with the results of your own installation.
Step name and description | Commands |
---|
Validate that all necessary GIT projects are cloned and properly deployed |
Code Block |
---|
language | bash |
---|
title | Execute 'ls' command from $GIT_REPO |
---|
| ls -l
total 24
drwxrwxr-x 6 heon heon 4096 mai 20 14:04 data-format-translator
drwxrwxr-x 7 heon heon 4096 mai 20 11:02 DSpace-VIVO
drwxrwxr-x 9 heon heon 4096 mai 20 11:08 Vitro
drwxrwxr-x 11 heon heon 4096 mai 20 11:08 Vitro-languages
drwxrwxr-x 10 heon heon 4096 mai 20 11:08 VIVO
drwxrwxr-x 11 heon heon 4096 mai 20 11:08 VIVO-languages |
Code Block |
---|
language | bash |
---|
title | Execute 'ls' from $GIT_REPO in deploy directory |
---|
| ls -dl ./DSpace-VIVO/deploy/*/
drwxrwxr-x 9 heon heon 4096 mai 20 11:07 ./DSpace-VIVO/deploy/app-solr/
drwxrwxr-x 9 heon heon 4096 mai 20 11:07 ./DSpace-VIVO/deploy/app-tomcat/
drwxrwxr-x 2 heon heon 4096 mai 20 14:05 ./DSpace-VIVO/deploy/lib/
drwxrwxr-x 7 heon heon 4096 mai 20 14:04 ./DSpace-VIVO/deploy/translator/
drwxrwxr-x 9 heon heon 4096 mai 20 11:13 ./DSpace-VIVO/deploy/vivo-home/ |
|
Test the utilities to make sure they are working |
Code Block |
---|
language | bash |
---|
title | Setting up environment variables in your session (From $GIT_REPO) |
---|
| $ source DSpace-VIVO/bundles/org.vivoweb.dspacevivo/script/00-env.sh |
Code Block |
---|
language | bash |
---|
title | Validate Solr |
---|
| $ solr-start.sh
Waiting up to 180 seconds to see Solr running on port 8983 [|]
Started Solr server on port 8983 (pid=1741315). Happy searching!
$ solr-status.sh
Found 1 Solr nodes:
Solr process 56366 running on port 8983
{
"solr_home":"xxxxxxx/00-GIT/DSpace-VIVO/deploy/app-solr/server/solr",
"version":"8.11.1 0b002b11819df70783e83ef36b42ed1223c14b50 - janhoy - 2021-12-14 13:50:55",
"startTime":"2022-05-19T15:15:10.534Z",
"uptime":"0 days, 17 hours, 25 minutes, 10 seconds",
"memory":"151 MB (%29.5) of 512 MB"} |
Code Block |
---|
language | bash |
---|
title | Validate Tomcat |
---|
| $ tomcat-start.sh
Using CATALINA_BASE: xxxxxxx/00-GIT/DSpace-VIVO/deploy/app-tomcat
Using CATALINA_HOME: xxxxxxx/00-GIT/DSpace-VIVO/deploy/app-tomcat
Using CATALINA_TMPDIR: xxxxxxx/00-GIT/DSpace-VIVO/deploy/app-tomcat/temp
Using JRE_HOME: /opt/jdk-11.0.9
Using CLASSPATH: xxxxxxx/00-GIT/DSpace-VIVO/deploy/app-tomcat/bin/tomcat-juli.jar
Using CATALINA_OPTS:
Tomcat started. |
Code Block |
---|
language | bash |
---|
title | Test Apache-Jena |
---|
| $ sparql -version 2>/dev/null
Jena: VERSION: 3.17.0
Jena: BUILD_DATE: 2020-11-25T19:40:23+0000 |
|
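The directory checks above can also be scripted. The sketch below tests for the directory names that appear in the two `ls` listings. The `check_dirs` helper is a hypothetical name, and the fallback path is the suggested project layout from this page.

```shell
# Print every expected directory that is missing under a base directory.
# Returns 0 when all are present, 1 otherwise.
check_dirs() {
  local base=$1 rc=0 dir
  shift
  for dir in "$@"; do
    if [ ! -d "$base/$dir" ]; then
      echo "missing: $base/$dir"
      rc=1
    fi
  done
  return "$rc"
}

# Falls back to the suggested layout when GIT_REPO is not set.
repo=${GIT_REPO:-$HOME/dspace-vivo-prj/00-GIT}

# Directory names taken from the listings above.
if check_dirs "$repo" data-format-translator DSpace-VIVO Vitro Vitro-languages VIVO VIVO-languages; then
  echo "all Git projects present"
fi
if check_dirs "$repo/DSpace-VIVO/deploy" app-solr app-tomcat lib translator vivo-home; then
  echo "all deploy directories present"
fi
```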
- This scenario reads DSpace Items from the DSpace 6 and DSpace 7 demonstration sites.
- To complete the extraction in a reasonable time, the harvesting parameters are preset to import 5 Items per demonstration site, for a total of 10 Items.
The migration process consists of 3 main phases, as shown in the image:
Extraction: Harvest the DSpace repository to extract resource metadata
Transformation: Map the extracted data to a vocabulary compatible with the VIVO ontological model
Insertion: Insert the data as triples into the VIVO repository through its SPARQL API
![](/download/attachments/230828118/esquema.PNG?version=1&modificationDate=1654564086263&api=v2)
Each phase is described below, along with the programs that must be run to complete it.
This phase begins the data migration process from DSpace to the VIVO platform. Make sure that a DSpace instance and a VIVO instance are running. To complete this phase, follow these steps:
- Download the .jar file from the following link: http:link_demo
- Create a config file containing the parameters of the source DSpace instance:
Params | Description | Example |
---|---|---|
type | Type of API used for data extraction. Currently supported: OAI, RESTv7 | OAI |
endpoint | URL of the API. For the OAI type, the address of the OAI endpoint; for the RESTv7 type, the address of the REST API. | https://api7.dspace.org/server/oai/request |
uriPrefix | Prefix of links to the repository. Used to generate valid links back to the source repository. | https://demo7.dspace.org/ |
username | DSpace user with permission to use the REST API (RESTv7 only). | admin |
password | Password of that user (RESTv7 only). | admin |
- Run the Java program, passing it the path of the config file.
Code Block |
---|
language | bash |
---|
title | Run this command from $GIT_REPO for starting migration |
---|
|
java -jar vivo "/path_to_config_file/config_file.config" |
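The config file passed to the command above could look like the following sketch. The file syntax expected by the extraction program is not specified on this page, so a plain `key=value` layout is assumed here, and the file location is hypothetical. The values are the OAI examples from the parameter table. Check the DSpace-VIVO repository for the actual format before relying on this.

```shell
# Hypothetical location for the config file; any path can be used.
CONFIG_FILE="${CONFIG_FILE:-${TMPDIR:-/tmp}/config_file.config}"

# Assumed key=value syntax; values are the OAI examples from the parameter table.
cat > "$CONFIG_FILE" <<'EOF'
type=OAI
endpoint=https://api7.dspace.org/server/oai/request
uriPrefix=https://demo7.dspace.org/
EOF

# username and password apply to the RESTv7 type only, so they are left out here.
echo "wrote $CONFIG_FILE"
```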
- Make sure Solr and Tomcat are running
Code Block |
---|
language | bash |
---|
title | Run this command from $GIT_REPO for starting migration |
---|
|
./DSpace-VIVO/test/org.vivoweb.dspacevivo.etlexample/script/ETL-migration-DSpace-VIVO.sh |
![](/download/attachments/230828118/image2022-5-20_16-9-9.png?version=1&modificationDate=1653073749465&api=v2)
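The migration script automates all three phases. To illustrate what the insertion phase amounts to, the sketch below wraps a triple in a SPARQL `INSERT DATA` update. The graph URI, the sample triple and the `build_insert` helper are hypothetical and used for illustration only; they are not taken from the DSpace-VIVO code.

```shell
# Wrap Turtle-style triples in a SPARQL INSERT DATA update for one named graph.
build_insert() {
  local graph=$1 triples=$2
  printf 'INSERT DATA { GRAPH <%s> { %s } }\n' "$graph" "$triples"
}

# Hypothetical graph and triple, for illustration only.
GRAPH="http://example.org/graph/dspace-import"
TRIPLE='<http://example.org/item/1> <http://www.w3.org/2000/01/rdf-schema#label> "Sample item" .'

UPDATE=$(build_insert "$GRAPH" "$TRIPLE")
echo "$UPDATE"
```

Such an update could then be posted to VIVO, for example with `curl --data-urlencode "email=$VIVO_EMAIL" --data-urlencode "password=$VIVO_PASSWORD" --data-urlencode "update=$UPDATE" http://localhost:8080/vivo-dspace/api/sparqlUpdate`. This assumes that VIVO's SPARQL Update API is available at the `api/sparqlUpdate` path of the local instance and that the account is allowed to use it; `VIVO_EMAIL` and `VIVO_PASSWORD` are placeholder variable names.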