...

  1. Massive migration of data from DSpace to VIVO
    1. Further discussion about solution architecture
      1. Integration of data into elements of the VIVO graph (publication records, researchers, etc.)
    2. Reporting about implementation

Notes

  1. Jose reported his progress on exporting DSpace data to the middleware's internal data structure defined for the massive migration of data from DSpace to VIVO. Packages should be renamed to start with org.vivoweb.dspacevivo.
  2. Options for exporting data should be (see the sketch after these notes):
    1. All DSpace items
    2. DSpace items modified or added within a given time interval (start and end date and time)
    3. DSpace items belonging to one collection or to a list of collections (an item must belong to at least one of the specified collections, but may belong to several)
    4. DSpace items belonging to one community or to a list of communities, i.e. to the collections linked with those communities
    5. DSpace items linked with a researcher or a list of researchers
    6. DSpace items belonging to an organization unit (i.e. linked with its affiliated researchers)
    7. DSpace items linked with a super-publication (journal) or an event (conference)
    8. All DSpace communities
    9. DSpace communities modified or added within a given time interval (start and end date and time)
    10. All DSpace collections
    11. DSpace collections modified or added within a given time interval (start and end date and time)
    12. DSpace items matching a custom SPARQL query
    13. DSpace communities matching a custom SPARQL query
    14. DSpace collections matching a custom SPARQL query
  3. At the end of the first phase, massive migration might be implemented as a console application (a web interface is not mandatory, although it would be nice to have)
  4. Michel reported his progress as well. 
  5. It is important that imported data are really integrated into the existing VIVO graph and linked with existing entities. IDs of VIVO entities might be added to DSpace data, and vice versa (see the second sketch after these notes).
  6. We need a name for the middleware's internal data structure defined for the massive migration of data from DSpace to VIVO
  7. Michel and Jose should think about a timeline for when the implementation of massive migration might be done. After that, the implementation of depositing files from the VIVO UI is needed.
  8. We might move the weekly meeting one hour later (to 8 am Eastern Time).
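
A minimal sketch (hypothetical class and field names, not the agreed middleware API) of how the export options from note 2 might be modelled as a single configuration object in the org.vivoweb.dspacevivo codebase:

package org.vivoweb.dspacevivo.harvest;

import java.time.Instant;
import java.util.List;

/** Scope of a single DSpace export run (one constant per option in note 2). */
enum ExportScope {
    ALL_ITEMS,
    ITEMS_MODIFIED_IN_INTERVAL,
    ITEMS_BY_COLLECTIONS,
    ITEMS_BY_COMMUNITIES,
    ITEMS_BY_RESEARCHERS,
    ITEMS_BY_ORGANIZATION_UNIT,
    ITEMS_BY_JOURNAL_OR_EVENT,
    ALL_COMMUNITIES,
    COMMUNITIES_MODIFIED_IN_INTERVAL,
    ALL_COLLECTIONS,
    COLLECTIONS_MODIFIED_IN_INTERVAL,
    ITEMS_BY_SPARQL_QUERY,
    COMMUNITIES_BY_SPARQL_QUERY,
    COLLECTIONS_BY_SPARQL_QUERY
}

/** Parameters for one harvesting run; only the fields relevant to the chosen scope are used. */
class ExportOptions {
    ExportScope scope;
    Instant modifiedFrom;        // start of the time interval (for the *_MODIFIED_IN_INTERVAL scopes)
    Instant modifiedTo;          // end of the time interval
    List<String> collectionIds;  // one or more collection handles/UUIDs
    List<String> communityIds;   // one or more community handles/UUIDs
    List<String> researcherIds;  // one or more researcher identifiers
    String organizationUnitId;   // organization unit whose affiliated researchers' items are harvested
    String journalOrEventId;     // super-publication (journal) or event (conference) identifier
    String sparqlQuery;          // custom SPARQL query (for the *_BY_SPARQL_QUERY scopes)
}

Such an object could be populated from console-application arguments in the first phase and reused unchanged by a later web interface.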
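
A second minimal sketch, for the cross-linking idea in note 5, assuming Apache Jena is used to build the migrated RDF; the resource and property URIs below are placeholders, not an agreed vocabulary:

package org.vivoweb.dspacevivo.integration;

import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.rdf.model.Property;
import org.apache.jena.rdf.model.Resource;

public class CrossLinkExample {
    public static void main(String[] args) {
        Model model = ModelFactory.createDefaultModel();

        // An existing VIVO publication and the DSpace item being migrated (placeholder URIs).
        Resource vivoPublication = model.createResource("http://vivo.example.org/individual/pub123");
        Resource dspaceItem = model.createResource("http://dspace.example.org/rdf/handle/123456789/42");

        // Placeholder properties recording the link in both directions, so that later
        // modifications on either side can be found and synchronized.
        Property hasDspaceItem = model.createProperty("http://example.org/dspacevivo#hasDspaceItem");
        Property hasVivoEntity = model.createProperty("http://example.org/dspacevivo#hasVivoEntity");

        model.add(vivoPublication, hasDspaceItem, dspaceItem);
        model.add(dspaceItem, hasVivoEntity, vivoPublication);

        // Print the resulting triples as Turtle.
        model.write(System.out, "TURTLE");
    }
}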

See:

  1. Architectural proposition
  2. About DSpace rdfizer
  3. About an Item in DSpace
  4. DSpace-VIVO GitHub Repo

Task List

  •  Dragan Ivanovic to define options for harvesting data from DSpace in a wiki page 
  •  All to think about a name for the middleware's internal data structure defined for the massive migration of data from DSpace to VIVO
  •  Jose to continue working on his implementation in accordance with defined options for harvesting data
  •  Jose to rename packages in his implementation, Michel Héon to do the same for his contributions
  •  Michel Héon to think about stronger integration of migrated data into the VIVO graph, and about how those data might be linked for synchronization in case of modifications
  •  Dragan Ivanovic to consult with Abhishek about moving the meeting one hour later; if Abhishek agrees, Dragan Ivanovic will send a new .ics file.

Previous tasks 

  •  Michel Héon to prepare and share slides for discussion about architecture
  •  Michel Héon and Jose to set up the environment, i.e. to install DSpace 7, DSpace 6, and VIVO 1.12.x; Abhishek to help them with DSpace if necessary

...