Attendees


Agenda / Notes

This was the second meeting of the DSpace Entities Working Group, and it had only one topic on the agenda:

  • Deep dive into how DSpace-CRIS manages Authors and Author Profiles (Researcher Pages)


This presentation was prepared by Andrea Bollini and answers a set of questions collected in this document:

https://docs.google.com/document/d/1UEX2Tn38Zpz1qlr58cbJKNgymVR8xcPRHcZYHDfvxKA

The questions were grouped into three topics:

  • Data Model
  • Data API
  • Front end


The link for the presentation:
https://www.slideshare.net/4Science/dspacecris-technical-answers-for-the-entities-working-group


Discussion


  • Data Model:
    • The entity Author within DSpace-CRIS is called Researcher Profile or Researcher Page
    • The data model is dynamic and is edited through the user interface or an Excel document
    • DSpace-CRIS uses UUIDs underneath and presents identifiers like "rp0001" for a Researcher Profile ("rp" is the prefix for a researcher profile)
    • The jdyna_values table is similar to the metadatavalue table and has columns that specify the data type being stored
      • Tim: The values are stored in different columns. Are those the only field types?
        The field types in use are String, File, Date, and Links to other entities
      • Tim: Does jdyna need to be modified for new entities?
        There is a Generic Entity for that, and it is linked through jdyna_values. There is no need to change the database for new entities
    • Every first-class entity (Authors, Projects, Organization Unit) has a specific table.
    • All other entities use a generic entity and don't need a specific table
    • It isn't possible to change the database directly for new entity instances; this is enforced by the user interface.
    • The CRIS_DO_TP table ("DO" means Dynamic Object) is the base for dynamic entities
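
The typed-column idea described for jdyna_values can be illustrated with a minimal sketch. This is not the DSpace-CRIS schema or code: the class, its field names, the "ou0001" and "journal0001" identifiers, and the sample data are all hypothetical. It only shows how one generic value table with one column per data type lets a new entity type be stored without any change to the table definition.

```python
from dataclasses import dataclass
from datetime import date
from typing import Any, Dict, List, Optional

# The four field types mentioned in the notes: String, Date, File, Link.
VALUE_TYPES = ("string", "date", "file", "link")


@dataclass
class PropertyValue:
    """One row of a hypothetical typed-value table (loosely inspired by jdyna_values)."""
    owner_id: str        # public identifier of the owning entity, e.g. "rp0001"
    property_name: str
    value_type: str      # tells which of the columns below holds the value
    string_value: Optional[str] = None
    date_value: Optional[date] = None
    file_value: Optional[str] = None   # assumed to be a file path
    link_value: Optional[str] = None   # assumed to be another entity's identifier

    def value(self) -> Any:
        """Read the value from the column selected by value_type."""
        if self.value_type not in VALUE_TYPES:
            raise ValueError(f"unknown value type: {self.value_type}")
        return getattr(self, f"{self.value_type}_value")


def properties_of(rows: List[PropertyValue], owner_id: str) -> Dict[str, Any]:
    """Collect the properties of one entity (single-valued, for simplicity)."""
    return {row.property_name: row.value() for row in rows if row.owner_id == owner_id}


rows = [
    PropertyValue("rp0001", "fullName", "string", string_value="Jane Doe"),
    PropertyValue("rp0001", "birthDate", "date", date_value=date(1980, 1, 1)),
    PropertyValue("rp0001", "affiliation", "link", link_value="ou0001"),
    # A new, dynamically defined entity type reuses the same table as-is.
    PropertyValue("journal0001", "title", "string", string_value="Example Journal"),
]

print(properties_of(rows, "rp0001"))
print(properties_of(rows, "journal0001"))
```

In this sketch the "journal" rows need no new table or column, which mirrors the point made in the meeting that new entities require no database changes.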

  • Data API:
    • DSpace-CRIS uses Hibernate (Java) and works with Oracle and PostgreSQL
    • Data is replicated in Apache Solr, the same strategy as DSpace
    • Are Profiles data types? Can they be assigned dynamically?
      A profile is a dynamic object with a set of definitions.
      For example, a profile could be a Journal with a set of properties or attributes.
    • Each property has a widget
    • Each widget has its own validation, which does not depend on the user interface and ensures that every data entry matches its type. You can't insert a string where a date is expected, for instance
      • Tim: Are attributes and properties the same thing?
        Yes
      • A Profile is a set of properties and definitions
      • A Researcher Profile does not use a generic "Profile"; it already has its own set of definitions
      • A Profile only applies to dynamic objects
    • Permissions
      • DSpace-CRIS doesn't use the same permissions solution as DSpace. Each property has an active/inactive state.
      • How are permissions managed?
        • Permissions are configured by the epersongroup that creates the entity. A field in an entity can be managed by the administrator and by anyone the administrator designates.
      • When a property is inactive, it isn't shown on the public page or in search results.
      • This active/inactive state is very useful for authority fields (to hide them)
      • Each property has a flag that controls whether its data is visible
      • Only the administrator can change attributes or widgets of a Profile
    • REST API
      • DSpace-CRIS uses SOAP web services, but the team plans to abandon them
      • A REST API is planned to replace the SOAP web services
      • The REST API will support CRUD operations for all Entities
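
The widget and visibility points above can be sketched in a few lines. This is an illustration under assumptions, not DSpace-CRIS code: the class names, the validator functions, the ISO date format, and the sample "Journal" profile are all hypothetical. It shows validation attached to the property definition rather than to the user interface, and an active/inactive flag that hides a property from the public view.

```python
from dataclasses import dataclass
from datetime import date, datetime
from typing import Any, Callable, Dict, List


def validate_string(raw: str) -> str:
    """Hypothetical text widget: accepts any non-empty string."""
    if not isinstance(raw, str) or not raw.strip():
        raise ValueError("a non-empty string is required")
    return raw.strip()


def validate_date(raw: str) -> date:
    """Hypothetical date widget: accepts ISO dates only (assumed format)."""
    return datetime.strptime(raw, "%Y-%m-%d").date()  # raises ValueError otherwise


@dataclass
class PropertyDefinition:
    name: str
    widget: Callable[[str], Any]  # validation lives here, not in the UI
    active: bool = True           # inactive properties are hidden from the public


class Profile:
    """A dynamic object: a set of property definitions plus their values."""

    def __init__(self, definitions: List[PropertyDefinition]) -> None:
        self.definitions: Dict[str, PropertyDefinition] = {d.name: d for d in definitions}
        self.values: Dict[str, Any] = {}

    def set(self, name: str, raw: str) -> None:
        # Every entry passes through its widget, whatever the caller is.
        self.values[name] = self.definitions[name].widget(raw)

    def public_view(self) -> Dict[str, Any]:
        # Only active properties appear on public pages and in search results.
        return {n: v for n, v in self.values.items() if self.definitions[n].active}


journal = Profile([
    PropertyDefinition("title", validate_string),
    PropertyDefinition("founded", validate_date),
    PropertyDefinition("authorityKey", validate_string, active=False),
])
journal.set("title", "Example Journal")
journal.set("founded", "2009-01-15")
journal.set("authorityKey", "internal-123")

print(journal.public_view())
```

Calling `journal.set("founded", "not-a-date")` raises `ValueError` in this sketch, which corresponds to the rule that a string cannot be inserted where a date is expected.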

  • Front end/User Interface:
    • DSpace-CRIS also plans to move to the Angular interface
    • DSpace-CRIS also plans to keep the same features when migrating to DSpace 7 (DS7)
    • There are plans to use microformats and Signposting

  • João: Was this presentation useful? Is additional information needed? What are the next steps?
    • Tim: It was useful for understanding what is underneath, although he already had some notion of it.
    • Mark: I like the flexibility
    • Lieven: It's a different approach, but it raises some issues that must be considered, such as the duplicated solution for logging. If a merged solution is adopted, some DSpace aspects must be reviewed. It doesn't make sense to have two options for the same thing.
  • João proposed a hands-on exercise to help the group better understand DSpace-CRIS.
    • Andrea: The user experience will be completely transformed. It doesn't make any sense to make tests on the current version. There is a demo version online that people can use. Data model setup is a very complex procedure, it's trivial to make a new one.
    • It was proposed that 2 or 3 groups work on the configuration setup and the data model, and that Bollini create some exercises.
    • Tim reminded the group that Bollini is part of the DS7 REST team and has limited time.
    • Bollini said that this process needs to go a little slower; preparing this presentation was an additional effort. He proposed creating an online Google Doc to collect questions, followed by a 2-hour hands-on meeting, for which he proposed local setups.
    • Wednesday, 22 November at 15:00 UTC was proposed for the third meeting. It will again be held on Zoom and will be a 2-hour meeting with DSpace-CRIS hands-on work (locally installed).
  • Bollini: It's important to watch older webinars such as the COAR one.




7 Comments

  1. You can watch the recorded meeting video:

     

  2. Does the statement "Data is replicated in Apache Solr, the same strategy as DSpace" mean that author names, ORCID, etc. are stored in the SOLR Authority core when validated (the same as in DSpace v5 and v6)?

    Thanks

    1. Perhaps Andrea Bollini (4Science) is the best person to answer that question. I think Andrea, based on the DSpace-CRIS architecture (https://image.slidesharecdn.com/dspace-cris-tuhh-170927152223/95/dspacecris-technical-level-introduction-5-638.jpg?cb=1506538738), was answering a question about whether entities were searchable. He said that it uses the same strategy as DSpace. In DSpace, the Discovery mechanism is what interacts with Solr.

      1. Paulo Graça is right, the question was related to the capability to search CRIS entities: People, Projects, OrgUnits and any custom-defined entity type (Journal, Events, Awards, etc.)

        emilio lorenzo you can find the answer about the SOLR Authority Core here: https://www.slideshare.net/4Science/dspacecris-technical-level-introduction (slide 90). DSpace-CRIS doesn't use the SOLRAuthority by default, but it can be enabled if you want to use it for some reason. You don't need it for ORCID, as DSpace-CRIS has built-in support for it, and you don't need it for additional vocabularies either, as it is easier to define them as additional CRIS entities

  3. I would prefer (I don't even know if it is possible) a model where the multiple sources for asserting the identity of authors (CRIS, ORCID, VIAF, old authority models, new ones, etc.) can use the common ground of existing DSpace developments (currently the SOLR Authority core), rather than just discontinuing that functionality. Perhaps they could be offered a transition to a new model, not a replacement.

    I feel confident that this solution would be preferable in terms of the credibility and sustainability of DSpace developments.

    1. emilio lorenzo there is not really a concept of author in DSpace, and this is why this WG was founded. It is not the responsibility of the DSpace authority to handle and store such information; the ChoiceAuthority interface, which is what the SOLRAuthority ultimately is, is intended only to query entities managed elsewhere (ORCID: researcher profiles). Please note that the SOLRAuthority is probably not widely used, as it was introduced to provide a light integration with ORCID only for the XMLUI, and it is based on ORCID v1.2 and a use case that is now deprecated by ORCID. It supports extensions, but the idea of using it as the base for researcher profile support in DSpace was explicitly rejected without a prior in-depth comparison with the already existing and used DSpace-CRIS solution. In any case, the SOLRAuthority is also supported in DSpace-CRIS, as it is only an example of a ChoiceAuthority implementation.

      It is easy to migrate data improperly stored in SOLR (I say improperly because SOLR should not be used as the master storage for any information) to a DSpace-CRIS entity, so that it is properly persisted and managed, as well as exposed for integration as a ChoiceAuthority.

      I completely agree with you: it is important to look at what already exists and is used by several institutions around the world, as that means acknowledging the efforts and resources put in by the community over the course of years... and this is exactly what DSpace-CRIS has been since 2009.