JIRA Reference: https://jira.duraspace.org/browse/DS-456

Proposed: The original JIRA ticket advocates creating easy upgrade scripts, likely in Java. This came up during the DSpace Developers meeting on Jan 6, 2010, and the feature was selected as an important one by repository administrators and managers from the EIFL network (in spring 2011).

Initial assessment: Relevant to the community; Medium-High or High priority

  • If you agree that this feature is important and have no additional comments, you can simply respond with a +1.
  • If you disagree but have no comments, a -1 works, and if you have no opinion at all, 0 is fine. (And encouraged, since that means we know you've had a chance to weigh in.)
  • If you do have comments or other ideas, you're not limited to the numbers, of course. So please do share your thoughts!

23 Comments

  1. +1

    This issue is difficult because DSpace (officially?) supports both Linux and Windows, as well as both PostgreSQL and Oracle as the database. Writing installer scripts could be easier if done only for the most popular combinations (for example, Linux/PostgreSQL).

  2. +1

    Update (from the DCAT Meeting Notes July 12, 2011)

    1. no developer assigned yet
    2. propose writing scripts for the most common database/operating sys
      1. Database: PostgreSQL 264, MySQL 43, Oracle 39, Other 6
      2. Operating Sys: Linux 224, Microsoft Windows 88, UNIX 28, Solaris 22, HPUX 2
    3. Michael to attach scripts to JIRA issue that they have developed
    4. hold community-wide discussion/solicitation of other upgrade scripts already written or interested parties
  3. 0

    I hate to admit it but this is technically over my head.

  4. +1. I asked my DSpace developer, Jim Halliday, about this and he jumped at it as very important. He said that any 'simplification to the upgrade process would be greatly appreciated and not just for us!'

  5. Please have a look and edit/comment. Thanks!

    [Draft announcement] Community-wide discussion: easy upgrade scripts

    Via lists:

    Should we also use this page to collect information about the upgrade scripts available or should we set up a separate wiki page for this community-wide discussion?

    Dear DSpace developers and users,

    There is a general understanding within our community that any simplification to the upgrade process would be greatly appreciated.

    Has any of you written or come across any easy upgrade scripts? Please let us know.

    If there are no solutions available yet, would any of you volunteer to create easy upgrade scripts (e.g. 'dspace upgrade 1.7 1.8'), likely in Java? We can start writing scripts for the most common database/operating system which in our case is PostgreSQL/Linux.

    Number of users from DSpace Registry (http://www.dspace.org/whos-using-dspace/Repository-List.html)

    • Database: PostgreSQL 300, MySQL 58, Oracle 39, Other 9
    • Operating System Platform: Linux 224, Microsoft Windows 88, UNIX 28, Solaris 22, HPUX 2

    Thank you for your input!

    DCAT

  6. Here are the latest DSpace upgrade instructions:

    https://wiki.duraspace.org/display/DSDOC18/Upgrading+From+1.7.x+to+1.8.x

    A quick eagle's-eye (or ostrich's-eye, as I haven't actually done an upgrade myself) view of what could possibly be automated further:

    • (Diagnostic script): something that compares your DSpace to a standard installation and highlights anything that is out of the ordinary after which it can give you warnings if some upgrades won't work or might cause difficulties.
    • Creation of a full backup: a script that makes a full backup of DB, asset store, configuration and customizations. At the moment, you need different commands for each of them.
      • This should not be too hard, given that customizations are put in the right place. (When done properly, all of your customized code would be in a version control system like SVN, CVS or Git anyway, so there would be no de facto additional need for backups.)
      • If you have a big asset store, putting a full backup somewhere might require a lot of additional space. Script needs to check first if such space is available.
    • Downloading a new DSpace version
      • Can this be automated with something like apt-get on Linux?
    • Merging customizations
      • Can get tricky if customizations are spread out & not in the right places
      • Shouldn't be an issue with fully standard DSpaces
    • Build DSpace, stop Tomcat, update DSpace
      • Could these all be automated, as they are already scripts right now?
    • Update/Merge DSpace configurations
      • 1.8 saw a very drastic change in how configuration files are split up etc. Therefore it would be a LOT of work to fully automate this for 1.7 -> 1.8 upgrades. It should be a lot easier for 1.9: there are many more individual configuration files, so the chance that many of these files can just be fully re-used in a newer version is a lot bigger. (Does this make sense?)
    • Generate browse & search indexes - deploy webapp - restart servlet container
      • These could possibly be automated as they are already straightforward commands/operations.
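
    On the backup point above (checking first whether enough space is available), a minimal sketch of such a check could look like the following. It is written in Python only for brevity; the function names and the 10% safety margin are assumptions made for illustration, not part of any existing DSpace script.

```python
import os
import shutil


def directory_size(path):
    """Total size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            # Skip symlinks so linked content is not counted.
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total


def has_room_for_backup(source_dir, backup_dir, safety_factor=1.1):
    """Return True if backup_dir's file system can hold a copy of source_dir.

    safety_factor is an assumed margin, not a DSpace setting.
    """
    needed = int(directory_size(source_dir) * safety_factor)
    free = shutil.disk_usage(backup_dir).free
    return free >= needed
```

    A backup script could call has_room_for_backup(asset_store_dir, backup_target_dir) before copying anything and stop with a warning when it returns False.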
    1. I agree that further automation (per Bram's note above) or beginning to develop a "suite of scripts" approach would be a good start. Focusing on common configurations (as Iryna notes) also makes sense, and could provide an incentive driving users towards fewer configurations. It looks like PostgreSQL/Linux would be the first such focus, since that would help the majority of installations we know of.

      (Disclosure: We are among that majority.)

  7. +1. As we run a hosted service, we have to rewrite the scripts anyway, but any simplification of the released scripts would be very welcome to us.

  8. +1 I think this will be really useful, in particular for institutions with few resources in terms of technical staff time. Ideally one script would be made that would enable going through the entire process responding to various prompts for config information - i.e. so simple that a repository manager may be able to do it. I would think that sort of solution would be quite far off though, and for now I think Bram's initial suggestion to create scripts for the various stages of an upgrade sounds like a good way to move forward (although it would be good to confirm which scripts would be possible to create). It wouldn't automate the entire process, but it could simplify it considerably.

    Iryna, I think your message is good, although I have a couple of comments. Firstly, I'd add more details regarding why this would be useful, or has this been covered already in other online discussions? There isn't much detail about this above or on the issue. I am also wondering whether it would be worth including a list of processes or example processes (e.g. Bram's list above or something like that) that people may have automated?

    Also, I was wondering, would it be worth putting some work into making the asset store completely separate from the rest of the DSpace processes, like you would do with the stored content when upgrading on Mac? Or are there any good technical reasons why this cannot be done? Maybe something to explore in a separate ticket?

    Best wishes,
    Elin

    1. Could you elaborate on your last comment, Elin? As far as I know, the asset store is already pretty independent from DSpace and its structure hasn't really changed over the versions. Or do you mean by "dependent" that the names of files & directories need to be consistent between the database & the file system?

  9. +1 on the message and adding a little additional detail as Elin suggests. A question about volunteers, though: does there need to be a sentence to the effect that anyone (with the proper skills, of course) can volunteer, that they don't have to be a committer? I'm just wondering if someone who tends to lurk on the lists would hesitate to volunteer thinking that they're not really active in the development process. It may not be a problem, just wondering.

  10. Dear Bram, Jim, Claire, Elin, Amy, Valorie and Tim,

    Thanks a lot for your input! Below is a modified (very-very draft) version of the announcement that includes all your suggestions and information:

    Community-wide discussion: easy upgrade scripts

    Dear DSpace developers and users,

    There is a general understanding within our community that any simplification to the upgrade process would be greatly appreciated. Easy upgrade scripts will be useful, in particular for institutions with few resources in terms of technical staff time. Ideally one script could be made that would enable going through the entire process responding to various prompts for config information – i.e. so simple that a repository manager may be able to do it.

    We can start with further automation of the latest DSpace upgrade instructions – a "suite of scripts" approach. For example, based on the latest DSpace upgrade instructions: https://wiki.duraspace.org/display/DSDOC18/Upgrading+From+1.7.x+to+1.8.x the following could possibly be automated further:

    • Diagnostic script that compares your DSpace installation with a standard installation and highlights anything that is out of the ordinary after which it can give you warnings if some upgrades won't work or might cause difficulties.
    • Creation of a full backup: a script that makes a full backup of DB, asset store, configuration and customizations. At this moment, you need different commands for all of them.
      • This should not be too hard, given that customizations are put in the right place (when done properly, all of your customized code would be in a version control system like SVN, CVS or Git anyway, so there would be no de facto additional need for backups).
      • If you have a big asset store, putting a full backup somewhere might require a lot of additional space. Script needs to check first if such space is available.
    • Downloading a new DSpace version – can this be automated with something like apt-get on Linux?
    • Merging customizations
      • Can get tricky if customizations are spread & not in the right places
      • Shouldn't be an issue with fully standard DSpaces
    • Build DSpace, stop Tomcat, update DSpace: could these all be automated, as they are already scripts right now?
    • Update/Merge DSpace configurations
      • 1.8 saw a very drastic change in how configuration files are split up etc. Therefore it would be a LOT of work to fully automate this for 1.7 -> 1.8 upgrades. It should be a lot easier for 1.9: there are many more individual configuration files, so the chance that many of these files can just be fully re-used in a newer version is a lot bigger.
    • Generate browse & search indexes - deploy webapp - restart servlet container: these could possibly be automated as they are already straightforward commands/operations.

    This kind of approach wouldn't automate the entire process, but could simplify it considerably.

    Another approach could be through common configurations that could provide an incentive driving users towards fewer configurations.

    Has any of you written or come across any easy upgrade scripts? We know that the Agris (FAO) and Odin (UNESCO-IOC) communities are currently investigating whether they can perform updates/upgrades using their customized Windows-only (.exe) Installer for DSpace (AgriOcean – a customized version of DSpace). Once it is built, this code may be something that could be generalized, at least for Windows-based installations of DSpace (more details are here: https://jira.duraspace.org/browse/DS-456). Do you know about any other similar projects?

    If there are no solutions available yet, would any of you volunteer to create easy upgrade scripts (e.g. 'dspace upgrade 1.7 1.8'), likely in Java? We can start writing scripts for the most common database/operating system which in our case is PostgreSQL/Linux.

    Number of users from DSpace Registry (http://www.dspace.org/whos-using-dspace/Repository-List.html)

    • Database: PostgreSQL 300, MySQL 58, Oracle 39, Other 9
    • Operating System Platform: Linux 224, Microsoft Windows 88, UNIX 28, Solaris 22, HPUX 2

    Anyone (with the proper skills) can volunteer; you don't have to be a DSpace committer.

    Thank you for your input!

    DCAT

    What do you think about this - too long and detailed? Is it more clear now what we are looking for? 

    Thanks again!

    Best wishes,

    Iryna

    1. Hi Iryna,

      For a general announcement, you might want to drop the sub-bullets from the list of scripts. Here is what a trimmed-down version might look like:

      • Diagnostic script that compares your DSpace installation with a standard installation and highlights anything that is out of the ordinary after which it can give you warnings if some upgrades won't work or might cause difficulties.
      • Creation of a full backup of DB, asset store, configurations and customizations
      • Downloading a new DSpace version
      • Merging customizations
      • Build DSpace, stop Tomcat, update DSpace
      • Update/Merge DSpace configurations
      • Generate browse & search indexes - deploy webapp - restart servlet container
      1. thanks Bram! below updated (still very-very draft :)

        Dear DSpace developers and users,

        There is a general understanding within our community that any simplification to the upgrade process would be greatly appreciated. Easy upgrade scripts will be useful, in particular for institutions with few resources in terms of technical staff time. Ideally one script could be made that would enable going through the entire process responding to various prompts for config information – i.e. so simple that a repository manager may be able to do it.

        We can start with further automation of the latest DSpace upgrade instructions – a "suite of scripts" approach. For example, based on the latest DSpace upgrade instructions (https://wiki.duraspace.org/display/DSDOC18/Upgrading+From+1.7.x+to+1.8.x), the following could possibly be automated further:

        • Diagnostic script that compares your DSpace installation with a standard installation and highlights anything that is out of the ordinary after which it can give you warnings if some upgrades won't work or might cause difficulties.
        • Creation of a full backup of DB, asset store, configurations and customizations
        • Downloading a new DSpace version
        • Merging customizations
        • Build DSpace, stop Tomcat, update DSpace
        • Update/Merge DSpace configurations
        • Generate browse & search indexes - deploy webapp - restart servlet container

        This kind of approach wouldn't automate the entire process, but could simplify it considerably.

        Another approach could be through common configurations that could provide an incentive driving users towards fewer configurations.

        Has any of you written or come across any easy upgrade scripts? We know that the Agris (FAO) and Odin (UNESCO-IOC) communities are currently investigating whether they can perform updates/upgrades using their customized Windows-only (.exe) Installer for DSpace (AgriOcean – a customized version of DSpace). Once it is built, this code may be something that could be generalized, at least for Windows-based installations of DSpace (more details are here: https://jira.duraspace.org/browse/DS-456). Do you know about any other similar projects?

        If there are no solutions available yet, would any of you volunteer to create easy upgrade scripts (e.g. 'dspace upgrade 1.7 1.8'), likely in Java? We can start writing scripts for the most common database/operating system which in our case is PostgreSQL/Linux.

        Number of users from DSpace Registry (http://www.dspace.org/whos-using-dspace/Repository-List.html)

        • Database: PostgreSQL 300, MySQL 58, Oracle 39, Other 9
        • Operating System Platform: Linux 224, Microsoft Windows 88, UNIX 28, Solaris 22, HPUX 2

        Anyone (with the proper skills) can volunteer; you don't have to be a DSpace committer.

        Thank you for your input!

        DCAT

  11. Marc Goovaerts, AgriOcean, explained his approach: 

    It is important to look first at our goals with the easy installer. In fact, what users in our community expect is something like a software package that you install and then do not need to bother about anymore. DSpace (and most software of this type) takes another approach: it starts from the source code, and through several steps the code is built into a repository package. We analyzed how we could make this whole build process unnecessary and deliver a finalized package. The limitation is that the final package is very much predefined. For our communities this is, in my opinion, an advantage because of the standardization of the metadata standard and authority control aspects.

    We were not able (yet) to integrate all the different software into one package. People still have to install Java, Tomcat and PostgreSQL separately. The installer uses a pre-built DSpace with an already created database containing authority tables. Some basic configuration values are entered. The pre-built DSpace is put in a folder, and the relations with Tomcat and PostgreSQL are defined. I do not think that this fits the DSpace philosophy, but I think that it can give some ideas.

    We will be confronted with the problem of how to include new functionalities in existing AOD versions. We will have to find out how the user configured PostgreSQL, in which folder they put DSpace and how they linked the webapps to Tomcat. Then it will be easy to upgrade the configuration, change and add files, and adapt the database structure (as long as DSpace keeps to its current approach). Again, I believe that this approach can be of interest to the DSpace community.

    I first thought of including some of this information in the description, but it is probably easier to direct interested people to the webpages at FAO that explain the whole project (see below). But I thought that I had to give you the scope of it first.
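
    As a rough illustration of the step of finding out how PostgreSQL and the DSpace folder were set up on an existing installation, the sketch below reads a properties-style configuration file and picks out the install folder and database settings. It is in Python for brevity; it assumes the settings live under keys such as dspace.dir, db.url and db.username (as in a typical dspace.cfg, but check your own file), and the function names are made up for this example. It does not cover how the webapps are linked to Tomcat.

```python
def read_properties(text):
    """Parse simple 'key = value' lines; ignores comments and blank lines.

    Does not handle backslash line continuations, which Java properties
    files allow.
    """
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def describe_installation(cfg_text):
    """Pick out the settings an upgrade script would need first.

    The key names are assumptions; missing keys come back as None.
    """
    settings = read_properties(cfg_text)
    return {
        "install_dir": settings.get("dspace.dir"),
        "db_url": settings.get("db.url"),
        "db_user": settings.get("db.username"),
    }
```

    An upgrade script could start from the returned dictionary and refuse to continue when any of the values is None.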

  12. Updated text (with Marc's edits). If you don't have any edits/comments, I think we are ready to release this announcement.

    Community-wide discussion: easy upgrade scripts

    To:

    Dear DSpace developers and users,

    There is a general understanding within our community that any simplification to the upgrade process would be greatly appreciated. Easy upgrade scripts will be useful, in particular for institutions with few resources in terms of technical staff time. Ideally one script could be made that would enable going through the entire process responding to various prompts for config information – i.e. so simple that a repository manager may be able to do it. We suggest the following research approaches.

    Strategy 1: We can start with further automation of the latest DSpace upgrade instructions – a "suite of scripts" approach. For example, based on the latest DSpace upgrade instructions (https://wiki.duraspace.org/display/DSDOC18/Upgrading+From+1.7.x+to+1.8.x), the following could possibly be automated further:

    • Diagnostic script that compares your DSpace installation with a standard installation and highlights anything that is out of the ordinary after which it can give you warnings if some upgrades won't work or might cause difficulties.
    • Creation of a full backup of DB, asset store, configurations and customizations.
    • Downloading a new DSpace version.
    • Merging customizations.
    • Build DSpace, stop Tomcat, update DSpace.
    • Update/Merge DSpace configurations.
    • Generate browse & search indexes – deploy webapp – restart servlet container.

    This kind of approach wouldn't automate the entire process, but could simplify it considerably.

    Strategy 2: Another approach could be through common configurations that could provide an incentive driving users towards fewer configurations.

    Current activities: the Agris (FAO) and Odin (UNESCO-IOC) communities are currently investigating whether they can perform updates/upgrades using their customized Windows-only (.exe) Installer for DSpace (AgriOcean – a customized version of DSpace; for more information, go to http://aims.fao.org/agriocean-dspace). Once it is built, this code may be something that could be generalized, at least for Windows-based installations of DSpace (more details are here: https://jira.duraspace.org/browse/DS-456).

    Do you know about any other similar projects?

    Has any of you written or come across any easy upgrade scripts?

    Call for Volunteers: If there are no solutions available yet, would any of you volunteer to create easy upgrade scripts (e.g. 'dspace upgrade 1.7 1.8'), likely in Java? We can start writing scripts for the most common database/operating system which in our case is PostgreSQL/Linux (based on the numbers of users from DSpace Registry (http://www.dspace.org/whos-using-dspace/Repository-List.html): Database: PostgreSQL 306, MySQL 56, Oracle 41, Other 11; Operating System Platform: Linux 224, Microsoft Windows 88, UNIX 28, Solaris 22, HPUX 2).

    Requirements: Anyone (with the skills, expertise or experience) can volunteer; you don't have to be a DSpace committer.

    Thank you for your interest and input!

    DCAT

    1. Iryna -

      I think this msg looks good. I have a few tweaks below.

      I would add a short intro at the beginning to put some context around why DCAT is involved - maybe something like:

      As many of you know, the DSpace Community Advisory Team (DCAT) (https://wiki.duraspace.org/display/cmtygp/DSpace+Community+Advisory+Team) helps review the JIRA issues classified as new feature requests. The goal of the DCAT review is to help move the feature along -- by helping to flesh out requirements or ideas for implementation as well as recruiting developers to work on the request. DCAT generally selects the new feature requests which appear to be of broad interest in the community. DCAT discussions are documented and are always open to the entire community for comment/feedback (https://wiki.duraspace.org/display/dsforum/DCAT+Discussion+Forum). There are some new feature requests that would be very useful to discuss with the entire community on the mailing list rather than just in DCAT.  We are bringing one such request to you today:  Create easy upgrade scripts (https://jira.duraspace.org/browse/DS-456).

      Also, I would recommend inserting the title "Call for Feedback"  right before the question "Do you know about any other similar projects".

  13. Iryna,

    I think the text, with Valorie's suggested changes, is good to go! Well done for bringing this forward!

    Elin

  14. Final version to be sent out today, thank you for all your comments and feedback!

    Community-wide discussion: easy upgrade scripts

    To:

    Dear DSpace developers and users,

    As many of you know, the DSpace Community Advisory Team (DCAT) (https://wiki.duraspace.org/display/cmtygp/DSpace+Community+Advisory+Team) helps review the JIRA issues classified as new feature requests. The goal of the DCAT review is to help move the feature along - by helping to flesh out requirements or ideas for implementation as well as recruiting developers to work on the request. DCAT generally selects the new feature requests which appear to be of broad interest in the community. DCAT discussions are documented and are always open to the entire community for comment/feedback (https://wiki.duraspace.org/display/dsforum/DCAT+Discussion+Forum). There are some new feature requests that would be very useful to discuss with the entire community on the mailing list rather than just in DCAT.  We are bringing one such request to you today: Create easy upgrade scripts (https://jira.duraspace.org/browse/DS-456).

    There is a general understanding within our community that any simplification to the upgrade process would be greatly appreciated. Easy upgrade scripts will be useful, in particular for institutions with few resources in terms of technical staff time. Ideally one script could be made that would enable going through the entire process responding to various prompts for config information – i.e. so simple that a repository manager may be able to do it. We suggest the following research approaches.

    Strategy 1: We can start with further automation of the latest DSpace upgrade instructions – a "suite of scripts" approach. For example, based on the latest DSpace upgrade instructions: (https://wiki.duraspace.org/display/DSDOC18/Upgrading+From+1.7.x+to+1.8.x) the following could possibly be automated further:

    • Diagnostic script that compares your DSpace installation with a standard installation and highlights anything that is out of the ordinary after which it can give you warnings if some upgrades won't work or might cause difficulties.
    • Creation of a full backup of DB, asset store, configurations and customizations.
    • Downloading a new DSpace version.
    • Merging customizations.
    • Build DSpace, stop Tomcat, update DSpace.
    • Update/Merge DSpace configurations.
    • Generate browse & search indexes – deploy webapp – restart servlet container.

    This kind of approach wouldn't automate the entire process, but could simplify it considerably.

    Strategy 2: Another approach could be through common configurations that could provide an incentive driving users towards fewer configurations.

    Current activities: the Agris (FAO) and Odin (UNESCO-IOC) communities are currently investigating whether they can perform updates/upgrades using their customized Windows-only (.exe) Installer for DSpace (AgriOcean – a customized version of DSpace; for more information, go to http://aims.fao.org/agriocean-dspace). Once it is built, this code may be something that could be generalized, at least for Windows-based installations of DSpace (more details are here: https://jira.duraspace.org/browse/DS-456).

    Call for Feedback: Do you know about any other similar projects? Has any of you written or come across any easy upgrade scripts?

    Call for Volunteers: If there are no solutions available yet, would any of you volunteer to create easy upgrade scripts (e.g. 'dspace upgrade 1.7 1.8'), likely in Java? We can start writing scripts for the most common database/operating system which in our case is PostgreSQL/Linux (based on the numbers of users from DSpace Registry (http://www.dspace.org/whos-using-dspace/Repository-List.html): Database: PostgreSQL 310, MySQL 58, Oracle 40, Other 12; Operating System Platform: Linux 224, Microsoft Windows 88, UNIX 28, Solaris 22, HPUX 2).

    Requirements: Anyone (with the skills, expertise or experience) can volunteer; you don't have to be a DSpace committer.

    Thank you for your interest and input!

    DCAT

  15. +1.

    I would give it Highest priority.

    Some comments off the top of my head on Strategy 1.

    1) I agree that it is best to implement it in Java (some simple start scripts may also be necessary for the different operating systems, but with all the logic in Java).

    2) The hardest part is usually migrating the database. So here I would export all essential data to an intermediate format (plain text, JSON, YAML, XML ... whatever is human-readable). For metadata we now have the DIM dissemination/ingestion crosswalks as one option (I used them as an intermediate step during batch import of metadata in other XML formats).

    We need to export things, like:

    • Users
    • Communities/Collections structure (with respective metadata)
    • Communities/Collections' access rights ... etc.
    • Items' metadata
    • Bitstreams can be left as is in [assetstore] and just copied (for now ...)

    After a fresh install of the new DSpace version, import all that stuff using the Data Access Layer API (recreate users, communities, collections, items ...). So no SQL backup/restore. This will give us the flexibility to make the necessary validations during migration, and it decouples us from the DB engine implementation (PostgreSQL versions, Oracle, MySQL ... whatever can be used now and in the future).
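
    The export/validate/import idea in point 2 could be sketched roughly as below. This is Python for brevity (the actual tool would more likely be Java, per point 1); the JSON layout, the format_version field and the create_user/create_community callbacks are all assumptions standing in for whatever API the new DSpace version provides.

```python
import json


def export_repository(users, communities):
    """Serialize users and the community/collection tree to JSON text."""
    snapshot = {"format_version": 1, "users": users, "communities": communities}
    return json.dumps(snapshot, indent=2, sort_keys=True)


def validate_snapshot(snapshot):
    """Return a list of problems; an empty list means it looks importable."""
    problems = []
    seen = set()
    for user in snapshot.get("users", []):
        email = user.get("email")
        if not email:
            problems.append("user without email")
        elif email in seen:
            problems.append("duplicate user: " + email)
        seen.add(email)
    for community in snapshot.get("communities", []):
        if not community.get("name"):
            problems.append("community without name")
    return problems


def import_repository(text, create_user, create_community):
    """Validate the snapshot, then recreate objects through callbacks.

    create_user and create_community are placeholders for the new
    version's API; they are not real DSpace calls.
    """
    snapshot = json.loads(text)
    problems = validate_snapshot(snapshot)
    if problems:
        raise ValueError("; ".join(problems))
    users = snapshot.get("users", [])
    communities = snapshot.get("communities", [])
    for user in users:
        create_user(user)
    for community in communities:
        create_community(community)
    return len(users), len(communities)
```

    Because nothing here touches SQL, the same snapshot could be imported whatever database engine sits behind the new installation, and validation failures surface before anything is created.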

    3) This functionality can be used for Backup/Restore of the repository later.

    Next big question: how to merge users' customizations into DSpace :) Here I don't have a clear idea yet.

    I personally need this feature (backup/restore as I described + update) for my repositories, so I will work on that, but I can't promise any time frames at the moment.

    The Agris (FAO) and Odin (UNESCO-IOC) activity you mention is under evaluation at the moment (I'm responsible for development), and we will change some approaches (we have requests to implement the installer not only for Windows). But we will use export/import of all metadata and structure/users to/from an intermediate format, with no SQL backup/restore procedures (they are too dependent on DB schema changes, version, DB engine etc.). We actually have some plans for this for June 2012.