...

  1. In DSpace 6x, Tomcat does not restart properly after a statistics shard has been created.
    1. Jira: DS-3457 (DuraSpace JIRA)
    2. This behavior has been verified at 3 instances (Georgetown, UCLA and ?). Tom Desair has volunteered to try to reproduce this error.
  2. In DSpace 5x and 6x, the owningComm field is corrupted by the sharding process.
    1. Jira: DS-3436 (DuraSpace JIRA)
    2. Tom Desair has created a fix (https://github.com/DSpace/DSpace/pull/1613). Many of the other issues listed here were discovered while testing this fix.

...

  1. The shard process requires statistics records from a prior calendar year to be present.
    1. Proposal: Ensure that the statistics import/export tools allow for the creation of records from a prior year.
      1. See "Statistics Import/Export Tool Issues"
  2. Once the shard process has been run for records from a calendar year, the process cannot be re-run.
    1. Proposal: Allow the sharding process to append records into an existing shard (rather than failing) 
      1. Jira: DS-3458 (DuraSpace JIRA)
      2. DSpace 5x PR: https://github.com/DSpace/DSpace/pull/1625
      3. A DSpace 6x PR cannot be tested until DS-3457 is resolved
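
The append proposal can be illustrated with a toy in-memory model. This is only a sketch, not the DSpace or Solr implementation: the `shard_name` scheme, the record layout (`uid`, `time`), and the `shard_records` function are assumptions made for illustration. It shows a second run adding late-arriving records from a prior year to a shard that already exists, rather than failing.

```python
from collections import defaultdict
from datetime import datetime, timezone


def shard_name(year, core="statistics"):
    """Hypothetical naming scheme: one shard per calendar year."""
    return f"{core}-{year}"


def shard_records(records, shards, current_year):
    """Move records from prior calendar years into per-year shards.

    `shards` maps shard name -> list of records and may already contain
    shards from an earlier run; new records are appended to them.
    Returns the records that stay in the main core.
    """
    remaining = []
    for record in records:
        year = record["time"].year
        if year < current_year:
            shards[shard_name(year)].append(record)
        else:
            remaining.append(record)
    return remaining


shards = defaultdict(list)

# First run: one record from a prior year, one from the current year.
first_run = [
    {"uid": "a", "time": datetime(2015, 6, 1, tzinfo=timezone.utc)},
    {"uid": "b", "time": datetime(2016, 3, 1, tzinfo=timezone.utc)},
]
main_core = shard_records(first_run, shards, current_year=2016)

# A later import adds another 2015 record; the second run appends it
# to the existing 2015 shard instead of raising an error.
late_import = [{"uid": "c", "time": datetime(2015, 12, 31, tzinfo=timezone.utc)}]
main_core += shard_records(late_import, shards, current_year=2016)
```

After both runs the hypothetical 2015 shard holds records `a` and `c`, and only `b` remains in the main core.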

Statistics Import/Export Tool Issues

  1. The import tool often fails when attempting to import records due to _version issues.
    1. Proposal: Make the import process more tolerant during record ingest
      1. DSpace 5x PR: https://github.com/DSpace/DSpace/pull/1623/files
      2. DSpace 6x PR: https://github.com/DSpace/DSpace/pull/1624/files
  2. Make solr-import-statistics, solr-export-statistics, and solr-reindex-statistics easier to use.
    1. Issues
      1. The import and export tools always assume that the main statistics repo is being processed, making it difficult to process an individual shard.
      2. The import tool often fails when attempting to import records due to _version issues.
      3. Error messages from these tools are confusing.
      4. The export and re-index tools often fail due to the presence of existing export files.
    2. Proposed Changes
      1. Proposal: Do not require a "-i statistics" parameter. Instead, default to "-i statistics" when no "-i" parameter is given.
        1. Jira: DS-3456 (DuraSpace JIRA)
      2. Proposal: Make the import process more tolerant during record ingest.
      3. Proposal: Make import/export failure messages more explicit. Include the repository, the export file, and the reason for failure in error and log messages.
        1. Jira: DS-3456 (DuraSpace JIRA)
      4. Proposal: Add a command line option allowing export files to be overwritten on export.
      5. Proposal: Add a command line option allowing export files to be overwritten on re-index.
    3. Pull Requests
      1. DSpace 5x PR: https://github.com/DSpace/DSpace/pull/1623/files
      2. DSpace 6x PR: https://github.com/DSpace/DSpace/pull/1624/files
  3. When sharding, the destination repo name is off by one calendar year.
    1. Jira: DS-3437 (DuraSpace JIRA)
    2. Note that this issue has been found at Georgetown. Tom Desair could not reproduce it.
  4. solr-reindex-statistics does not work for a shard.
    1. Jira: DS-3464 (DuraSpace JIRA)
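
The command-line proposals (a default for "-i", an overwrite option, and more explicit failure messages) might look roughly like the following. This is a hedged sketch in Python, not the code from the DSpace pull requests: only the "-i" parameter and its "statistics" default come from the proposals; the `--overwrite` and `-d` flags, the export file naming, and all function names are hypothetical.

```python
import argparse
import os


def build_parser():
    parser = argparse.ArgumentParser(prog="solr-export-statistics-sketch")
    # "-i" names the core or shard; fall back to the main core when omitted.
    parser.add_argument("-i", dest="index", default="statistics")
    # Hypothetical flag: the proposal asks for an overwrite option
    # but does not name it.
    parser.add_argument("--overwrite", action="store_true")
    # Hypothetical flag for the export directory.
    parser.add_argument("-d", dest="directory", default=".")
    return parser


def export_path(args):
    """Hypothetical export file naming: <index>_export.csv."""
    return os.path.join(args.directory, f"{args.index}_export.csv")


def check_export_target(args):
    """Return the export path, or fail with an explicit message that names
    the repository, the export file, and the reason for failure."""
    path = export_path(args)
    if os.path.exists(path) and not args.overwrite:
        raise FileExistsError(
            f"Export of '{args.index}' failed: file '{path}' already exists; "
            "rerun with --overwrite to replace it"
        )
    return path


# With no "-i" argument the main statistics core is assumed.
print(build_parser().parse_args([]).index)  # prints "statistics"
```

Passing `-i statistics-2015` (an assumed shard name) would target that shard instead, and an existing export file would produce an error naming the shard and the file unless `--overwrite` is given.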