...
- In DSpace 6x, Tomcat does not properly restart after a statistics shard has been created.
- Jira: DS-3457
- This behavior has been verified at 3 instances (Georgetown, UCLA and ?). Tom Desair has volunteered to try to reproduce this error.
- In DSpace 5x and 6x, the owningComm field is corrupted by the sharding process
- Jira: DS-3436
- Tom Desair has created a fix: https://github.com/DSpace/DSpace/pull/1613
- While testing this fix, many of the other issues listed here have been discovered.
...
- The shard process requires statistics records from a prior calendar year to be present.
- Proposal: Ensure that the statistics import/export tools allow for the creation of records from a prior year.
- See "Statistics Import/Export Tool Issues"
- Once the shard process has been run for records from a calendar year, the process cannot be re-run.
- Proposal: Allow the sharding process to append records into an existing shard (rather than failing)
- Jira: DS-3458
- DSpace 5x PR: https://github.com/DSpace/DSpace/pull/1625
- A DSpace 6x PR cannot be tested until DS-3457 is resolved
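The append proposal can be illustrated with a minimal Python sketch. This is not DSpace code. It assumes one shard per calendar year, named `statistics-YYYY`, and models each shard as a plain list. The names `shard_name` and `shard_records` are hypothetical.

```python
from datetime import datetime, timezone


def shard_name(timestamp: datetime) -> str:
    """Assumed naming scheme: one shard per calendar year (UTC)."""
    return f"statistics-{timestamp.astimezone(timezone.utc).year}"


def shard_records(records, shards, current_year):
    """Move records from prior years into per-year shards.

    Records are appended to a shard that already exists instead of
    failing, so the process can be re-run for the same year.
    Returns the records that stay in the main repository.
    """
    remaining = []
    for record in records:
        year = record["time"].astimezone(timezone.utc).year
        if year < current_year:
            # setdefault creates the shard on first use and reuses it afterwards.
            shards.setdefault(shard_name(record["time"]), []).append(record)
        else:
            remaining.append(record)
    return remaining
```

Running the function a second time with more records from the same year extends the existing shard rather than raising an error, which is the behavior the proposal asks for.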
Statistics Import/Export Tool Issues
- The import tool often fails when attempting to import records due to _version issues.
- Proposal: Make the import process more tolerant during record ingest
- DSpace 5x PR: https://github.com/DSpace/DSpace/pull/1623/files
- DSpace 6x PR: https://github.com/DSpace/DSpace/pull/1624/files
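One way to make ingest more tolerant, sketched in Python, is to drop the version column from an exported CSV before importing it. This assumes the failures come from Solr's `_version_` field, which Solr uses for optimistic concurrency. That cause is an assumption here and is not taken from the linked PRs. The function name `strip_version_field` is hypothetical.

```python
import csv
import io


def strip_version_field(csv_text: str, field: str = "_version_") -> str:
    """Return the CSV text with the given column removed.

    Assumption: the exported records carry a `_version_` column that
    conflicts with the target repository on import.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    kept = [name for name in (reader.fieldnames or []) if name != field]
    out = io.StringIO()
    # extrasaction="ignore" silently drops the removed column from each row.
    writer = csv.DictWriter(
        out, fieldnames=kept, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return out.getvalue()
```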
- Make solr-import-statistics, solr-export-statistics, and solr-reindex-statistics easier to use
- Issues
- The import and export tools always assume that the main statistics repo is being processed, making it difficult to process an individual shard.
- The import tool often fails when attempting to import records due to _version issues.
- Error messages from these tools are confusing.
- The export and re-index tools often fail due to the presence of existing export files.
- Proposed Changes
- Proposal: Do not require the "-i statistics" parameter. Instead, use "-i statistics" as the default when no "-i" parameter is given.
- Jira: DS-3456
- DSpace 5x PR: https://github.com/DSpace/DSpace/pull/1623/files
- DSpace 6x PR: https://github.com/DSpace/DSpace/pull/1624/files
- Proposal: Make the import process more tolerant during record ingest
- Proposal: Make import/export failure messages more explicit. Include the repository, the export file, and the reason for failure in error and log messages.
- Jira: DS-3456
- DSpace 5x PR: https://github.com/DSpace/DSpace/pull/1623/files
- DSpace 6x PR: https://github.com/DSpace/DSpace/pull/1624/files
- Proposal: Add a command line option allowing export files to be overwritten on export
- DSpace 5x PR: https://github.com/DSpace/DSpace/pull/1623/files
- DSpace 6x PR: https://github.com/DSpace/DSpace/pull/1624/files
- Proposal: Add a command line option allowing export files to be overwritten on re-index
- Pull Requests
- DSpace 5x PR: https://github.com/DSpace/DSpace/pull/1623/files
- DSpace 6x PR: https://github.com/DSpace/DSpace/pull/1624/files
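The proposed changes above (a default for "-i", an overwrite option, and more explicit failure messages) can be sketched together in Python. This is a stand-in for illustration, not the DSpace command line. The flag name `--overwrite`, the program name `export-sketch`, and the names `ExportError`, `resolve_indexes`, and `check_export_target` are all hypothetical.

```python
import argparse
import os


class ExportError(Exception):
    """Failure whose message names the repository, the file, and the reason."""


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="export-sketch")
    # "-i" may be repeated; default None avoids argparse's shared-list pitfall.
    parser.add_argument("-i", dest="indexes", action="append", default=None)
    # Hypothetical flag name, chosen for illustration only.
    parser.add_argument("--overwrite", action="store_true")
    return parser


def resolve_indexes(args: argparse.Namespace) -> list:
    """Fall back to the main statistics repo when no "-i" was given."""
    return args.indexes if args.indexes else ["statistics"]


def check_export_target(index: str, path: str, overwrite: bool) -> None:
    """Refuse to replace an existing export file unless overwrite is set."""
    if os.path.exists(path) and not overwrite:
        raise ExportError(
            f"export of repository '{index}' failed: "
            f"file '{path}' already exists (pass --overwrite to replace it)"
        )
```

With this shape, a shard can be targeted explicitly by passing its name to "-i", while callers who omit the parameter still process the main statistics repo.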
- When sharding, the destination repo name is off by one calendar year
- Jira: DS-3437
- Note that this issue has been found at Georgetown. Tom Desair could not reproduce this issue.
- solr-reindex-statistics does not work for a shard
- Jira: DS-3464