...
The script can be executed through the DSpace command-line interface (it is also available from the UI, but running it requires administrative permission):
./dspace solr-core-management
-m <mode:{export|import}>
-c <core:{audit|statistics|...}>
-d <directory>
[-f <format:{csv|json}>]
[-t <threads:integer>=1]
[-s <start-date:yyyy-MM-dd>]
[-e <end-date:yyyy-MM-dd>]
[-i <increment:{WEEK|MONTH|YEAR}>]
[-h]
| Parameter | Required | Description |
|---|---|---|
| -m, --mode | ✔ | Operation mode: either export or import. |
| -c, --core | ✔ | Name of the Solr core to manage (e.g., statistics, authority, audit). |
| -d, --directory | ✔ | Directory where exported data will be stored or imported from. |
| -f, --format | | File format for export/import. Supported formats: csv (default) or json. |
| -t, --threads | | Number of threads used for parallel processing (default: 1). |
| -s, --start-date | | Start date (in yyyy-MM-dd format) for time-based filtering during export. |
| -e, --end-date | | End date (in yyyy-MM-dd format) for time-based filtering during export. |
| -i, --increment | | Split the export into time-based chunks: WEEK, MONTH, or YEAR. Useful for very large datasets (default: MONTH). |
| -h, --help | | Displays help and usage information. |
...
./dspace solr-core-management --mode export --core audit --directory /tmp/export --format csv --threads 4 --increment WEEK
This command exports the content of the audit core into the directory /tmp/export, splitting data by weekly increments.
The export is performed in CSV format, using 4 parallel threads for faster processing.
Incremental export is useful when the Solr core contains a large volume of records: exporting in weekly chunks avoids producing a single massive file and allows the operation to be resumed after a partial failure.
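The date filters can be combined with an increment to export only a bounded window. As an illustration (the core, directory, and dates below are example values):

./dspace solr-core-management --mode export --core statistics --directory /tmp/export-2024Q1 --format json --start-date 2024-01-01 --end-date 2024-03-31 --increment MONTH

This restricts the export to the first quarter of 2024 and produces one JSON chunk per month, so a failed month can be re-run on its own without repeating the whole export.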
...
./dspace solr-core-management --mode import --core audit --directory /tmp/export --format csv --threads 2
This command imports previously exported data (from /tmp/export) back into the audit Solr core.
It uses 2 threads to parallelize document ingestion and supports the same format used during export (csv or json).
This operation is typically used for:
- Rebuilding a Solr core after corruption, or for reindexing purposes.
- Migrating Solr data between environments (e.g., production → test), as sketched below.
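A minimal sketch of the migration scenario (the hostname and paths are hypothetical, and any transfer tool such as rsync works equally well):

# On the production host: export the core to a staging directory
./dspace solr-core-management --mode export --core audit --directory /tmp/audit-export --format csv --increment MONTH

# Copy the exported chunks to the test host (hypothetical hostname)
scp -r /tmp/audit-export test-server:/tmp/audit-export

# On the test host: import the chunks into the same core
./dspace solr-core-management --mode import --core audit --directory /tmp/audit-export --format csv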
...
- Stop DSpace activity before performing imports to avoid inconsistencies (this also depends on the data being exported); a minimal sequence is sketched after this list.
- Run exports with multiple threads when working with large datasets to reduce execution time.
- Be aware that multi-threaded execution can increase the load on the Solr installation.
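A minimal sketch of a safe import sequence, assuming a DSpace 7+ layout where Solr runs separately from the servlet container (service names vary by installation):

# Stop the DSpace webapp so no new documents are written during the import
systemctl stop tomcat9

# Solr itself must stay running: the import writes into the live core
./dspace solr-core-management --mode import --core audit --directory /tmp/export --format csv

# Restart the webapp once the import has finished
systemctl start tomcat9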
...
The export process is designed to operate over date ranges rather than a single continuous dataset.
This approach serves two purposes:
It makes data more manageable and modular, allowing administrators to back up or transfer only specific time periods (e.g., weekly or monthly exports).
It avoids the need for deep pagination over very large result sets, which would require Solr to maintain an explicit sort order (sort=<sort-field>), significantly increasing memory usage and query time.
By splitting exports into smaller date-based ranges, the process minimizes Solr load, reduces the risk of timeouts, and ensures that each export segment can be completed efficiently even on heavily populated cores.
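To illustrate the difference, the query below fetches a single weekly slice with a filter query instead of paging deep into one huge sorted result set. This is a hypothetical sketch, assuming the core exposes a date field named time (as the statistics core does); the URL, dates, and row count are illustrative:

# Fetch one weekly slice via a filter query, so Solr never has to
# maintain a deep, fully sorted paging position across the whole core
curl "http://localhost:8983/solr/statistics/select" \
  --data-urlencode "q=*:*" \
  --data-urlencode "fq=time:[2024-01-01T00:00:00Z TO 2024-01-08T00:00:00Z]" \
  --data-urlencode "rows=10000"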