...

The script can be executed through the DSpace command-line interface. It is also available from the UI, but running it there requires administrative permissions:

Parameters

./dspace solr-core-management
         -m <mode:{export|import}>
         -c <core:{audit|statistics|...}>
         -d <directory>
         [-f <format:{csv|json}>]
         [-t <threads:integer>=1]
         [-s <start-date:yyyy-MM-dd>]
         [-e <end-date:yyyy-MM-dd>]
         [-i <increment:{WEEK|MONTH|YEAR}>]
         [-h]


Parameter Description

 

Parameter         Required  Description
-m, --mode        yes       Operation mode: either export or import.
-c, --core        yes       Name of the Solr core to manage (e.g., statistics, authority, audit).
-d, --directory   yes       Directory where exported data will be stored or imported from.
-f, --format      no        File format for export/import. Supported formats: csv (default) or json.
-t, --threads     no        Number of threads used for parallel processing (default: 1).
-s, --start-date  no        Start date (in yyyy-MM-dd format) for time-based filtering during export.
-e, --end-date    no        End date (in yyyy-MM-dd format) for time-based filtering during export.
-i, --increment   no        Split the export into time-based chunks: WEEK, MONTH, or YEAR (default: MONTH). Useful for very large datasets.
-h, --help        no        Displays help and usage information.


...

Examples

Export Example

./dspace solr-core-management --mode export --core audit --directory /tmp/export --format csv --threads 4 --increment WEEK

This command exports the content of the audit core into the directory /tmp/export, splitting data by weekly increments.
The export is performed in CSV format, using 4 parallel threads for faster processing.

Incremental export is useful when the Solr core contains a large volume of records: exporting weekly chunks, for example, avoids single massive files and allows operations to be resumed after a partial failure.
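When finer control is needed (for example, re-running only a failed window), the same chunking can be scripted explicitly with the -s/-e parameters from the synopsis above. A minimal sketch, assuming GNU date; the dates and paths are illustrative, and the dspace commands are echoed rather than executed so the plan can be reviewed first:

```shell
#!/bin/sh
# Sketch: build one export command per 7-day window between start and end.
# GNU date is assumed; commands are printed, not executed.
start=2024-01-01
end=2024-01-22
cur="$start"
while [ "$(date -d "$cur" +%s)" -lt "$(date -d "$end" +%s)" ]; do
  next=$(date -d "$cur + 7 days" +%F)
  echo "./dspace solr-core-management -m export -c audit -d /tmp/export/$cur -s $cur -e $next"
  cur="$next"
done
```

Each window writes to its own directory, so a failed week can be re-exported without touching the others.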

...

Import Example

./dspace solr-core-management --mode import --core audit --directory /tmp/export --format csv --threads 2

This command imports previously exported data (from /tmp/export) back into the audit Solr core.
It uses 2 threads to parallelize document ingestion and supports the same format used during export (csv or json).

This operation is typically used when:

  • Rebuilding a Solr core after corruption, or for reindexing purposes.

  • Migrating Solr data between environments (e.g., production → test).
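A migration between environments, for instance, can be sketched as an export on the source, a transfer, and an import on the target. The hostname and paths below are placeholders, and the commands are printed for review rather than executed:

```shell
#!/bin/sh
# Illustrative migration sequence (production -> test).
# Hostname and paths are placeholders; nothing is executed here.
SRC_DIR=/tmp/export
TARGET=test-dspace.example.org
plan="./dspace solr-core-management -m export -c audit -d $SRC_DIR -f csv
rsync -a $SRC_DIR/ $TARGET:$SRC_DIR/
ssh $TARGET ./dspace solr-core-management -m import -c audit -d $SRC_DIR -f csv"
echo "$plan"
```

Using the same --format on both sides is required, since the import reads the files the export produced.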

...

Best Practices

  • Stop DSpace activities before performing imports to avoid inconsistencies.
    (Whether this is necessary also depends on the data being exported.)

  • Run exports with multiple threads when working with large datasets to reduce execution time.
    Be aware that multi-threaded execution increases the workload on the Solr installation.

...

Note

The export process is designed to operate over date ranges rather than a single continuous dataset.
This approach serves two purposes:

  1. It makes data more manageable and modular, allowing administrators to back up or transfer only specific time periods (e.g., weekly or monthly exports).

  2. It avoids the need for deep pagination over very large result sets, which would require Solr to maintain an explicit sort order (sort=<sort-field>), significantly increasing memory usage and query time.

By splitting exports into smaller date-based ranges, the process minimizes Solr load, reduces the risk of timeouts, and ensures that each export segment can be completed efficiently even on heavily populated cores.
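Concretely, each date window translates into a bounded Solr filter query rather than a deep-paginated, sorted scan. The request shape is roughly the following (illustrative only; the field name time is an assumption based on DSpace's statistics schema, and the exact parameters the script sends may differ):

```
q=*:*
fq=time:[2024-01-01T00:00:00Z TO 2024-02-01T00:00:00Z]
rows=1000
```

Because each window is bounded by the fq range filter, Solr never has to hold a global sort over the full result set to serve later pages.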