See Trading reviews on Pull Requests for how to get immediate attention to your PR!

Notes

DSpace-DSpace-CRIS Merger

  • Leadership met on Nov 19 and voted on the merger of DSpace and DSpace-CRIS. A public announcement of the result is expected in the next few days.

Provenance and Audit Trail System Overlap

The majority of the meeting focused on whether the proposed pull requests should enhance:

  • The provenance metadata field, or
  • The new audit trail system

Context

Several PRs were submitted to extend provenance tracking. These overlapped with a larger audit trail feature already in development, which led to confusion and the need for alignment.

Background

Audit Trail is for: 

  • System actions occurring within DSpace (machine-recorded)

  • Tracking what happened, when, and by whom, while an item resides inside the repository

Provenance is for: 

  • Major history or chain-of-custody information that should travel outside of DSpace

  • Information that may pre-date or post-date DSpace storage

  • Metadata that should “outlive the system” if the item is exported to another repository

Concerns

Risk of Duplication & Confusion:

  • Several participants worried that having both systems track the same information would confuse administrators and future developers.

Storage Bloat:

  • Writing too many detailed system-level events into provenance could produce very large metadata fields that offer little value if the information already exists in the audit trail.

Audit Trail Should Be Automated:

  • Audit trails should not be manually editable – otherwise auditors would distrust them. Provenance can contain manually added entries when needed.

GDPR / Privacy Implications:

  • Tracking additional information may have GDPR implications and must be considered carefully.

Proposed Resolution Path

Working Definition Moving Forward:

  • Keep major curated, historically important object changes in provenance.

  • Keep day-to-day system events in the audit trail.

Examples of what may belong where:

Belongs in Provenance          Belongs in Audit Trail
Permanent rights changes       Metadata edits
Major embargo changes          Bitstream changes
High-level change summaries
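
To make the working definition concrete, here is a minimal Python sketch of how events might be routed between the two systems. It is illustrative only and does not reflect DSpace code: the event names, the `Destination` enum, and the `route_event` function are all hypothetical, and the fallback to the audit trail for unlisted events is an assumption rather than something agreed in the meeting.

```python
from enum import Enum


class Destination(Enum):
    PROVENANCE = "provenance"
    AUDIT_TRAIL = "audit_trail"


# Hypothetical event names; the mapping mirrors a few of the examples above.
ROUTING = {
    "permanent_rights_change": Destination.PROVENANCE,
    "major_embargo_change": Destination.PROVENANCE,
    "metadata_edit": Destination.AUDIT_TRAIL,
    "bitstream_change": Destination.AUDIT_TRAIL,
}


def route_event(event_type: str) -> Destination:
    """Return the system in which an event should be recorded.

    Assumption: event types not listed in ROUTING are treated as
    day-to-day system events and sent to the audit trail.
    """
    return ROUTING.get(event_type, Destination.AUDIT_TRAIL)
```

A single lookup table like this would give the written guidance one obvious place to live, so that each new event type is classified once rather than in every PR.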

Next Actions

  • Tim Donohue will create a ticket outlining the working rules for what belongs in each system. 

  • Community may consult with external curation experts (e.g., DCAT) for professional guidance on what archival standards recommend.

Pull Requests and Implementation Notes

  • Some functionality (metadata changes, bitstream changes) is already tracked in the audit trail, while other functionality (e.g., item collection transfer, certain policy changes) would need to be added.

  • PR authors may be asked to reframe their contributions to use the audit trail framework instead of provenance, where appropriate. 

  • Where overlapping work already exists, PRs should be cross-linked to avoid isolated development.

GitHub Cleanup – Automatically Marking Stale Issues

  • Tim Donohue proposed enabling GitHub automation to:

    • Flag PRs as stale after a long period of inactivity

    • Automatically close them 2 weeks after flagging if no action is taken

  • High-priority items would never be automatically closed

  • Some PRs and tickets have been open for nearly 10 years, and this system would help with backlog grooming

  • The group briefly reviewed and discussed https://github.com/DSpace/DSpace/pull/11550
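
The proposed flag-then-close behaviour can be sketched as a small decision function. This is a hedged Python illustration of the logic only, not the configuration in the PR above. The 2-week close window comes from the notes; the one-year inactivity threshold, the `high priority` label name, the exemption from flagging as well as closing, and the `next_action` function are assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable, Optional

STALE_AFTER = timedelta(days=365)      # assumption: the notes only say "long periods"
CLOSE_AFTER_FLAG = timedelta(weeks=2)  # from the notes: close 2 weeks after flagging
EXEMPT_LABELS = {"high priority"}      # hypothetical label name


def next_action(
    last_activity: datetime,
    flagged_at: Optional[datetime],
    labels: Iterable[str],
    now: datetime,
) -> str:
    """Return 'none', 'flag', 'unflag', or 'close' for one PR or issue."""
    # High-priority items are never closed automatically; this sketch
    # assumes they are not flagged either.
    if EXEMPT_LABELS & set(labels):
        return "none"
    if flagged_at is None:
        return "flag" if now - last_activity >= STALE_AFTER else "none"
    # Any activity after the flag was applied counts as "action taken".
    if last_activity > flagged_at:
        return "unflag"
    return "close" if now - flagged_at >= CLOSE_AFTER_FLAG else "none"


if __name__ == "__main__":
    now = datetime(2025, 1, 1, tzinfo=timezone.utc)
    print(next_action(now - timedelta(days=400), None, [], now))
```

Returning `unflag` when activity follows the flag keeps the item eligible to be flagged again later, instead of leaving it permanently marked as stale.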

Key Decisions & Agreements

Consensus Emerging

  • Audit trail and provenance should not completely duplicate each other.

  • Audit trail should track technical changes within DSpace.

  • Provenance should focus on essential, long-term descriptive history.

Expected Actions

  • Create explicit written guidance on what belongs where.

  • Align overlapping PRs and possibly rework some to use the audit trail.

  • Configure systems so sites can choose provenance, audit trail, or both.

  • Possibly consult with digital curation experts before finalizing policy.

  • Introduce GitHub automation for aging PRs (experimental).