Title (Goal): Quality Control Reports
Primary Actor: Admin
Scope:
Level:
Story (A paragraph or two describing what happens):

An administrator or collection administrator can launch quality control reports that identify instances of a specific use case on a collection-by-collection basis.

Collection administrators can use these reports to perform quality control checks. Software developers can use these reports to identify interesting use cases for system testing.

Possible reports

  • Number of items without a bitstream
  • Number of items with multiple original bitstreams
  • Number of items without a license
  • Number of items containing document bitstreams (pdf, word)
  • Number of items containing image bitstreams (jpg, jp2)
  • Number of items containing audio bitstreams
  • Number of items containing video bitstreams
  • Number of items containing compound bitstreams (zip)
  • Number of items referencing streamable media
  • Number of items with non-standard bitstream formats
  • Number of items with very large bitstreams
  • Number of items with very small (possibly corrupt) bitstreams
  • Number of items without thumbnails
  • Number of items with multiple thumbnails
  • Number of items with custom (non-generated) thumbnails
  • Number of document items without a full text extract bitstream
  • Number of items with a specific metadata field
  • Number of items with multiple instances of a metadata field
  • Number of items missing a required metadata field
  • Number of items with an improperly formatted metadata field (date field)
  • Number of items visible to anonymous user
  • Number of items not visible to an anonymous user
  • Number of items with bitstreams accessible to an anonymous user
  • Number of items with bitstreams not accessible to an anonymous user
  • Number of items with a finite embargo end date
  • Number of items with an indefinite embargo end date
  • Number of items with an embargo end date in the past
  • Number of items with metadata containing a URL
  • Number of items with metadata containing non-ASCII characters
  • Number of items containing full text (per the provenance field)
  • Number of items modified in the last X days
  • Number of items containing provenance data or annotation data within a non-public bitstream
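As a rough illustration of how such reports could be structured, the sketch below treats each report as a named yes/no test on an item and counts the matches collection by collection. It is not DSpace code: the `Item` and `Bitstream` classes, the report names, the bundle labels, the `dc.title` and `dc.date.issued` field names, and the date rule are all assumptions made for this example, and the items are plain in-memory records rather than repository objects.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Bitstream:
    name: str
    bundle: str  # assumed bundle labels: "ORIGINAL", "THUMBNAIL", "LICENSE"
    size_bytes: int


@dataclass
class Item:
    collection: str
    metadata: Dict[str, List[str]] = field(default_factory=dict)
    bitstreams: List[Bitstream] = field(default_factory=list)


def in_bundle(item: Item, bundle: str) -> List[Bitstream]:
    """Return the item's bitstreams that belong to the given bundle."""
    return [b for b in item.bitstreams if b.bundle == bundle]


# Assumed rule for a well-formed date: YYYY, YYYY-MM, or YYYY-MM-DD.
DATE_RE = re.compile(r"\d{4}(-\d{2}(-\d{2})?)?")

# Each report is a predicate: does this item match the use case?
REPORTS: Dict[str, Callable[[Item], bool]] = {
    "no_bitstream": lambda i: not i.bitstreams,
    "multiple_originals": lambda i: len(in_bundle(i, "ORIGINAL")) > 1,
    "no_license": lambda i: not in_bundle(i, "LICENSE"),
    "no_thumbnail": lambda i: not in_bundle(i, "THUMBNAIL"),
    "missing_title": lambda i: not i.metadata.get("dc.title"),
    "bad_date": lambda i: any(
        DATE_RE.fullmatch(v) is None
        for v in i.metadata.get("dc.date.issued", [])
    ),
}


def run_report(name: str, items: List[Item]) -> Dict[str, int]:
    """Count, per collection, the items that match the named report."""
    predicate = REPORTS[name]
    counts: Dict[str, int] = {}
    for item in items:
        # Every collection appears in the result, even with a zero count.
        counts.setdefault(item.collection, 0)
        if predicate(item):
            counts[item.collection] += 1
    return counts
```

Keeping each report as a small predicate means that adding another entry from the list above (for example, very large bitstreams or an embargo date in the past) would only require one more dictionary entry, while the per-collection counting stays the same.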

 

3 Comments

  1. Hi Terrence W Brady, just a clarification question.

    Is this essentially the same use case as the one you described in "Admin UI - Collection Admin can construct a Quality Control report"?

    I'm just curious if we should either merge these two use cases into one, or minimally interlink them.

  2. Tim Donohue, these are similar but not the same.  

    • This use case lists counts of items, collection by collection.
    • The other use case is initiated with a metadata query and lists the items that satisfy that query.

    I have uploaded some code that proposes how this use case could be satisfied by the REST API.

    https://github.com/Georgetown-University-Libraries/DSpaceRestQCReports

    Let me know if you have any suggestions on how to start a dialog on this proposal.

    I hope to enhance this proposal to illustrate how to satisfy the query use case.

    Terry

    1. Thanks again for the additional info, Terry.

      I'll leave these as separate Use Cases and just link them up as related.

      As far as starting a dialog on your existing work, it definitely helps that you are linking that work to these use cases (so we know you already have code that can do this). The next step would likely be to see what others (DCAT especially, and possibly developers/committers too) think of these use cases and your work on them, as that can help us prioritize these use cases for possible inclusion in DSpace.

      In general, the tools you are making look like they would be of interest to others. But I'd like some DCAT folks to provide a stamp of approval if they see these as high-priority, highly useful tools.