DSpace Architectural Review
Notes from Wednesday, 25 Oct 2006 (JSE)
I. Review of Agenda
1. Workflow
2. ID management and ePeople
3. Authorization & Policy Implementation
4. Other?
II. Workflow
See diagram at []
1. Current Ingest "Workflow"
- Submission
- "workflow" (post-submission)
- install
2. Event Mechanism (Larry Stone, MS, RR)
- a general purpose notification system
- policy driven
- customizable
3. History System
- creates an audit trail
- follows an ABC ontology
- writes to triple-store
4. Preservation
5. Versioning
6. Issues
- what are the first-class items we're worried about for long-term preservation?
- establishing precedent for life-cycle management; not much experience in the field
- lifecycle management currently is done poorly in all systems
- little things (improving current workflow) vs big things (event mechanism)
- future robust content integrity service
7. Providing hooks and innovations that enable experimentation in this area
RECOMMENDATION: Move towards some sort of improved history system
- which includes a rigorously improved event system
8. Discussion of relationship between versioning, events and history system
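A minimal sketch of how the event mechanism and the history system discussed above might relate: a general-purpose notification bus with a history recorder subscribed to it, so that the audit trail is a by-product of events. All names here (Event, EventBus, HistoryRecorder, the event kinds and the predicate strings) are hypothetical, the list of triples is only an in-memory stand-in for a triple store, and the predicates are placeholders rather than terms from the ABC ontology.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Triple = Tuple[str, str, str]


@dataclass(frozen=True)
class Event:
    kind: str      # e.g. "submit", "install" (hypothetical event names)
    target: str    # identifier of the affected object
    agent: str     # identifier of the ePerson who acted


class EventBus:
    """General-purpose notification: handlers subscribe per event kind."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[[Event], None]]] = {}

    def subscribe(self, kind: str, handler: Callable[[Event], None]) -> None:
        self._handlers.setdefault(kind, []).append(handler)

    def publish(self, event: Event) -> None:
        # Only handlers registered for this kind are notified.
        for handler in self._handlers.get(event.kind, []):
            handler(event)


class HistoryRecorder:
    """Audit trail: turns each event into triples (in-memory stand-in)."""

    def __init__(self) -> None:
        self.triples: List[Triple] = []
        self._count = 0

    def record(self, event: Event) -> None:
        self._count += 1
        event_id = f"event:{self._count}"
        # Predicate names are placeholders, not terms from the ABC ontology.
        self.triples.append((event_id, "kind", event.kind))
        self.triples.append((event_id, "target", event.target))
        self.triples.append((event_id, "agent", event.agent))


bus = EventBus()
history = HistoryRecorder()
bus.subscribe("install", history.record)
bus.publish(Event("install", "item:1", "eperson:7"))
```

In this arrangement the history system is just one subscriber; a versioning or notification service could subscribe to the same events without the publisher knowing about it.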
9. Workflow: There have been proposals to RIP OUT DSpace workflow "engine"
- and replace with third-party system
- (rob) brief history of dspace workflow system
- (rj) definitely need more flexible workflow capability
- (sp) degree to which Manakin helps
- (rob) BUT aspects are baked into data model
10. Workflow: (rob) Opportunity to use a flexible workflow system for implementing e.g. preservation workflows, etc.
- (ms) Also: a generic workflow system would help untangle administration system
RECOMMENDATION (ms) Keep "lightweight" system whilst enabling access by other systems via LNI??
- Or: re-implement "Workflow" module in DS based on third-party, open-source workflow engine & language
- Q: Are there any plugin mechanisms that work esp well with any workflow systems, etc?
- (md) See Open Symphony
- Do as much as we can to improve workflow with Manakin
- Investigate and recommend a third-party open-source workflow engine
11. Identity Management (hj)
- (hj) trouble with changing eperson records
- (jse) what is req'd? what is "identity" used for?
- permission control (persons and groups)
- "role" management
- permissions and responsibility
- auditing (events/history)
- eperson record is the source of the data
- who did what (name, email)
- authority control
- persistent query
- notification services
- creator metadata
- every item has the submitting ePerson
- (jse) how is "role" specified?
- policy table
- eperson|group, action, object
- problems occur with administration
- relationship between DSpace ePerson and e.g. LDAP
- protected data in record
- (rob) Three basic ways that identity manifests
- There is the "stuff" to do with roles and permissions
- getting authoritative assertions from third-party services
- Records in the metadata
- Notifications
- e.g. email address
- (rj) could abstract how notifications are done
RECOMMENDATION: It would be useful to have persistent IDs for ePeople
- that are valid URIs
- format that the URIs could take
- aggregating metadata associated with ePerson
- should they be actionable
- they could be handles
- they could be managed by some other system
- Reminder: "Out-of-the-box" is in the manifesto
- Application-specifiable
- format
- some way of minting them
- Ways of importing epeople?
- (ms) making people equivalent to items
- (ms) What about the Info URI system that Rob proposed years ago
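A minimal sketch of the persistent-ID recommendation above, assuming IDs are minted once per ePerson record, are valid URIs, and follow an application-specifiable format. The class name, the template placeholder and the example.org base URL are all hypothetical; handles or an externally managed scheme could be substituted by changing the template or the minting step.

```python
import uuid
from urllib.parse import urlparse


class EPersonIdMinter:
    """Mints one stable URI per ePerson record key (hypothetical design)."""

    def __init__(
        self,
        template: str = "https://repository.example.org/eperson/{local_id}",
    ) -> None:
        # The format is application-specifiable via the template.
        if "{local_id}" not in template:
            raise ValueError("template must contain {local_id}")
        self._template = template
        self._minted = {}  # internal record key -> minted URI

    def mint(self, record_key: str) -> str:
        # Minting is idempotent: the same record always gets the same URI,
        # so the ID stays valid if name or email in the record change.
        if record_key not in self._minted:
            local_id = str(uuid.uuid4())
            self._minted[record_key] = self._template.format(local_id=local_id)
        return self._minted[record_key]


minter = EPersonIdMinter()
uri = minter.mint("eperson-record-42")
parsed = urlparse(uri)
```

Whether such URIs should be actionable (resolve to aggregated ePerson metadata) is left open here, as it was in the discussion.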
12. Authorization
- (ms) today we have a home-grown but okay for "version 1" solution
- do we re-factor for "glamorous"?
- do we fix specific problems?
- (rob) what do permissions really mean?
- e.g. what are the semantics of a particular permission
- bigger problem is managing permissions, ui, etc
- there are certain inconsistencies in management
- set of behaviors that are undocumented; e.g. changing permissions on collection, impact on other
- whole load of unconfigurable, invisible baked-in logic
- roles and permissions are conflated, which makes making a UI hard
- (rj) can roles be aggregations of permissions, to which people are assigned?
- set of actions that are distinguished
- roles and actions are currently mixed up, need to be clarified
- these defined roles, these defined permissions
- (rj) do we need a way to define roles?
- are e.g. WfS1, WfS2 states or actions or????
RECOMMENDATION: Current conflation isn't working
- do we incrementally change vs refactor and adopt
- Clean up/clarify specification of model
- re-implement (or tweak) based on cleaned model
- Rob: strawman model
- role, permissions, objects, actions
- eperson, group
RECOMMENDATION: For workflows, rely on the AuthZ engines of an adopted Workflow Engine
- Conversely, make Workflow AuthZ a criterion/requirement of Workflow Engine selection
- (rj) presumably such an AuthZ is specific
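One possible reading of the strawman model above (role, permissions, objects, actions; eperson, group), sketched with roles as aggregations of permissions to which ePeople or groups are assigned. This is an illustration under assumptions, not the DSpace implementation: the class names, method names and example identifiers are hypothetical, and a permission is taken to be an (action, object) pair.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass(frozen=True)
class Permission:
    action: str      # e.g. "READ", "WRITE" (hypothetical action names)
    object_id: str   # e.g. "collection:12"


@dataclass
class Role:
    name: str
    permissions: Set[Permission] = field(default_factory=set)


class AuthZ:
    """Keeps roles (what may be done) separate from assignments (who holds them)."""

    def __init__(self) -> None:
        self._roles: Dict[str, Role] = {}
        self._assignments: Dict[str, Set[str]] = {}  # principal -> role names
        self._groups: Dict[str, Set[str]] = {}       # eperson -> group names

    def define_role(self, role: Role) -> None:
        self._roles[role.name] = role

    def add_to_group(self, eperson: str, group: str) -> None:
        self._groups.setdefault(eperson, set()).add(group)

    def assign(self, principal: str, role_name: str) -> None:
        # A principal is either an eperson or a group.
        if role_name not in self._roles:
            raise KeyError(f"undefined role: {role_name}")
        self._assignments.setdefault(principal, set()).add(role_name)

    def is_allowed(self, eperson: str, action: str, object_id: str) -> bool:
        wanted = Permission(action, object_id)
        principals = {eperson} | self._groups.get(eperson, set())
        for principal in principals:
            for role_name in self._assignments.get(principal, set()):
                if wanted in self._roles[role_name].permissions:
                    return True
        return False


authz = AuthZ()
authz.define_role(Role("collection-editor", {Permission("WRITE", "collection:12")}))
authz.add_to_group("eperson:7", "group:editors")
authz.assign("group:editors", "collection-editor")
```

Because roles are defined independently of who holds them, a management UI could list "these defined roles, these defined permissions" without inspecting individual policy rows.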
13. Topics for Thursday and Friday
- in the perfect world, setting up a Community is a workflow step
- also, extending
- Abstract data model
- communities/collections
- bitstream relationships
- concrete data model & storage
- history, provenance, audit
- admin, curatorial -> workflow
- Friday: