...

  • Introduction, Agenda
  • Planning and Scheduling 1.5.1
  • Communications and bug tracking - JIRA?
  • Fedora collaboration possibilities: To what extent should we attempt to align data models? Where do we envision possible points of contact?
  • OAI-ORE: In or out of scope? Influence on DSpace 2.0 data model?
  • Completed work and revisions to 2.0 goals to date.
  • Status of UI support: Will 2.0 work be exposed in both interfaces?
  • Workflow engine: Do we integrate, and if so, which candidates?
  • Review of http://wiki.dspace.org/index.php/Frequently+Asked+For
    (lunch break)

...

MediaWiki needs to stay - we can't migrate again. On that note: http://confluence.atlassian.com/display/CONFEXT/Universal+Wiki+Converter

SF deficiencies need addressing - JIRA may be an option.

We need to do the work ourselves to integrate the services seamlessly.

...

  • in the abstract data model.
  • Mark hopes the data model work will include an abstract component that defines how the DSpace data model exists and that can be used as a guide when new functionality is added.
  • Tool for mapping into standards and other systems.
  • Jim D: This means we will push a data API for storage; the topic was moved up the agenda.
  • Rob suggests DSpace developers join the Fedora mailing list to see how the community works.

    OAI-ORE

    What is it? - http://www.openarchives.org/ore/

We'll cover this as we go as appropriate.

...

  • Wiki page for existing model: http://www.dspace.org/index.php?option=com_content&task=view&id=149
  • PDF for recommended model: http://wiki.dspace.org/static_files/0/0e/DSpace-recs.pdf
  • Wiki page for recommended model: http://wiki.dspace.org/index.php/ArchReviewDataModel
  • We agree the data model should not require hierarchical identifiers, but we should call for hierarchical identifiers within an item. Having opaque identifiers for everything as the only source of relationships might not be the best strategy from the preservation viewpoint. If the data model can be represented hierarchically, then the identifiers should mirror this relationship and be exposed.
  • The introduction of ORE does not change the need/desire for a FRBR type data model.
  • There are several manifestations of the FRBR model to be reviewed. A simplistic model is illustrated in the 2.0 recommendations report, however this may not be the most appropriate manifestation to be implemented.
  • SWAP is actually an intersection between FRBR and the DCMI Abstract Model. So what we have is a representation of the "Entities" of FRBR, Work --> Expression --> Manifestation --> Item, expressed instead as ScholarlyWork --> Expression --> Manifestation --> Copy. I've started to have serious concerns with this. Explicitly, both "Copy" and "Item" are abstract concepts and not necessarily representative of the "file parts" of our composite or complex digital object. I am concerned that there is a repeated effort to "shoe-horn" ontologies like FRBR and SWAP into fixed, overly simplified and strictly linear hierarchies like DSpace's Collection/Item/Bundle/Bitstream, and that every time this happens it's a lossy mapping that limits the user's ability to accurately represent the structural nature of their composite digital content. In FRBR, Works and Manifestations (and possibly Expressions) are containers which may contain themselves, so the following is possible based on the model: Work --> Work --> Expression --> Manifestation --> Manifestation --> Item. Likewise, there are many interrelationship properties (predicates) between siblings and between parent and child Entities in the model that go far beyond just containership. Introducing a rigid structure reminiscent of FRBR or SWAP here can therefore be quite dangerous and confusing. -Mark Diggory (2008-06-30)
  • I find myself returning to an original viewpoint (originally expressed by Richard Rodgers a couple of years ago) that these relationships need to be expressed as metadata (statements) attached to the DSpace data model (not dissimilar to RELS-EXT in Fedora). I take this viewpoint further and suggest that at its heart the DSpace data model needs to be rooted in a generic node-property graph that will initially have a more concrete or explicit SWAP/FRBR application profile enforced on it. -Mark Diggory (2008-06-30)
  • Mark comments on the work in regard to metadata??? (Mark needs to comment further.)
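
To make the "generic node-property graph with an application profile" suggestion concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the class names, the predicate names and the PROFILE table are hypothetical and are not an agreed DSpace or SWAP design. It only shows that nested containers (Work --> Work) and other relationships can be held as statements, with the profile checked separately from storage.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A generic node: an opaque identifier plus free-form properties."""
    node_id: str
    properties: dict = field(default_factory=dict)


class Graph:
    """Nodes plus (subject, predicate, object) statements between them."""

    def __init__(self):
        self.nodes = {}
        self.statements = set()

    def add_node(self, node_id, **properties):
        self.nodes[node_id] = Node(node_id, dict(properties))
        return self.nodes[node_id]

    def relate(self, subject, predicate, obj):
        # Both ends must already exist, so statements never dangle.
        if subject not in self.nodes or obj not in self.nodes:
            raise KeyError("unknown node in statement")
        self.statements.add((subject, predicate, obj))

    def objects(self, subject, predicate):
        return sorted(o for s, p, o in self.statements
                      if s == subject and p == predicate)


# Hypothetical application profile: which entity types a predicate may link.
PROFILE = {
    "hasPart": {("Work", "Work"), ("Manifestation", "Manifestation")},
    "isExpressedAs": {("Work", "Expression")},
    "isManifestedAs": {("Expression", "Manifestation")},
    "isAvailableAs": {("Manifestation", "Item")},
}


def violations(graph, profile=PROFILE):
    """Return statements whose endpoint types the profile does not allow."""
    bad = []
    for s, p, o in sorted(graph.statements):
        pair = (graph.nodes[s].properties.get("type"),
                graph.nodes[o].properties.get("type"))
        if pair not in profile.get(p, set()):
            bad.append((s, p, o))
    return bad


# A Work containing a Work, which a strictly linear hierarchy cannot express.
g = Graph()
for node_id, kind in [("w1", "Work"), ("w2", "Work"), ("e1", "Expression"),
                      ("m1", "Manifestation"), ("i1", "Item")]:
    g.add_node(node_id, type=kind)
g.relate("w1", "hasPart", "w2")
g.relate("w2", "isExpressedAs", "e1")
g.relate("e1", "isManifestedAs", "m1")
g.relate("m1", "isAvailableAs", "i1")
```

Because the profile is just data checked against the graph, it could be loosened or replaced later without changing how nodes and statements are stored.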

...

  • This is actually what SWAP (Scholarly Works Application Profile) is for. -Mark Diggory (2008-06-30)
  • Proposed data model as detailed on the wiki. Need to be able to attach metadata at multiple levels, and describe the relationships between the items/bitstreams etc.
  • Aaron suggests possibly looking at JCR for versioning the data model instead of the approach described in the wiki. If versioning is captured within the data model, it will force incompatibility with other JCR-compatible solutions. To support Fedora and other JCR implementations, expose the versioning but let another application do the work.
  • Rob comments that one of DSpace's strengths is that the data model is easily understood, and this is an important concept to carry forward.
  • Aaron comments to try and keep the data model as simple as possible.
  • Brad asks, for each function/service: is there an existing tool/service we can use, or do we need to build it into our existing code repository? We want to be able to extend the data model rather than build all the functionality within the trunk when possible. However, we have to acknowledge that the relationships exist with the data model.
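
One possible reading of "expose the versioning but let another application do the work" is sketched below in Python. The names VersionStore, InMemoryVersionStore and Item are hypothetical, and the in-memory store merely stands in for whatever external system (a JCR implementation, for example) would really hold the versions. This is a sketch under those assumptions, not the approach described on the wiki.

```python
from typing import Protocol


class VersionStore(Protocol):
    """What the data model expects from an external versioning backend."""

    def save(self, item_id: str, content: str) -> int: ...

    def history(self, item_id: str) -> list: ...


class InMemoryVersionStore:
    """Stand-in backend; a real one would delegate to another application."""

    def __init__(self):
        self._versions = {}

    def save(self, item_id, content):
        versions = self._versions.setdefault(item_id, [])
        versions.append(content)
        return len(versions)  # 1-based version number

    def history(self, item_id):
        return list(self._versions.get(item_id, []))


class Item:
    """Data model object: it exposes versions but stores none itself."""

    def __init__(self, item_id, store):
        self.item_id = item_id
        self._store = store

    def update(self, content):
        return self._store.save(self.item_id, content)

    def versions(self):
        return self._store.history(self.item_id)


item = Item("item-1", InMemoryVersionStore())
item.update("first draft")
item.update("second draft")
```

The data model stays simple, since Item only knows the two-method interface, and the backend can be swapped without touching it.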

    Review of FAF wiki page - http://wiki.dspace.org/index.php/Frequently+Asked+For

  • Statistics
  • Versioning
  • Distributed Community / Collection Management
  • Embargo
  • Streaming Media
  • Electronic Theses & Dissertations (ETDs)
  • Support for hierarchical LDAP servers
  • Better Windows O/S support
  • Branding (name of service)
  • Hit highlighting in search results
  • The point of reviewing the list is to take any of these requests into account as we make a plan for the "funded" work in regard to the datamodel.
  • Comment from Aaron: Putting these features in before refactoring can make the overall work harder.
  • Brad: There is going to be work that is not the "funded" work going on at the same time, so we need to agree on how to deal with potential conflicts as a result. Make sure the team is in close contact with the people associated with the various features/projects listed.
  • Mark comments that some of the items on this list may not be applicable for the team to consider, given the nature of the request and possible resolution to the issue.

    Discussion before Michele left

    Q: Jim D - How can we move towards getting proper funding for core development projects?

...

  • Brad outlined the challenges: Right now workflow is integrated and hardcoded into the core code, but if we want to work with others (e.g. Fedora) then we would want to take it out as a service. How do we want to position ourselves in this area? Do we want to keep going in our current way, pull it all out into some workflow engine, or migrate slowly?
  • Rob wanted to clarify the definition of a workflow. For example, is it a normal deposit workflow, or do we include things like the workflow for creating a collection, or some of the offline jobs? What do we mean by "a workflow"?
  • Brad considered it as how to string together the steps required for an ingest step (the activities that need to occur, with rules and constraints), rather than going down to the level of moving from page to page in a particular activity. So somewhere in the range of what Rob described.
  • Rob defined this as an interaction requiring more than one user, not just small atomic changes.
  • Mark explained some past experiences with Struts and workflows. Workflows are always very independent from user interaction with a system, so Struts would never be a good workflow engine. You should use a workflow engine instead, and interact with it using something like Struts.
  • There is work in this area in Kuali which is probably worth following up on. (http://www.kuali.org/) (http://confluence.arizona.edu/confluence/display/KUALI/What+is+Kuali+Enterprise+Workflow)
  • There is some work in this area with Sakai and assignments. They might be a good source of advice? (http://bugs.sakaiproject.org/confluence/display/FRAME/Workflow+Considerations)
  • Brad: Is this an area which is urgent? Does it need to feed into core 2.0 work, or can it be dealt with separately?
  • Aaron: Maybe we should address this when we talk about services?
  • Rob: The API needs to be for the data model. Let's learn from the situation we have at present - Rob can talk about this if required.
  • Mark: Does the atMire GSOC student have anything to feed in here?
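
As a rough illustration of the workflow definition discussed above (ordered steps with rules, involving more than one user, kept independent of any page-to-page UI flow), here is a small Python sketch. The Step, Submission and WorkflowEngine names, and the three example steps and roles, are hypothetical. It is not based on Kuali, Sakai or the existing DSpace workflow code.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Step:
    name: str
    role: str  # the role allowed to complete this step


@dataclass
class Submission:
    item_id: str
    completed: list = field(default_factory=list)


class WorkflowEngine:
    """Holds the ordered steps; knows nothing about pages or forms."""

    def __init__(self, steps):
        self.steps = list(steps)

    def next_step(self, submission):
        done = len(submission.completed)
        return self.steps[done] if done < len(self.steps) else None

    def complete(self, submission, user_role):
        """Complete the current step and return the next one (or None)."""
        step = self.next_step(submission)
        if step is None:
            raise ValueError("workflow already finished")
        if user_role != step.role:
            raise PermissionError(f"{step.name} requires role {step.role}")
        submission.completed.append(step.name)
        return self.next_step(submission)


# Hypothetical deposit workflow that needs three different users.
engine = WorkflowEngine([
    Step("submit", "submitter"),
    Step("review", "reviewer"),
    Step("edit-metadata", "editor"),
])
```

A web layer (Struts or anything else) would only call next_step and complete. The ordering and role rules live in the engine, which is the separation Mark describes.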

    Services

    Two activities: 1) List the services DSpace provides, 2) The mechanics, potential toolkits etc.

    DSpace services:

...

  • Read
  • Write
  • Asset store
  • Transformation
  • Data extraction
  • Monitoring / integrity
  • Logging / statistics
  • Events
  • Users
  • Describe (introspection)
  • Structure
  • Versioning
  • Metadata store
  • Metadata schema
  • Service locator
    Snapshot of whiteboard used during the discussion:


[Image: whiteboard snapshot]

Service frameworks

...

  • service management,
  • service location,
  • service configuration
  • (and for it to not show up in the code - annotations are okay)
  • we don't want to be forced to code in a certain way.
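
The requirements above (service management, location and configuration, with at most an annotation visible in service code) can be sketched in Python as follows. ServiceManager, its provides decorator, the "asset-store" name and the /tmp/assets path are all hypothetical. The decorator stands in for a Java-style annotation, and no particular framework from the shortlist is implied.

```python
class ServiceManager:
    """Registers, configures and locates services by name."""

    def __init__(self, config=None):
        self._factories = {}
        self._instances = {}
        self._config = config or {}

    def provides(self, name):
        """Decorator that registers a class as the provider of `name`."""
        def register(cls):
            self._factories[name] = cls
            return cls  # the class itself is left unchanged
        return register

    def locate(self, name):
        # Create the service lazily, once, using its configuration section.
        if name not in self._instances:
            factory = self._factories[name]  # KeyError if nothing registered
            self._instances[name] = factory(**self._config.get(name, {}))
        return self._instances[name]


# Hypothetical configuration, kept outside the service code.
services = ServiceManager(config={"asset-store": {"root": "/tmp/assets"}})


@services.provides("asset-store")
class FileAssetStore:
    """Plain class: apart from the decorator it knows nothing of the manager."""

    def __init__(self, root):
        self.root = root

    def path_for(self, bitstream_id):
        return f"{self.root}/{bitstream_id}"


store = services.locate("asset-store")
```

FileAssetStore can still be constructed and tested directly, which matches the wish not to be forced to code in a certain way.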
    Shortlist

...

Extra Notes for people as out of the loop as Aaron:

Maven 2 site: http://projects.dspace.org/dspace/

Maven SNAPSHOT and Release repositories: http://maven.dspace.org/

Source repository (SVN @ sourceforge): https://dspace.svn.sourceforge.net/svnroot/dspace/trunk/dspace

Issue tracker (sourceforge): http://sourceforge.net/tracker/?group_id=19984

Continuous Integration (Bamboo @ Johns Hopkins):

Note: The Maven 2 site should be updated to have this information on it for 2.0 (some is here)

...

  • Cambridge: Existing LNI pulls items from DSpace; no ingest.
    • Success at services layer, plus existing protocol implementations should address this.
  • SWORD2?
  • (provide) Backend data management and preservation for publishing systems such as Open Journal System (OJS)
  • Improved support of SOA approach through further development of LNI (Lightweight Network Interface), standard protocols to move data into and out of the DSpace repository. RESTful interface, entity system?
  • Ingest, import, export provided via assortment of protocols (see existing packaging frameworks, crosswalks).
  • Ability to customize the interface and workflow engine to better suit your institution's needs

...