...

Attendees: Kevin Van De Velde, Graham Triggs, Peter Dietz, Andrea Schweer, Kim Shepherd, Hardy Pottinger, Mark Wood, Tim Donohue

Primary Discussion topics:

  • Interaction/Code Share via GitHub.
    • Can we come up with some best practices for forking on GitHub, so that we can all start to share code more easily? I.e. allow institutions to pull in code from one another, etc.
    • We may need to spend some time developing some GitHub best practices.
    • We should also leverage Maven to create separate GitHub projects for features that simply extend the DSpace API, etc.
  • Business Logic API discussion led into talking about REST API as a potential area for this "common business logic"
  • REST API - Who is using it? What should it do, and how should it interact with SWORD / Solr / etc.
    • No one in attendance is using REST API in Production (though a few have used it for development)
    • We should try to "show it off" more by writing some simple JavaScript "embed" code to embed DSpace content in a website. We can post that code up to http://demo.dspace.org and enable the REST API there. This could also allow for more testing.
    • REST API should never create its own 'conventions' for common tasks. Rather it should use existing standards (e.g. SWORD for deposit, etc)
    • Question: Should REST API "wrap" things like SWORD & Solr (which can be accessed RESTfully themselves)? Could "RESTful" DSpace just be a combination of SWORD + Solr + Admin REST API + other RESTful interfaces?
      • Several in disagreement here. Pros/Cons to various approaches
      • Fedora actually has a separate REST API from Browse/Search. You Browse/Search via RSearch, and then lookup an item by ID via REST API. This approach allows you to simplify the REST API to very specific tasks, and use other existing services/interfaces where they are more 'standard' or widely accepted.
      • However, AuthZ may be simplified if you 'wrap' everything (even with a thin wrapper).
      • Also if you 'wrap' the Search/Browse in REST API, then at a later time you could swap out Solr (or whatever) for something like Elastic Search, and your clients would not need to change.
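The "thin wrapper" argument above can be illustrated with a minimal sketch. Everything here is hypothetical: `SearchBackend`, `SearchApi`, and the two fake backends are illustrative names, not part of DSpace or of any existing REST API, and no real Solr or Elastic Search calls are made. The point is only that clients depend on the wrapper, so the backend can be swapped without changing them.

```python
from dataclasses import dataclass
from typing import Dict, List, Protocol


@dataclass(frozen=True)
class SearchHit:
    handle: str  # hypothetical item identifier
    title: str


class SearchBackend(Protocol):
    """Anything that can answer a query; the wrapper depends only on this."""

    def search(self, query: str) -> List[SearchHit]: ...


class FakeSolrBackend:
    """Stand-in for a Solr-backed implementation (assumption: no network calls)."""

    def __init__(self, docs: Dict[str, str]) -> None:
        self._docs = docs

    def search(self, query: str) -> List[SearchHit]:
        # Substring match on the title.
        q = query.lower()
        return [SearchHit(h, t) for h, t in self._docs.items() if q in t.lower()]


class FakeElasticBackend:
    """Stand-in for a different engine with a different matching strategy."""

    def __init__(self, docs: Dict[str, str]) -> None:
        self._docs = docs

    def search(self, query: str) -> List[SearchHit]:
        # Whole-word match on the title.
        q = query.lower()
        return [SearchHit(h, t) for h, t in self._docs.items() if q in t.lower().split()]


class SearchApi:
    """Thin wrapper: clients call this and never see the backend directly."""

    def __init__(self, backend: SearchBackend) -> None:
        self._backend = backend

    def find(self, query: str) -> List[Dict[str, str]]:
        # A single place where AuthZ checks could also be applied.
        return [{"handle": hit.handle, "title": hit.title} for hit in self._backend.search(query)]


docs = {"123/1": "Open Access Policy", "123/2": "Annual Report"}
via_solr = SearchApi(FakeSolrBackend(docs)).find("policy")
via_elastic = SearchApi(FakeElasticBackend(docs)).find("policy")
```

The client-facing result has the same shape in both cases, which is the benefit claimed for wrapping; the cost is that the wrapper can only expose features that every backend supports.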

...

Attendees: Kevin Van De Velde, Mark Wood, Andrea Schweer, Richard Rodgers, Scott Phillips, Mark Diggory, Hardy Pottinger, Peter Dietz, Tim Donohue

Primary Discussion topics:

  • Revisiting REST API Discussion
    • Seems to be an assumption that we should only have one REST API
    • No reason why we cannot have multiple APIs, or even different implementations (and let the best one win out in the end)
    • Revisiting whether one REST API should 'wrap' all calls (several question this). Tim noticed Twitter (and many other sites) have many many REST APIs
      • https://dev.twitter.com/docs - Twitter specifically has a basic REST API, Search API, Streaming API, etc. They even do OAuth / AuthZ via REST API.
      • Comments from Bojan Suzic (via dspace-devel):

        I think this is a good idea. In some cases Twitter has multiple APIs for the reason of different underlying architecture and technical implementations. The other reason could be an iterative evolution of their services and infrastructure. In the case of DSpace - maybe this point could be considered from the functional point of view. Generally, all the REST API versions would depend on the same DSpace-API, so the rationale for separation could be based on some other assumption. For instance, user-centric (browse), admin-centric (update) or similar, and/or based on the development effort or resources necessary to carry out the change. The example for the latter may be an outstanding decision about future development which may hold development/release of the component or part which is already clear and non-disputable.

  • Business Logic API
    • Can we "mature" design/modeling of the Business Logic API by thinking of it more as a REST API?
    • What should make up a Business Logic API:
      • Item Submission processes (special ingest workflow)
      • Reviewer Workflow processes
      • Creation processes for Collections / Communities (also includes initializing roles, template items, etc. – almost like a "workflow" process to create these objects)
      • Creation processes for Groups / EPeople?
      • Running Curation Tasks
        • Richard Rodgers notes he's already building a demo REST API for Curation Framework using Jersey (http://jersey.java.net/). This will be posted to GitHub in the near future.
      • Smaller Activities: User feedback, Statistics, metadata registry, bitstream format registry
      • Authority Control? (managing internal or external controlled vocabularies & taxonomies)
    • Questions / Concerns on Business Logic API
      • Need to avoid it being too "large" / all-encompassing.
      • Where do we draw the line between underlying "DSpace Business Logic" and UI? E.g. Is Search/Browse part of Business Logic API? Or is it a separate API?
        • Mentioned that DSpace Discovery Module has been made more generic (no longer Solr specific) – may be the basis for a separate "Search API"?
      • Can we simplify? Is the "Business Logic API" just the REST API (GET/POST/PUT/DELETE objects)? Or is it still more than that?
  • REST API usage of Sakai bus
    • Are we satisfied with the Sakai-based REST implementation? Sounds like several have concerns about complexity & ongoing maintenance
    • May need to work towards a new implementation – based on Spring WebMVC or something else (Jersey? Apache CXF?)
    • Should be able to still reuse much of existing REST API work – especially the modeling of DSpace Objects as Resources bound to URLs
  • Mobile Device Support - DSpace is lagging behind. How do we plan to move in this direction?
    • Several seem interested in this, but no one known to be working directly on it.
    • Two levels of mobile support: (1) Making current UIs more 'mobile friendly' , (2) making DSpace more RESTful to support building native apps/clients
    • Suggestion from Bojan Suzic (via dspace-devel):

      One approach in this direction could be based on ClientUI idea (from GSoC 2011). Atomization and usage of lightweight components/architecture could lead to easier and less resource intensive development, maintenance and customization of UI. Also usage of cross-platform tools such as jQueryMobile, phoneGap or similar (+ REST API) could provide better coverage and require less effort.
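
Two points raised in the notes above, "modeling of DSpace Objects as Resources bound to URLs" and a REST API that is essentially GET/POST/PUT/DELETE on objects, can be sketched as follows. This is a minimal illustration under stated assumptions: the `/rest` prefix, the path segments, and the helper names are hypothetical and are not taken from the existing Sakai-based implementation or any other DSpace REST API.

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical URL segments for each DSpace object type.
RESOURCE_PATHS = {
    "community": "communities",
    "collection": "collections",
    "item": "items",
    "bitstream": "bitstreams",
}

# The four HTTP verbs mentioned in the notes, mapped to generic actions.
METHOD_TO_ACTION = {"GET": "read", "POST": "create", "PUT": "update", "DELETE": "delete"}


@dataclass(frozen=True)
class ResourceRef:
    kind: str       # one of the keys in RESOURCE_PATHS
    object_id: int  # hypothetical numeric identifier


def resource_url(ref: ResourceRef, base: str = "/rest") -> str:
    """Bind a DSpace object reference to a URL path."""
    return f"{base}/{RESOURCE_PATHS[ref.kind]}/{ref.object_id}"


def parse_request(method: str, path: str, base: str = "/rest") -> Tuple[str, ResourceRef]:
    """Turn a request such as 'GET /rest/items/42' into (action, ResourceRef)."""
    verb = method.upper()
    if verb not in METHOD_TO_ACTION:
        raise ValueError(f"unsupported method: {method}")
    if not path.startswith(base + "/"):
        raise ValueError(f"path outside {base}: {path}")
    segment, _, raw_id = path[len(base) + 1:].partition("/")
    kinds = {seg: kind for kind, seg in RESOURCE_PATHS.items()}
    if segment not in kinds or not raw_id.isdigit():
        raise ValueError(f"unrecognised resource path: {path}")
    return METHOD_TO_ACTION[verb], ResourceRef(kinds[segment], int(raw_id))
```

For simplicity the sketch requires an identifier on every path; a real design would usually send POST to the parent URL (for example the collection that will own a new item). Such a mapping is independent of the web framework, which is consistent with the suggestion that the resource modeling could be reused if the implementation moved to Spring WebMVC, Jersey, or Apache CXF.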

We took notes & shared additional links via IRC. http://irclogs.duraspace.org/index.php?date=2012-02-28

Day #2 Notes (Feb 29)

Attendees: Mark Wood, Peter Dietz, Tim Donohue, Andrea Schweer, Graham Triggs, Hardy Pottinger, Richard Rodgers, Brad McLean, Robin Taylor, Mark Diggory, Stuart Lewis

Primary Discussion topics:

  • Questions around DSpace w/ Fedora Inside updates.
    • Essentially DuraSpace still highly interested in this initiative. However, no active development going on. There have been ongoing discussions/brainstorms.
  • DuraCloud & ReplicationTaskSuite questions
    • Talking through how Replication Suite works with DuraCloud --> DSpace generates AIPs (in either METS or BagIt packages), those get replicated to DuraCloud. In your DuraCloud acct you could choose which underlying storage providers (e.g. Amazon) to use. Can use one, or even replicate across multiple providers (or move between providers at a later time).
    • Main use case is backup & restore (and helping DSpace admins to automate it as much as we can)
      • Currently an automated backup can happen via a ReplicateConsumer which can queue AIPs for upload whenever something changes in the system. Still a work in progress though. Working on stabilizing it.
    • Brainstorming some future use cases – one that came up is storing high-res archival quality images/videos in Cloud (DuraCloud), and using DSpace for access copy.
      • Replication Task Suite can support this as it lets you decide which Item Bundles you want included in AIPs. So, you could only include the "high-res" Bundle in AIPs, and exclude any Bundle which had access copies.
  • Metadata For All Objects (briefly touched on)
    • Is the "biggest bang for buck" just Bitstream metadata (esp. preservation/technical metadata)?
    • No, also need for enhanced Collection & Community metadata. Some use cases include:
      • Multi-lingual Communities & Collections
      • Subject Headings on Communities and Collections (as a way to more easily search them or organize them)
    • Bitstream metadata also very important. Especially looking towards JHOVE or DROID. Essentially, storing preservation/technical metadata about files is important.
    • Richard's Modernized DSpace work allows for Bitstream metadata (at least skeleton code is there).
    • Question: How should Bitstream metadata be exposed via Search/Browse?
      • Likely configurable? But configuration can be difficult if Bitstream metadata is very heterogeneous (different schemas, etc.)
  • Discovery / Solr
    • Discovery is working towards making facets pluggable (develop new facets), also working towards making indexing pluggable.
      • The latter (pluggable indexing) may be useful in allowing for different types of objects/content to be indexed in different ways (This goes back to question about how to index/expose Bitstream metadata)
    • Big Question: Should we think about making Discovery the only Search/Browse? It would allow us to potentially simplify indexing issues as we get into Metadata on All Objects – since we can work towards one solution (Discovery/Solr) rather than multiple at once.
      • What would this mean? Essentially we'd replace the "/search/" directory (old Lucene Indexes) and the DB Browse tables with Solr.
      • Some Concerns / Questions posed:
        • Does it need to be Solr? Can't we also achieve this abstraction using just Lucene (which now has browse-based libraries), or even a Solr alternative?
        • Also concerns that Discovery in DSpace 1.7 was a bit buggy (possibly even a small "step back"). However, DSpace 1.8 looks to be better. Still, Discovery may need more testing to ensure it's stable & ready.
        • What about JSPUI? Discovery doesn't work yet on JSPUI. Two options: either port it to work with JSPUI (there was some interest in the past), or else we'd have to think about deprecating JSPUI altogether.
      • Discovery in DSpace 1.8 offers more abstraction. It's now more "pluggable" and you could hypothetically use something besides Solr. Though Discovery obviously assumes that whatever it is using has similar features to Solr (faceting, etc)
      • Seems to be some interest in consolidating browse/search around Discovery. But, also several outstanding concerns (see above) that need to be answered.
      • One reason to like Discovery (from Stuart Lewis): http://blog.stuartlewis.com/2011/08/26/the-collection-is-dead-long-live-the-collection/
      • @Mire has some extra documentation about DSpace 1.8 Discovery changes it will be sharing. It's imperative that we work to ensure Discovery directions are brought towards central Committer control. That way we can get the proper 'buy-in' to make sure we can all support it, etc.
  • SkylightUI - https://github.com/skylightui/skylight
    • Different sort of UI for DSpace (built out of Auckland). Uses DSpace's Solr indexes
    • Uses Solr directly, rather than DSpace REST API, as Solr provides native replication support.
    • Essentially just goes against Solr's own REST interfaces. Uses Solr as a "read" API.
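
As a rough illustration of using Solr as a "read" API, the sketch below only builds a query URL for Solr's standard `/select` handler, using the common `q`, `rows`, `wt`, `facet`, and `facet.field` parameters. The host, the `search` core name, and the field names (`title`, `author`) are assumptions for the example; it is not taken from SkylightUI's code and makes no network request.

```python
from typing import Iterable
from urllib.parse import urlencode


def solr_select_url(base_url: str, query: str, rows: int = 10,
                    facet_fields: Iterable[str] = ()) -> str:
    """Build a read-only Solr /select URL that asks for a JSON response."""
    params = [("q", query), ("rows", str(rows)), ("wt", "json")]
    fields = list(facet_fields)
    if fields:
        # Faceting is switched on once, then each facet field is repeated.
        params.append(("facet", "true"))
        params.extend(("facet.field", name) for name in fields)
    return f"{base_url.rstrip('/')}/select?{urlencode(params)}"


# Hypothetical local Solr core and field names.
url = solr_select_url("http://localhost:8080/solr/search/", "title:repository",
                      rows=5, facet_fields=["author"])
```

A client that reads this way bypasses DSpace entirely, so any access restrictions would have to be enforced at the Solr level or by what gets indexed.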

We took notes & shared additional links via IRC. http://irclogs.duraspace.org/index.php?date=2012-02-29