This is a place to record thoughts on the interaction of resource versions and WAC authorization in the context of the Fedora API alignment sprint.
JIRA issue:
Design Questions
One way to fix this is to update the system so that links to other resources are URIs and not node references
The initial version of this change will not allow snapshot(tree) versioning. The first pass at spec compliance will only include versioning one resource at a time.
The discussion was to have the ldp:contains (containment) triples reference the canonical URL of the resources.
When versioning a tree of resources, what happens if one of the resources in the tree does not have the version interaction model attached to it?
Given that we will not be doing snapshot (tree) versioning at this point, this should not be an issue to consider at the present time.
For versioning: Versioning Delta/Specification Notes
For authorization: TBD
Before reading through this, it would be good to review the Fedora Specification Versioning Section as well as understand the Memento Terminology.
This design relates specifically to how versioning could be done in the Modeshape Implementation of Fedora 4
A PUT or POST request that creates a resource will make it versionable if the request includes a Link header with rel="type" whose target is http://fedora.info/definitions/fcrepo#VersionedResource
An LDPR will be created as an LDPRv with the versioning type.
An LDPCv will be created, from which a TimeMap can be generated.
An LDPRm will be generated, contained by the LDPCv.
Any subsequent responses from the LDPRv will include the appropriate Memento links in the headers: timegate, timemap
A PUT request to an existing LDPR will make that resource versionable if the request includes a Link header with rel="type" whose target is http://fedora.info/definitions/fcrepo#VersionedResource
The versioning type will be added to the LDPR, making it an LDPRv.
An LDPCv will be created, from which a TimeMap can be generated.
An LDPRm will be generated, contained by the LDPCv.
Any subsequent responses from the LDPRv will include the appropriate Memento links in the headers: timegate, timemap
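As a rough illustration of the two cases above, the request a client might send could be built like this. This is a minimal sketch, not the implementation: the helper name build_versioned_put, the localhost URL, and the text/turtle content type are assumptions for the example. Only the Link header value comes from the notes above. The request is built but not sent.

```python
from urllib.request import Request

# Type URI taken from the notes above.
VERSIONED_RESOURCE = "http://fedora.info/definitions/fcrepo#VersionedResource"


def build_versioned_put(url, body=b"", content_type="text/turtle"):
    """Build (but do not send) a PUT asking the server to make `url` versionable.

    Hypothetical helper: the same Link header would be used on a creating
    PUT/POST and on a PUT to an existing LDPR.
    """
    headers = {
        "Link": f'<{VERSIONED_RESOURCE}>; rel="type"',
        "Content-Type": content_type,
    }
    return Request(url, data=body, headers=headers, method="PUT")


req = build_versioned_put("http://localhost:8080/rest/xyz")
print(req.get_method(), req.full_url)
print("Link:", req.get_header("Link"))
```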
A HEAD request on the LDPRv will return a response with a Link header with rel="type" and target http://fedora.info/definitions/fcrepo#VersionedResource, which indicates versioning support, and a Link header with rel="timemap" that points to the URL of the LDPCv/TimeMap.
An OPTIONS response from the LDPCv/TimeMap that contains an "Allow: POST" header indicates that versions can be created by a client.
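The discovery steps above (HEAD on the LDPRv, then OPTIONS on the LDPCv) could be sketched on the client side as follows. The function names are hypothetical, and the parser is deliberately simplified: it assumes each Link header value carries exactly one link, which is how the examples in these notes are written.

```python
VERSIONED_RESOURCE = "http://fedora.info/definitions/fcrepo#VersionedResource"


def parse_link(value):
    """Parse one '<uri>; key="value"; ...' Link header value into (uri, params)."""
    target, _, rest = value.partition(">")
    uri = target.strip().lstrip("<")
    params = {}
    for part in rest.split(";"):
        if "=" in part:
            key, _, val = part.partition("=")
            params[key.strip().lower()] = val.strip().strip('"')
    return uri, params


def discover_versioning(link_values):
    """Return (is_versioned, timemap_uri) from the Link headers of a HEAD response."""
    is_versioned = False
    timemap = None
    for value in link_values:
        uri, params = parse_link(value)
        rels = params.get("rel", "").split()
        if "type" in rels and uri == VERSIONED_RESOURCE:
            is_versioned = True
        if "timemap" in rels:
            timemap = uri
    return is_versioned, timemap


def can_create_versions(allow_header):
    """True if the Allow header of the OPTIONS response on the LDPCv lists POST."""
    return "POST" in [m.strip().upper() for m in allow_header.split(",")]


links = [
    f'<{VERSIONED_RESOURCE}>; rel="type"',
    '<http://localhost:8080/rest/xyz/fcr:versions>; rel="timemap"',
]
print(discover_versioning(links))
print(can_create_versions("GET, HEAD, OPTIONS, POST"))
```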
Note: when creating a new version of the LDPRv, only the single resource itself will be versioned. There is no concept of "tree" snapshots anymore.
A POST request to the LDPCv with an empty body and no "Memento-Datetime" header will cause a new memento of the LDPRv to be created with current date/time.
A POST request to the LDPCv with a "Memento-Datetime" header and no body will create a historic version with the specified date/time from the current state of the LDPRv (or possibly from an empty resource).
A POST request to the LDPCv with header "Memento-Datetime" and a body will create a historic version with the specified body and date/time.
A POST request to the LDPCv with a body and no "Memento-Datetime" header will create a version with the specified body and the current date/time.
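The four POST cases above reduce to two independent choices: which datetime to record and which content to store. A minimal sketch of that decision follows. The function plan_memento and its parameters are hypothetical. For the "Memento-Datetime, no body" case it assumes the current state of the LDPRv is used, which is one reading of the notes rather than a settled decision.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def plan_memento(headers, body, current_state, now=None):
    """Return (memento_datetime, content) for a POST to the LDPCv.

    headers: dict of request headers (exact-case "Memento-Datetime" key assumed)
    body: request body bytes, empty if none was sent
    current_state: serialized current state of the LDPRv
    """
    now = now or datetime.now(timezone.utc)
    requested = headers.get("Memento-Datetime")
    # Header present -> historic version at that time; absent -> current time.
    when = parsedate_to_datetime(requested) if requested else now
    # Body present -> store the body; absent -> copy the LDPRv (assumption).
    content = body if body else current_state
    return when, content


fixed_now = datetime(2017, 9, 18, 14, 58, 26, tzinfo=timezone.utc)
state = b"<> a <http://www.w3.org/ns/ldp#RDFSource> ."
print(plan_memento({}, b"", state, now=fixed_now))
print(plan_memento({"Memento-Datetime": "Thu, 31 May 2007 20:35:00 GMT"}, b"old", state, now=fixed_now))
```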
A GET request to the LDPCv with the "Accept: application/link-format" header will cause the TimeMap to be returned.
A GET request to the LDPCv with no "Accept:" header, or one specifying an RDF format, will result in the LDPCv being returned in RDF format.
The response from the GET will include a "Vary-Post: Memento-Datetime" header to indicate that a client can request that a specific time be associated with a memento when it is created via a POST.
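The content negotiation on the LDPCv described above could look roughly like this on the server side. This is a sketch under assumptions: the function name is made up, text/turtle is an assumed default RDF serialization, and the choice among multiple RDF formats is left out.

```python
LINK_FORMAT = "application/link-format"


def ldpcv_get_headers(accept=None, rdf_default="text/turtle"):
    """Pick response headers for a GET on the LDPCv.

    Returns the TimeMap media type when the client asks for link-format,
    otherwise an RDF representation of the LDPCv (simplified to one default).
    """
    media_types = [
        part.split(";")[0].strip().lower() for part in (accept or "").split(",")
    ]
    content_type = LINK_FORMAT if LINK_FORMAT in media_types else rdf_default
    return {"Content-Type": content_type, "Vary-Post": "Memento-Datetime"}


print(ldpcv_get_headers("application/link-format"))
print(ldpcv_get_headers())
```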
A GET request to the TimeGate Resource (the LDPRv itself) with "Accept-Datetime" header specified will return a 302 response, with a 'Location' header providing the URI of the LDPRm associated with that datetime, or the closest one if there is not an exact match.
example header usage: "Accept-Datetime: Thu, 31 May 2007 20:35:00 GMT"
See: Datetime negotiation algorithm example for Accept-Datetime negotiation details.
We are currently planning to follow Memento Datetime negotiation pattern 1.1, see: section 4.1.1.
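To make the TimeGate behavior above concrete, here is a small sketch of the selection step. It picks the memento with the smallest absolute time difference, matching the "closest one" wording in these notes; the actual negotiation algorithm may differ. The function name and the memento URIs are hypothetical, and the Vary header is included because RFC 7089 expects it on TimeGate responses.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def negotiate_timegate(accept_datetime, mementos):
    """Return (status, headers) for a GET on the LDPRv with Accept-Datetime.

    mementos: dict mapping memento URI -> timezone-aware datetime.
    """
    if not mementos:
        raise ValueError("no mementos to negotiate over")
    wanted = parsedate_to_datetime(accept_datetime)
    closest = min(mementos, key=lambda uri: abs(mementos[uri] - wanted))
    return 302, {"Location": closest, "Vary": "accept-datetime"}


mementos = {
    "http://localhost:8080/rest/xyz/fcr:versions/12344": datetime(2007, 5, 31, 20, 0, tzinfo=timezone.utc),
    "http://localhost:8080/rest/xyz/fcr:versions/12345": datetime(2007, 6, 2, 9, 0, tzinfo=timezone.utc),
}
print(negotiate_timegate("Thu, 31 May 2007 20:35:00 GMT", mementos))
```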
A GET request to the LDPRm/Memento (if the LDPRm/Memento has its own URI) will result in the memento being returned if it exists.
Any response from the LDPRv will include link relation headers of type "timegate" (referring to the LDPRv), "original" (also referring to this LDPRv), and "timemap" (referencing the URI of the LDPCv).
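The link relations described above could be produced with something like the following. The helper is hypothetical, and it assumes the LDPCv lives at the LDPRv URI plus "/fcr:versions", as in the example response later in these notes.

```python
def memento_link_headers(ldprv_uri):
    """Build the Link header values an LDPRv response would carry.

    The LDPRv acts as both the original resource and its own TimeGate;
    the TimeMap is the LDPCv (assumed here to be <ldprv>/fcr:versions).
    """
    timemap_uri = ldprv_uri.rstrip("/") + "/fcr:versions"
    return [
        f'<{ldprv_uri}>; rel="original timegate"',
        f'<{timemap_uri}>; rel="timemap"',
    ]


for value in memento_link_headers("http://localhost:8080/rest/xyz"):
    print("Link:", value)
```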
A DELETE request to LDPRm/Memento will result in that memento being deleted.
Note: This interaction still needs to be ironed out as this is currently under discussion in Spec Issue 217
A PUT request to the LDPRv/TimeGate with a header (it can't be Content-Location, but something like it) pointing to the LDPRm/Memento URI would indicate the version to restore.
Memento has very little to say about security. Mainly it leaves it up to the server how access to previous versions works (most likely we want a version to behave the same way the resource behaved at the point of the snapshot, but that still needs to be decided), and it covers which Memento headers to expose during authentication:
https://tools.ietf.org/html/rfc7089#section-7
There are three separate entities when looking at ACLs.
This is a slightly modified version of how the SOLID WebAC recommends finding the ACL.
Given this, the following is then true:
If we are preserving ACLs as part of a version rather than using the original resource's, then for resources without an assigned or inherited ACL Fedora would need to somehow record the Default ACL at the time of the snapshot. See: 5.3 Inheritance and Default ACLs
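One way to picture the "record the ACL at snapshot time" idea is sketched below. It uses plain WebAC-style inheritance (the nearest ACL on the resource or an ancestor, else the default), not the modified lookup these notes refer to. All names, paths, and the dictionary-based storage are assumptions for illustration.

```python
def effective_acl(path, acl_links, default_acl):
    """Walk from `path` up through its ancestors; return the first ACL found.

    acl_links: dict mapping resource path -> ACL path.
    Falls back to `default_acl` when nothing is assigned or inherited.
    """
    current = path.rstrip("/")
    while current:
        if current in acl_links:
            return acl_links[current]
        current = current.rpartition("/")[0]
    return default_acl


def snapshot_acl(path, acl_links, default_acl):
    """Record which ACL governed `path` at the moment a memento is created."""
    return {"resource": path, "acl": effective_acl(path, acl_links, default_acl)}


acls = {"/rest/xyz": "/rest/acls/xyz"}
print(snapshot_acl("/rest/xyz/abc", acls, "/rest/acls/default"))
print(snapshot_acl("/rest/other", acls, "/rest/acls/default"))
```

Recording the result explicitly means a later change to the Default ACL would not silently change who can read an old memento, which is the concern raised above.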
These are just some notes written down largely to think through things - there might be a better way of doing this type of work.
An alternate design might be to serialize each memento into a binary resource and have that contained by the LDPCv. The key thing is to figure out how you store the archival date of each memento and retrieve that info to produce the TimeMap. Should it be in the memento itself? In the LDPCv? If so, how is that stored?
Here's an example of an LDPRv. What signifies that it is an LDPRv is that a request on the LDPRv returns Memento-related 'Link' headers in the response (and possibly a VersionedResource (spec issue 233) header as well). These 'Link' headers point to the TimeMap and TimeGate for this resource. The current behavior is that a 'hasVersions' triple is returned when an LDPR is requested.
$ curl http://localhost:8080/rest/xyz
HTTP/1.1 200 OK
Date: Mon, 18 Sep 2017 14:58:26 GMT
Link: <http://localhost:8080/rest/xyz>; rel="original timegate"
Link: <http://localhost:8080/rest/xyz/fcr:versions>; rel="timemap"; from="Fri, 8 Sep 2017 21:35:19 GMT"; until="Mon, 23 Sep 2017 15:41:04 GMT";

@prefix premis: <http://www.loc.gov/premis/rdf/v1#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix fedora: <http://fedora.info/definitions/v4/repository#> .
@prefix ldp: <http://www.w3.org/ns/ldp#> .

<http://localhost:8080/rest/xyz>
    rdf:type fedora:Container ;
    rdf:type fedora:Resource ;
    rdf:type ldp:RDFSource ;
    rdf:type ldp:Container ;
    fedora:lastModifiedBy "bypassAdmin"^^<http://www.w3.org/2001/XMLSchema#string> ;
    fedora:createdBy "bypassAdmin"^^<http://www.w3.org/2001/XMLSchema#string> ;
    fedora:lastModified "2017-09-18T20:01:33.501Z"^^<http://www.w3.org/2001/XMLSchema#dateTime> ;
    fedora:created "2017-09-15T21:19:49.731Z"^^<http://www.w3.org/2001/XMLSchema#dateTime> ;
    fedora:writable "true"^^<http://www.w3.org/2001/XMLSchema#boolean> ;
    fedora:hasParent <http://localhost:8080/rest> ;
    ldp:contains <http://localhost:8080/rest/xyz/abc> ;
Below were just a few initial thoughts on what the internal structure of a LDPCv & LDPRm might look like.
But first, here's a sample TimeMap in 'application/link-format' MIME type straight from the Memento Spec:
HTTP/1.1 200 OK
Date: Thu, 21 Jan 2010 00:06:50 GMT
Server: Apache
Content-Length: 4883
Content-Type: application/link-format
Connection: close

<http://a.example.org>;rel="original",
<http://arxiv.example.net/timemap/http://a.example.org>
  ; rel="self";type="application/link-format"
  ; from="Tue, 20 Jun 2000 18:02:59 GMT"
  ; until="Wed, 09 Apr 2008 20:30:51 GMT",
<http://arxiv.example.net/timegate/http://a.example.org>
  ; rel="timegate",
<http://arxiv.example.net/web/20000620180259/http://a.example.org>
  ; rel="first memento";datetime="Tue, 20 Jun 2000 18:02:59 GMT"
  ; license="http://creativecommons.org/publicdomain/zero/1.0/",
<http://arxiv.example.net/web/20091027204954/http://a.example.org>
  ; rel="last memento";datetime="Tue, 27 Oct 2009 20:49:54 GMT"
  ; license="http://creativecommons.org/publicdomain/zero/1.0/",
<http://arxiv.example.net/web/20000621011731/http://a.example.org>
  ; rel="memento";datetime="Wed, 21 Jun 2000 01:17:31 GMT"
  ; license="http://creativecommons.org/publicdomain/zero/1.0/",
<http://arxiv.example.net/web/20000621044156/http://a.example.org>
  ; rel="memento";datetime="Wed, 21 Jun 2000 04:41:56 GMT"
  ; license="http://creativecommons.org/publicdomain/zero/1.0/",
...
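For comparison with the sample above, a TimeMap body for a Fedora LDPCv might be assembled along these lines. This is only a sketch: build_timemap is a hypothetical helper, the input is assumed to be a list of (memento URI, UTC datetime) pairs however they end up being stored, and the LDPRv is assumed to act as both original and TimeGate as described earlier.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def build_timemap(original, timemap, mementos):
    """Serialize a TimeMap in application/link-format.

    mementos: list of (uri, datetime) pairs; datetimes must be aware and in UTC.
    """
    ordered = sorted(mementos, key=lambda pair: pair[1])
    if not ordered:
        raise ValueError("this sketch needs at least one memento")
    first = format_datetime(ordered[0][1], usegmt=True)
    last = format_datetime(ordered[-1][1], usegmt=True)
    entries = [
        f'<{original}>; rel="original timegate"',
        f'<{timemap}>; rel="self"; type="application/link-format"'
        f'; from="{first}"; until="{last}"',
    ]
    for index, (uri, when) in enumerate(ordered):
        rels = []
        if index == 0:
            rels.append("first")
        if index == len(ordered) - 1:
            rels.append("last")
        rels.append("memento")
        rel = " ".join(rels)
        stamp = format_datetime(when, usegmt=True)
        entries.append(f'<{uri}>; rel="{rel}"; datetime="{stamp}"')
    return ",\n".join(entries)


base = "http://localhost:8080/rest/xyz"
tm = build_timemap(
    base,
    base + "/fcr:versions",
    [
        (base + "/fcr:versions/12347", datetime(2017, 9, 11, 15, 41, 4, tzinfo=timezone.utc)),
        (base + "/fcr:versions/12344", datetime(2017, 9, 8, 21, 35, 19, tzinfo=timezone.utc)),
    ],
)
print(tm)
```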
A possible LDPCv is below. One issue to work through is that the LDPCv might have an ACL that applies to it, and then perhaps there is a separate ACL that applies to all the mementos. How do you indicate that other ACL? Check out the section on the algorithm for finding an ACL. You may end up deciding on a simpler authorization setup.
Note that there is no memento ontology - we may want to make one up...?
@prefix acl: <http://www.w3.org/ns/auth/acl#> .
@prefix iana: <http://www.iana.org/assignments/relation/> .
@prefix ldp: <http://www.w3.org/ns/ldp#> .
@prefix memento: <http://example.com/memento#> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix time: <http://www.w3.org/2006/time#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

</path/to/resource/xyz/fcr:versions>
    a ldp:Container ;
    acl:hasAccessControl </path/to/acls> ;  # this is for the LDPCv itself, for the TimeMap retrieval
    prov:startedAtTime "2017-09-08T21:35:19Z"^^xsd:dateTime ;  # first memento
    prov:endedAtTime "2017-09-11T15:41:04Z"^^xsd:dateTime ;  # last memento
    memento:hasAccessControl </path/to/acls> ;
    memento:hasOriginalResource </path/to/orig/resource/xyz> ;  # how else can we represent this? is this a given based on url?
    memento:hasTimeGate </path/to/orig/resource/xyz> ;  # how else can we represent this? is this a given based on url?
    iana:first </path/to/resource/xyz/fcr:versions/12344> ;
    iana:last </path/to/resource/xyz/fcr:versions/12347> ;
    ldp:contains </path/to/resource/xyz/fcr:versions/12344>,
        </path/to/resource/xyz/fcr:versions/12345>,
        </path/to/resource/xyz/fcr:versions/12347>,
        </path/to/resource/xyz/fcr:versions/12346> .
The TimeMap needs to have the archival time in it, so that needs to be stored somewhere. One thought was to store that data in the memento (as a hidden property that the user doesn't know about). Not sure if that's a good idea or not. Also, should the memento returned contain information about the next / prev memento? Does it still qualify as a memento if the resource has that data in it?
@prefix acl: <http://www.w3.org/ns/auth/acl#> .
@prefix iana: <http://www.iana.org/assignments/relation/> .
@prefix ldp: <http://www.w3.org/ns/ldp#> .
@prefix memento: <http://example.com/memento#> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix time: <http://www.w3.org/2006/time#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

</path/to/resource/xyz/fcr:versions/12345>
    a ldp:RDFSource , prov:InstantaneousEvent ;
    prov:atTime "2012-04-30T20:40:40"^^xsd:dateTime ;
    memento:hasTimeGate </path/to/orig/resource/xyz> ;  # how else can we represent this? is this a given based on url?
    memento:hasOriginalResource </path/to/orig/resource/xyz> ;  # how else can we represent this? is this a given based on url?
    iana:next </path/to/xyz/fcr:versions/12346> ;  # to memento
    iana:prev </path/to/xyz/fcr:versions/12344> ;  # to memento
    ... triples from original resource at the time of versioning ...

Or, if one decided to put the memento one more layer down to keep it totally separate, it might look like this (as in, this LDPRm really just wraps the actual resource). (I'm not clear on how a binary and its metadata would be represented.)

    ldp:contains </path/to/xyz/fcr:versions/12345/version> ,
        </path/to/xyz/fcr:versions/12345/version/fcr:metadata> ;
Versioning/Authorization Use-Cases
It seems to be difficult to determine the identity of the "parent" of a resource via ldp:contains with versioning.
The current Fedora implementation creates (and links to) new LDPRs when a non-empty LDPRv is versioned. These resources are neither LDPRvs nor LDPRms.
For example, consider a container A and A/B where A ldp:contains B.
Creating a version v1 of <A> creates a resource <A/fcr:versions/v1>. This is essentially an LDPRm, and it contains the triple <A/fcr:versions/v1> ldp:contains <A/fcr:versions/v1/B>.
Issues are:
All of this may be fine, but it lies outside of any specification. Essentially, "when you create a new version of a resource, that version now points to new and different things that the resource didn't previously point to, and that have nothing to do with versioning."
If some of this is crazy or overkill, please don't do it. I'm finding it harder than I expected to create interfaces and classes without actually writing any code.
(POST w/o body) → LDPCv → (copy) LDPRv → LDPRm
Questions:
(POST w body) → LDPCv → (compare LDP subtype) LDPRv → LDPRm
General interaction questions.