Date

Attendees

Goals

  • ARK spec suffixes and signposting

Discussion items

Time | Item | Who | Notes

announcements


upcoming meetings, calls for papers, submission deadlines
The iPRES 2021 paper/poster submission deadline has been extended to May 31. iPRES will be a hybrid conference for both attendees and presenters. http://ipres2021.ac.cn/dct/page/1

Suffix normalization. Currently, the spec says the final normalization rule is:

R: If there are any components with a period on the left and a slash on the right, either the component and the preceding period must be moved to the end of the Name part or the ARK must be thrown out as malformed.

I think we all know why, but I’d like to confirm our collective understanding. For example, let’s say that this thing

ark:12345/b6cd7/8fg.pdf

contains something called “km”. What is the contained thing’s ARK? It is tempting but improper to say

(1) ark:12345/b6cd7/8fg.pdf/km

but rule R says this should be rewritten (forgiving the tempted end user who is guessing that this might lead to a contained thing) or rejected (forgiving the end resolver for having a simple parser). If rewritten, the result would be

(2) ark:12345/b6cd7/8fg/km.pdf

The rule was put in to simplify ARK parsing (all suffixes come at the end, all containment comes before all suffixes) and to eliminate confusing questions, such as

Does this imply that “pdf/km” is a suffix?

If form (2) suggests a pdf format, what format does “.pdf/km” suggest?
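
To make rule R concrete, here is a minimal Python sketch of both options the spec currently allows (rewrite or reject). The names apply_rule_r and MalformedArk are hypothetical, not from the spec. The sketch also assumes that everything from the first period onward in a non-final path segment counts as suffix material to be moved, which is one reading of the rule.

```python
class MalformedArk(ValueError):
    """Raised when an ARK violates rule R and rewriting is not allowed."""


def apply_rule_r(ark: str, rewrite: bool = False) -> str:
    """Apply rule R: a suffix (".xyz") must not be followed by a slash.

    With rewrite=False the offending ARK is rejected; with rewrite=True
    the misplaced suffixes are moved to the end of the Name part.
    """
    if not ark.startswith("ark:"):
        raise MalformedArk(f"not an ARK: {ark!r}")
    # Accept both the "ark:12345/..." and "ark:/12345/..." forms.
    label = "ark:/" if ark.startswith("ark:/") else "ark:"
    naan, slash, name = ark[len(label):].partition("/")
    if not slash:
        return ark  # no Name part, nothing to check

    segments = name.split("/")
    kept, moved = [], []
    # Only non-final segments can have a suffix with a slash on its right.
    for segment in segments[:-1]:
        base, dot, suffixes = segment.partition(".")
        if dot:
            moved.append(dot + suffixes)
        kept.append(base)

    if not moved:
        return ark  # already conforms to rule R
    if not rewrite:
        raise MalformedArk(f"suffix followed by '/': {ark!r}")

    kept.append(segments[-1])
    new_name = "/".join(kept) + "".join(moved)
    return f"{label}{naan}/{new_name}"


# Form (1) rewritten into form (2):
print(apply_rule_r("ark:12345/b6cd7/8fg.pdf/km", rewrite=True))
# → ark:12345/b6cd7/8fg/km.pdf
```

With the default rewrite=False, the same input raises MalformedArk, which corresponds to the "thrown out as malformed" branch of the rule.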


TC: the reserved nature of the period '.' has always seemed problematic; rejecting seems better than rewriting

KH: rewriting may have unintended consequences

GJ: I would never rewrite someone's id; there's no way to determine the correctness of someone's id; maybe we should encourage the / and . to denote containment and/or variants without requiring it

JK: what if it's the end user who rewrote-by-guessing the provider's ARK but got it wrong?

DV: should play well with URL
CM: this seems confusing
All: agree with rejecting instead of rewriting

ACTION: drop the rewriting option for this normalization step

TC: could be done via "did you mean..." with a 404 page


Signposting revisited. This was first broached in the 2020-03-16 meeting, discussed in the 2020-12-21 Technical WG Agenda and Notes, and raised recently as an N2T github issue.

Currently, the ARK spec is silent on conveying URLs/ids for related information. Should it instead make a recommendation?

As has been pointed out, there are “signposting” conventions for this using HTTP response headers. So, with a returned resource, the end resolver could return “here are some links to various kinds of metadata about this resource”. Example:

Link: </ark:/12345/x9876?info>; rel="describedby"; type="application/ld+json"

Conversely, with returned metadata the resolver could return “here are some links to various forms of the thing that this metadata describes”.
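
As a rough illustration of how a resolver might assemble such headers, here is a small Python sketch. It follows the RFC 8288 Link header syntax, where the target URI is in angle brackets and each parameter is separated by a semicolon. The helper names format_link and signposting_header are hypothetical, and the "cite-as" target URL is only an illustrative guess at what a persistent form of the example ARK could look like.

```python
from typing import Iterable, Optional, Tuple

Link = Tuple[str, str, Optional[str]]  # (target URI, rel, optional media type)


def format_link(target: str, rel: str, media_type: Optional[str] = None) -> str:
    """Format one link-value in RFC 8288 syntax."""
    parts = [f"<{target}>", f'rel="{rel}"']
    if media_type:
        parts.append(f'type="{media_type}"')
    return "; ".join(parts)


def signposting_header(links: Iterable[Link]) -> str:
    """Join several link-values into one Link header value."""
    return ", ".join(format_link(*link) for link in links)


# Links a resolver might return alongside the resource itself.
links = [
    ("/ark:/12345/x9876?info", "describedby", "application/ld+json"),
    # Illustrative assumption: a persistent URL to cite for this resource.
    ("https://n2t.net/ark:/12345/x9876", "cite-as", None),
]
print("Link: " + signposting_header(links))
```

For the converse case (metadata returned, pointing back at the thing described), the same helper could be called with rel="describes", which is also a registered link relation.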

Btw, how do people feel about using “cite-as”?

  • propose adding it to example session in ARK spec, but leaving real spec to post-RFC tech work

TC: this is a great idea; should we recommend the RFC 8288 controlled vocabulary?

GJ: this isn't purely in scope for the ARK spec

DV: yes, but signposting could be added as an implementation guideline

All: agreed


Post-RFC tech work – where?

  • github? lyrasis wiki? arks.org?

JK: git for collaborative dev?
DV: a couple of example sites we might follow: the simple approach used by ESIP science https://github.com/ESIPFed/science-on-schema.org/
DV: at the other extreme (of learning curve and bells and whistles), there's https://jupyterbook.org/intro.html
TC: I'm ok with github or jupyter
All: github looks promising; will discuss options further

Action items

  • John Kunze: drop the rewriting option for this normalization step in the ARK spec