What are ARKs?

ARKs (Archival Resource Keys) are high-functioning identifiers that lead you to things and to descriptions of those things. For example, this ARK,

     https://n2t.net/ark:/67531/metadc107835/

gets you to a dissertation, and adding a '?' on the end of the ARK gets you to its description:

     https://n2t.net/ark:/67531/metadc107835/?

What's an identifier?

On the internet, an identifier is a URL, or part of a URL. For example, this basic ARK identifier,

                            ark:/12148/btv1b8449691v/f29 

appears inside two different URLs (Uniform Resource Locators, also known as web links or web addresses):

     https://gallica.bnf.fr/ark:/12148/btv1b8449691v/f29

            https://n2t.net/ark:/12148/btv1b8449691v/f29
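
As a rough illustration of how the same identifier can be pulled out of either URL, here is a minimal Python sketch. The function name `extract_ark` is hypothetical, and the sketch assumes the ARK sits in the URL path and begins with the literal label `ark:`.

```python
from typing import Optional
from urllib.parse import urlsplit


def extract_ark(url: str) -> Optional[str]:
    """Return the ARK found in the path of a URL, or None if there is none."""
    # Only look at the path, so the hostname and any port are ignored.
    path = urlsplit(url).path
    start = path.find("ark:")
    return path[start:] if start != -1 else None


for link in (
    "https://gallica.bnf.fr/ark:/12148/btv1b8449691v/f29",
    "https://n2t.net/ark:/12148/btv1b8449691v/f29",
):
    print(extract_ark(link))
```

Both links yield the same identifier, which is the point: the hostname in front can differ while the ARK itself stays the same.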

ARKs are especially good at being persistent identifiers.

What's a persistent identifier?

The average lifetime of a URL has been said to be 44 days. At the end of its life, a URL link breaks, which means it gives you the dreaded "404 Not Found" error. Irritating as that is – and most of us have seen it happen – it's a disaster for libraries, archives, museums, and other memory organizations. A persistent identifier (sometimes abbreviated PID) is an identifier that in principle keeps working far into the future, even as things move between websites. Normally when things move, everyone using the old links would need to update those links to use the new URLs. This is hard, error-prone, and expensive, and that's where identifier resolvers come in.

What's a resolver?

A resolver is a website that is especially good at forwarding an incoming identifier (the one originally advertised to users) to another website that's currently best able to deal with it. Technically, forwarding is called resolution, one step of which is redirection. To make it work, the hostname of the resolver must be carefully chosen so it never has to be changed. Memory organizations, some of them centuries old, tend to have website hostnames that are well-suited to be resolvers (eg, bnf.fr). Some well-known, younger resolvers are n2t.net, identifiers.org, doi.org, handle.net, and purl.org.
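
To make the idea concrete, here is a minimal sketch of the lookup at the heart of a resolver. It is not how any particular resolver is implemented. The table contents, the `example.org` hostnames, the function name `resolve`, and the choice of HTTP status 302 for the redirect are all assumptions made for illustration.

```python
from typing import Dict, Optional, Tuple

# Hypothetical forwarding table: advertised identifier -> current location.
forwarding_table: Dict[str, str] = {
    "ark:/99999/12345": "https://old-host.example.org/items/12345",
}


def resolve(identifier: str, table: Dict[str, str]) -> Tuple[int, Optional[str]]:
    """Return an (HTTP status, redirect target) pair for an identifier."""
    target = table.get(identifier)
    if target is None:
        return 404, None  # unknown identifier: nothing to forward to
    return 302, target    # known identifier: redirect to its current home


print(resolve("ark:/99999/12345", forwarding_table))

# When the thing moves, only the table entry changes; the advertised
# identifier, and every link that uses it, stays the same.
forwarding_table["ark:/99999/12345"] = "https://new-host.example.org/objects/12345"
print(resolve("ark:/99999/12345", forwarding_table))
```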

How does the ARK differ from other identifier types, like DOI, Handle, PURL, and URN?

Here's the short answer. These are all major types of persistent identifiers. Among them, ARKs are the only mainstream, non-siloed, non-paywalled identifiers that you can register to use in about 24 hours. Over 500 registered organizations have created an estimated 3.2 billion ARKs in the world, and no one has ever paid for the right to create them. That is not to say that keeping identifiers persistent is free of the usual costs of content management, hosting, monitoring, etc.

Is there a longer answer to that question?

First, let's dispense with what ARKs, DOIs, Handles, PURLs, and URNs have in common. Their resolvers all forward using ordinary redirection and they have a similar structure. In these examples,

 https://n2t.net/ark:/99999/12345

   https://doi.org/10.99999/12345

https://handle.net/10.99999/12345

           https://purl.org/12345

      https://???/urn:99999:12345

they all start with the protocol (https://) plus a hostname, followed by the Name Assigning Authority (99999, 10.99999, or purl.org), which is the organization that created the identifiers. Finally there's the name, or local identifier, that it assigned (12345). Here are some other things these identifier types have in common.

  • All identifiers fail to stop the major causes of broken links: loss of funding, natural disaster, war, deliberate removal, human error, and provider neglect.
  • All identifier types make the end provider responsible for the labor of keeping redirection URLs updated.
  • All identifier types can give access to any kind of thing, whether digital, physical, abstract, etc.
  • All identified content is subject to change on future visits.
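
The shared structure described above (protocol, hostname, Name Assigning Authority, name) can be sketched in a few lines of Python. This is only an illustration built around the simplified example URLs in this answer. The function name `split_identifier` is hypothetical, real identifiers can have longer paths, and the URN example is left out because its hostname is not given.

```python
from typing import Dict
from urllib.parse import urlsplit


def split_identifier(url: str) -> Dict[str, str]:
    """Split one of the simplified example URLs into its four parts."""
    parts = urlsplit(url)
    path = parts.path.lstrip("/")
    # ARKs carry a label saying what kind of identifier they are.
    if path.startswith("ark:"):
        path = path[len("ark:"):].lstrip("/")
    if "/" in path:
        authority, name = path.split("/", 1)
    else:
        # PURL-style example: the hostname doubles as the authority.
        authority, name = parts.hostname or "", path
    return {
        "protocol": parts.scheme,
        "hostname": parts.hostname or "",
        "authority": authority,
        "name": name,
    }


for example in (
    "https://n2t.net/ark:/99999/12345",
    "https://doi.org/10.99999/12345",
    "https://handle.net/10.99999/12345",
    "https://purl.org/12345",
):
    print(split_identifier(example))
```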

So how do these identifiers differ? Here's a short list.

  1. When (not if, since all things pass) the https:// protocol and the hostname cease to exist, only ARKs and URNs will still indicate the kind of identifier that remains.
  2. For DOIs, Handles, and PURLs, you are required to use their respective resolvers. ARKs and URNs permit you to use your own resolver.
  3. To create DOIs and Handles, you are required to pay a membership fee and, for DOIs, per-DOI charges. There are no fees for ARKs, PURLs, and URNs.
  4. Although you can use your own or a vendor resolver for your ARKs and URNs, all ARKs can be resolved via n2t.net, making it the closest thing to a "global ARK resolver".
  5. For URNs there is no single global resolver. In order to register to create URNs, you must apply for a URN namespace.

When should I use ARKs compared to other identifier types?

There is nothing inherent in ARKs, DOIs, Handles, PURLs, or URNs that makes them more or less suitable to identify any kind of thing in any field, domain, or sector. In that sense they are all equally suitable.

Where they differ is in the nature of their services, sociology, and buzz.

It is hard to generalize how people use these identifiers. DOIs, for example, used to be known primarily as identifiers for scientific and scholarly publications, with a mature community and service offering around "Crossref DOIs", but newer kinds of DOIs, such as those from DataCite and EIDR, are changing the nature of the DOI.

Don't identifier types differ in metadata flexibility, content negotiation, inflections, and suffix passthrough?

Only one resolver, n2t.net, supports all of these features, and it does so for any identifier stored with appropriate metadata. Contrary to popular belief, identifiers don't do anything – it's their resolvers that do or don't support these features. For example, suffix passthrough is a feature supported by n2t.net and purl.org ("partial redirect"), but not by doi.org or handle.net.
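
A simplified sketch may help show what suffix passthrough means in practice. It assumes the resolver looks for the longest registered identifier that is a prefix of the incoming one, then appends the leftover suffix to the registered target. The function name, the table, and the `example.org` target are hypothetical, and real resolvers may match prefixes differently.

```python
from typing import Dict, Optional


def resolve_with_passthrough(identifier: str, table: Dict[str, str]) -> Optional[str]:
    """Forward an identifier, passing any unregistered suffix through."""
    candidate = identifier
    while True:
        if candidate in table:
            # Whatever follows the registered part is carried over as-is.
            suffix = identifier[len(candidate):]
            return table[candidate] + suffix
        if "/" not in candidate:
            return None  # no registered prefix was found
        # Drop the last path component and try the shorter prefix.
        candidate = candidate.rsplit("/", 1)[0]


# Hypothetical table with a single registered identifier.
table = {"ark:/12345/bcd": "https://example.org/objects/bcd"}
print(resolve_with_passthrough("ark:/12345/bcd/chapter3/page2.pdf", table))
```

With passthrough, one registered identifier can stand in for an entire hierarchy of sub-resources, so the provider only has to keep one forwarding URL up to date.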

By metadata flexibility is meant the ability to store any metadata you want, including repeated elements, such as multiple authors and forwarding URLs, or no metadata at all. N2T has full metadata flexibility, while Crossref and DataCite have specific requirements (eg, the DataCite schema) to create their DOIs.

Content negotiation is a way for software to request descriptions of things that are not already in formats that might represent descriptions. Fortunately, to request descriptions without restriction, both humans and software can use inflections, exemplified by the '?' in the first answer. Backed by the right metadata, N2T is one of the few resolvers that supports both.
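
The two ways of asking for a description can be contrasted in a short sketch. It only builds the requests and does not send them. The function names are hypothetical, and the media type `application/json` is an assumed example, since which formats a given resolver honors is not stated here.

```python
from urllib.request import Request


def inflection_request(ark_url: str) -> Request:
    """Ask for a description by appending the '?' inflection to the URL."""
    return Request(ark_url + "?")


def negotiated_request(ark_url: str, media_type: str) -> Request:
    """Ask for a description via HTTP content negotiation (Accept header)."""
    return Request(ark_url, headers={"Accept": media_type})


ark_url = "https://n2t.net/ark:/67531/metadc107835/"

by_inflection = inflection_request(ark_url)
print(by_inflection.full_url)

# "application/json" is an assumed media type, used only for illustration.
by_negotiation = negotiated_request(ark_url, "application/json")
print(by_negotiation.full_url, by_negotiation.get_header("Accept"))
```

The inflection is visible in the URL itself, so a person can type it into a browser, whereas content negotiation travels in a request header that typically only software sets.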

What exactly is N2T (n2t.net)?

N2T.net is a resolver originally built for ARKs. N2T stands for Name-to-Thing because strong values of openness prevented it from becoming just another DOI/Handle/PURL-type silo. As a result, the "global ARK resolver" also resolves DOIs, Handles, PURLs, and URNs, along with 600 other kinds of identifier.

This counter-silo principle is also found in micro-service tools such as noid, which was built for ARKs but is routinely used by organizations that mint ARKs and those that mint Handles.

I've heard of ORCIDs and UUIDs – where do they fit in?

Those are special kinds of persistent identifiers. ORCIDs identify researchers, and they link to research works using ARKs, DOIs, etc. ORCIDs look like

     https://orcid.org/0000-0001-7604-8041

UUIDs are globally unique, 36-character strings that are easy for software to generate but only become usable as web addresses when made part of a URL, for example, in this ARK:

           https://n2t.net/ark:/65665/3c2e39526-e0c3-41ae-be4f-07558a9458eb
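
As a small sketch of how software might generate a UUID and make it part of an ARK URL, consider the following. The function name `ark_url_from_uuid` is hypothetical, and the number 99999 is borrowed from the placeholder examples earlier on this page, not from a real organization.

```python
import uuid


def ark_url_from_uuid(naan: str, value: uuid.UUID, resolver: str = "https://n2t.net") -> str:
    """Embed a UUID as the name part of an ARK, behind a resolver hostname."""
    return f"{resolver}/ark:/{naan}/{value}"


# uuid4() generates a random UUID; 99999 is a placeholder, not a real assignment.
new_id = uuid.uuid4()
print(ark_url_from_uuid("99999", new_id))
```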

