...

To anything digital, physical, or abstract. That can include things that don't yet exist but to which you need to refer from objects that you're in the process of creating or planning, such as a link from a draft article to a dataset under preparation, or a link from an archived digital letter to a planned finding aid.

One caution is that you should generally assign ARKs to things that you own, control, or manage. Assigning ARKs to things you don't control is discouraged because such identifiers tend to be fragile.

...

You are free to create ARK strings as you wish, provided you use only digits, upper- and lower-case letters (ASCII, no diacritics), and the following characters:

= ~ * + @ _ $ . /

The last two characters are reserved in the event you wish to disclose ARK relationships. A unique feature of ARKs is that hyphens ('-') may appear but are identity inert, meaning that strings that differ only by hyphens are considered identical; for example, these strings

ark:/12345/141e86dc-d396-4e59-bbc2-4c3bf5326152

ark:/12345/141e86dcd3964e59bbc24c3bf5326152

identify the same thing. The reason for this feature is that text formatting processes out in the world routinely introduce extra hyphens into identifiers, breaking links to any server that treats hyphens as significant.
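The character rules and the hyphen rule above can be illustrated with a minimal Python sketch. The function names are hypothetical, and the allowed-character check covers only the characters listed in this FAQ, applied to the part of the string after the `ark:` label.

```python
import re

# Characters this FAQ lists as permitted: ASCII digits and letters plus
# = ~ * + @ _ $ . / ; hyphens may also appear but carry no identity.
_ALLOWED = re.compile(r"^[0-9A-Za-z=~*+@_$./-]+$")


def strip_label(ark: str) -> str:
    """Drop a leading 'ark:/' or 'ark:' label, if present."""
    for prefix in ("ark:/", "ark:"):
        if ark.startswith(prefix):
            return ark[len(prefix):]
    return ark


def normalize_ark(ark: str) -> str:
    """Remove identity-inert hyphens so equivalent ARKs compare equal."""
    return ark.replace("-", "")


def same_ark(a: str, b: str) -> bool:
    """True if the two strings differ at most by hyphens."""
    return normalize_ark(a) == normalize_ark(b)


def has_only_allowed_chars(ark: str) -> bool:
    """Check the part after the label against the character list above."""
    return bool(_ALLOWED.match(strip_label(ark)))
```

A resolver that stores and looks up ARKs in normalized form will not break when a text formatter inserts a hyphen into a published link.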

Regarding assignment strategy, it is common to leverage legacy identifiers. For example, a museum moth specimen number cd456_f987 might be advertised under the ARK ark:/12345/cd456_f987. Some legacy identifiers may need to be altered in view of ARK character restrictions.

The second common strategy is to make up entirely new strings for your ARKs. In this case it is important to consider whether to make them opaque or non-opaque (or a bit of both). 

What are opaque identifiers?

Persistent identifier strings are typically opaque, deliberately revealing little about what they're assigned to, because non-opaque identifiers do not age or travel well. Organization names are notoriously transient, which is why NAANs are opaque numbers. As titles and dates are corrected and word meanings evolve (eg, innocent older acronyms may become offensive or infringing), strings meant to be persistent can become confusing or politically challenging. Generating and assigning completely opaque strings comes with risks too; for example, numbers assigned sequentially reveal timing information, and strings containing letters can unintentionally spell words.

Example strings with a range of opacity

  non-opaque   Netscape Permanent Archive             Gay_Divorcee_1934_April_1   Name-to-Thing Resolver
  opaque-ish   x0001, x0002, ..., x9998               GD/1934/04/01               n2t.net
  opaquer      141e86dc-d396-4e59-bbc2-4c3bf5326152   19340401                    n2t
  opaquest     141e86dcd3964e59bbc24c3bf5326152       h8k74926g                   12148

ARKs are not required to be opaque, but it is recommended that at least the base object name (the ARK minus any hostname or qualifier) be made opaque, since it tends to be the main focus of persistence efforts. If any qualifier strings follow that name, it may be less important that they be opaque. To help decide your approach to opacity, you may wish to consider compatibility with legacy identifiers and ease of string generation and transcription (eg, brevity, check digits). New strings can be created (minted) with date/time, UUID, and number generators, as well as Noid (Nice Opaque Identifiers) minters.
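As an illustration of minting, here is a minimal Python sketch of two of the generator styles mentioned above: a UUID-based name and a random name drawn from a vowel-free alphabet. The function names, the default length, and the choice of alphabet are assumptions made for this example; this is not the Noid tool and it computes no check digit.

```python
import secrets
import uuid

# Digits plus consonants only. Leaving out vowels, 'y', and 'l' is an
# assumption of this sketch: it makes accidental words unlikely and
# avoids confusing 'l' with '1'.
ALPHABET = "0123456789bcdfghjkmnpqrstvwxz"


def mint_random(length: int = 10) -> str:
    """Mint an opaque name from cryptographically strong random choices."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def mint_from_uuid() -> str:
    """Mint an opaque name from a random UUID, written without hyphens.

    Hex digits include 'a' and 'e', so words such as 'dead' can appear.
    """
    return uuid.uuid4().hex


def make_ark(naan: str, name: str) -> str:
    """Combine a NAAN and a minted name into an ARK string."""
    return f"ark:/{naan}/{name}"
```

Random minting does not by itself guarantee uniqueness, so a real minter would also record each issued string and retry on a collision.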

Opaque strings are "mute" and therefore challenging to manage, which is why ARKs were designed to be "talking" identifiers. This means that if there's metadata, an ARK that comes in to your server with the '?' inflection should be able to talk about itself.

How do I serve, resolve, and advertise my ARKs?

First, decide what the user experience of accessing your ARKs will be, for example, a spreadsheet file, a PDF, an image, a landing page filled with formatted metadata and a range of choices, etc. Whichever you choose, plan for your server to be able to respond with metadata if your ARK should arrive with a '?' inflection after it.

Otherwise, serving ARKs is like serving URLs. Normally incoming URL strings address (get mapped to) content that your web server returns. If your server is ARK-aware, incoming ARKs (expressed as URLs) must be mapped to the same content. A common approach is to map the ARK to the URL using a software table that you update whenever the URL changes. In this case your server is acting as a local resolver. If you don't want to implement this yourself, there are ARK software tools and services that can help.
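The local-resolver approach can be sketched in Python as a lookup table plus a small dispatch function. Everything named here is hypothetical: the table contents, the example.org target URL, the metadata fields, and the `resolve` function itself. In a real web server the trailing '?' arrives as an empty query string, and how that is exposed depends on the framework.

```python
from typing import Dict, Optional, Tuple

# Hypothetical ARK-to-URL table; update an entry whenever its URL changes.
# Keys are stored without hyphens, since hyphens are identity inert.
TARGETS: Dict[str, str] = {
    "ark:/12345/cd456_f987": "https://example.org/specimens/moth-987",
}

# Hypothetical descriptive metadata returned for the '?' inflection.
METADATA: Dict[str, Dict[str, str]] = {
    "ark:/12345/cd456_f987": {"what": "Moth specimen cd456_f987"},
}


def resolve(request_target: str) -> Tuple[str, Optional[object]]:
    """Map a raw request target such as '/ark:/12345/cd456_f987?' to an action.

    Returns ("metadata", dict), ("redirect", url), or ("not_found", None).
    """
    wants_metadata = request_target.endswith("?")
    ark = request_target.rstrip("?").lstrip("/").replace("-", "")
    if ark not in TARGETS:
        return ("not_found", None)
    if wants_metadata:
        return ("metadata", METADATA.get(ark, {}))
    return ("redirect", TARGETS[ark])
```

Keeping the table separate from the dispatch logic means a content move only requires changing one table entry, while the advertised ARK stays the same.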

Another approach is to run your web server without change, but instead of updating local tables, you would update ARK-to-URL mapping tables residing at a non-local resolver. Examples of this can be found among vendors and in any organization that updates tables via EZID.cdlib.org (which, due to a special relationship, updates resolver tables at n2t.net).

How do I advertise my ARKs?

An important decision is whether you will advertise (release, publish, disseminate) your URL-based ARKs under a local hostname or the N2T.net resolver. If local control or branding is important enough, you would advertise ARKs based at your local resolver. If you're concerned about the stability of your local hostname, you would advertise your ARKs based at n2t.net (see examples of both).
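To make the two advertising options concrete, here is a small Python sketch that builds both URL forms from the same ARK. The local hostname ark.example.org is a placeholder and the helper name is hypothetical.

```python
def advertised_url(ark: str, resolver: str = "n2t.net") -> str:
    """Return the URL form of an ARK under the given resolver hostname."""
    return f"https://{resolver}/{ark}"


ark = "ark:/12345/cd456_f987"

# Advertised under the global resolver.
global_form = advertised_url(ark)

# Advertised under a local (placeholder) hostname.
local_form = advertised_url(ark, resolver="ark.example.org")
```

The part starting at `ark:` is identical in both forms, so the identifier itself does not change when the hostname does.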

Resolving your ARKs through N2T is always possible for users, regardless of how you advertise them.

What is N2T?

N2T.net is a global ARK resolver that also happens to know where to send over 600 other types of identifier – ARK, DOI, PMID, Taxon, PDB, ISSN, etc.

N2T, which stands for Name-to-Thing, is a generic service for mapping names to things. It uses two kinds of stored data. First, it has individual metadata records for over 20 million things, including their ARK or DOI identifiers and redirection (target) URLs.

Can I make changes to a NAAN?

You may request a NAAN by filling out an online form. The NAAN you obtain will be listed alongside all other NAANs in the public NAAN registry, which you are free to browse. Use that same form to update your registry entry, for example, if you change the URL of your resolver, or if you have negotiated with another organization to carry on your work and take over your NAAN. If you transition into or out of a vendor relationship, there is no problem taking your NAAN with you.

...

N2T's individual records cover object identifiers (eg, ARKs, DOIs) from three sources: EZID.cdlib.org, Internet Archive, and YAMZ.net. When such records include a redirection URL (target) and descriptive metadata, N2T can act on inflections and perform suffix passthrough.

Second, it has records for over 1500 "prefixes", including redirection "rules" for each NAAN (found in the NAAN registry) and for each of over 600 types of identifier (the latter is in partnership with identifiers.org).

Are there tools and services to help with ARKs?

Here's a partial list of software tools for persistent identification that includes 

There are also some vendors, such as ezid.cdlib.org, and some more information on concepts and best practices.

Can I assign ARKs to things inside something that already has an ARK?

Yes, ARKs can be assigned at any level of granularity, such as to a manuscript, to chapters inside it, to chapter sections, subsections, etc. An ARK can also be assigned to a thing that encloses other things. In ARKs the character '/' is reserved to help the recipient understand containment; for example, the first ARK below contains the second ARK:

                            ark:/12148/btv1b8449691v

                            ark:/12148/btv1b8449691v/f29

That's the containment qualifier. There's only one other ARK qualifier, and it indicates variant forms of a thing by using the reserved character '.' in front of a suffix. For example, if these ARKs identify documents,

                            ark:/12148/btv1b8449691v/f29.pdf

                            ark:/12148/btv1b8449691v/f29.html

because they differ only by the suffix .pdf or .html, it can be inferred that they identify two different forms of the same document.
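The two qualifiers can be read off an ARK mechanically. The Python sketch below splits an ARK into its NAAN, base name, containment parts ('/'), and variant suffixes ('.'). The type and function names are hypothetical, and the parser is deliberately simple: it only looks for variant suffixes in the last '/'-separated segment.

```python
from typing import List, NamedTuple


class ParsedArk(NamedTuple):
    naan: str
    base_name: str
    parts: List[str]     # containment qualifiers, outermost first
    variants: List[str]  # variant suffixes, e.g. ["pdf"]


def parse_ark(ark: str) -> ParsedArk:
    """Split an ARK (bare or embedded in a URL) into its components."""
    if "ark:" not in ark:
        raise ValueError(f"not an ARK: {ark!r}")
    body = ark.split("ark:", 1)[1].lstrip("/")
    naan, _, rest = body.partition("/")
    segments = rest.split("/")
    # Variant suffixes follow '.' in the final segment only (simplification).
    last, *variants = segments[-1].split(".")
    segments[-1] = last
    return ParsedArk(naan, segments[0], segments[1:], variants)


def contains(outer: str, inner: str) -> bool:
    """True if `inner` names something inside the thing named by `outer`."""
    o, i = parse_ark(outer), parse_ark(inner)
    return (
        (o.naan, o.base_name) == (i.naan, i.base_name)
        and len(i.parts) > len(o.parts)
        and i.parts[: len(o.parts)] == o.parts
    )


def is_variant_of(a: str, b: str) -> bool:
    """True if the two ARKs differ only in their variant suffixes."""
    return parse_ark(a)[:3] == parse_ark(b)[:3]
```

With the examples above, `contains` holds for the manuscript ARK and its `/f29` part, and `is_variant_of` holds for the `.pdf` and `.html` forms.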

What are NAANs for and can I make changes to one?

NAANs subdivide the set of all possible ARKs (the ARK namespace). The subset of ARKs under a given NAAN can be further subdivided into shoulders (eg, 12345/x2, 98765/b4), which can make it easy to delegate autonomous ARK assignment to departments in a large organization. ARK resolution is loosely based on NAANs, but because organizations split, ARKs accommodate the namespace splitting problem by supporting management of a namespace by more than one organization.
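As a rough illustration of delegation by shoulder, the Python sketch below maps shoulders to the departments that assign ARKs under them. The table contents, the department names, and the `owner_of` helper are all hypothetical; a shoulder is treated simply as a string prefix of the form NAAN/xx.

```python
from typing import Dict, Optional

# Hypothetical delegation table: shoulder -> department that mints under it.
SHOULDER_OWNERS: Dict[str, str] = {
    "12345/x2": "Library",
    "12345/b4": "Museum",
}


def owner_of(ark: str) -> Optional[str]:
    """Return the department whose shoulder prefixes this ARK, if any."""
    if "ark:" not in ark:
        raise ValueError(f"not an ARK: {ark!r}")
    body = ark.split("ark:", 1)[1].lstrip("/")
    matches = [s for s in SHOULDER_OWNERS if body.startswith(s)]
    if not matches:
        return None
    # Prefer the longest matching shoulder in case shoulders nest.
    return SHOULDER_OWNERS[max(matches, key=len)]
```

Because each department mints only under its own prefix, departments can assign ARKs independently without coordinating on individual strings.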

You can change a NAAN by filling out the same online form used to request a new NAAN. You would fill it out, for example, to notify N2T if the URL of your resolver changes, or if you have negotiated with another organization to carry on your work and take over your NAAN. If you transition into or out of a vendor relationship, there is no impediment to taking your NAAN with you.

ARKs and other identifiers

...